WorldWideScience

Sample records for ontology-based phenotype annotation

  1. Linking human diseases to animal models using ontology-based phenotype annotation.

    Directory of Open Access Journals (Sweden)

    Nicole L Washington

    2009-11-01

    Full Text Available Scientists and clinicians who study genetic alterations and disease have traditionally described phenotypes in natural language. The considerable variation in these free-text descriptions has posed a hindrance to the important task of identifying candidate genes and models for human diseases and indicates the need for a computationally tractable method to mine data resources for mutant phenotypes. In this study, we tested the hypothesis that ontological annotation of disease phenotypes will facilitate the discovery of new genotype-phenotype relationships within and across species. To describe phenotypes using ontologies, we used an Entity-Quality (EQ) methodology, wherein the affected entity (E) and how it is affected (Q) are recorded using terms from a variety of ontologies. Using this EQ method, we annotated the phenotypes of 11 gene-linked human diseases described in Online Mendelian Inheritance in Man (OMIM). These human annotations were loaded into our Ontology-Based Database (OBD) along with other ontology-based phenotype descriptions of mutants from various model organism databases. Phenotypes recorded with this EQ method can be computationally compared based on the hierarchy of terms in the ontologies and the frequency of annotation. We utilized four similarity metrics to compare phenotypes and developed an ontology of homologous and analogous anatomical structures to compare phenotypes between species. Using these tools, we demonstrate that we can identify, through the similarity of the recorded phenotypes, other alleles of the same gene, other members of a signaling pathway, and orthologous genes and pathway members across species. We conclude that EQ-based annotation of phenotypes, in conjunction with a cross-species ontology and a variety of similarity metrics, can identify biologically meaningful similarities between genes by comparing phenotypes alone. This annotation and search method provides a novel and efficient means to identify...
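A minimal sketch of the kind of ontology-aware comparison the abstract above describes: EQ annotations are expanded into sets of ontology terms (each term plus its ancestors), and annotation profiles are compared with a set-overlap metric. The term names, toy hierarchy, and Jaccard metric are invented for illustration; the study's own ontologies and four similarity metrics are far richer.

```python
# Sketch: comparing Entity-Quality (EQ) phenotype annotations via shared ontology
# ancestry. The hierarchy and term names below are hypothetical toy data.

# parent links of a tiny ontology fragment (child -> parent)
PARENTS = {
    "eye": "head", "head": "body",
    "fin": "body",
    "small": "decreased_size", "decreased_size": "size",
    "absent": "presence",
}

def ancestors(term):
    """Return the term plus all of its ancestors up the hierarchy."""
    out = {term}
    while term in PARENTS:
        term = PARENTS[term]
        out.add(term)
    return out

def eq_profile(annotations):
    """Expand a list of (entity, quality) pairs into a set of ontology terms."""
    terms = set()
    for entity, quality in annotations:
        terms |= ancestors(entity) | ancestors(quality)
    return terms

def jaccard(a, b):
    return len(a & b) / len(a | b)

gene1 = eq_profile([("eye", "small")])    # small eye
gene2 = eq_profile([("eye", "absent")])   # missing eye
gene3 = eq_profile([("fin", "small")])    # small fin

# similarity reflects how much entity and quality ancestry the phenotypes share
print(jaccard(gene1, gene2), jaccard(gene1, gene3))
```

Because the comparison runs over expanded term sets rather than exact labels, phenotypes annotated with different but related terms still score as similar.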

  2. A revision of Evaniscus (Hymenoptera, Evaniidae) using ontology-based semantic phenotype annotation

    Directory of Open Access Journals (Sweden)

    Patricia Mullins

    2012-09-01

    Full Text Available The Neotropical evaniid genus Evaniscus Szépligeti currently includes six species. Two new species are described, Evaniscus lansdownei Mullins, sp. n. from Colombia and Brazil and E. rafaeli Kawada, sp. n. from Brazil. Evaniscus sulcigenis Roman, syn. n., is synonymized under E. rufithorax Enderlein. An identification key to species of Evaniscus is provided. Thirty-five parsimony informative morphological characters are analyzed for six ingroup and four outgroup taxa. A topology resulting in a monophyletic Evaniscus is presented with E. tibialis and E. rafaeli as sister to the remaining Evaniscus species. The Hymenoptera Anatomy Ontology and other relevant biomedical ontologies are employed to create semantic phenotype statements in Entity-Quality (EQ) format for species descriptions. This approach is an early effort to formalize species descriptions and to make descriptive data available to other domains.

  3. A revision of Evaniscus (Hymenoptera, Evaniidae) using ontology-based semantic phenotype annotation.

    Science.gov (United States)

    Mullins, Patricia L; Kawada, Ricardo; Balhoff, James P; Deans, Andrew R

    2012-01-01

    The Neotropical evaniid genus Evaniscus Szépligeti currently includes six species. Two new species are described, Evaniscus lansdownei Mullins, sp. n. from Colombia and Brazil and Evaniscus rafaeli Kawada, sp. n. from Brazil. Evaniscus sulcigenis Roman, syn. n., is synonymized under Evaniscus rufithorax Enderlein. An identification key to species of Evaniscus is provided. Thirty-five parsimony informative morphological characters are analyzed for six ingroup and four outgroup taxa. A topology resulting in a monophyletic Evaniscus is presented with Evaniscus tibialis and Evaniscus rafaeli as sister to the remaining Evaniscus species. The Hymenoptera Anatomy Ontology and other relevant biomedical ontologies are employed to create semantic phenotype statements in Entity-Quality (EQ) format for species descriptions. This approach is an early effort to formalize species descriptions and to make descriptive data available to other domains.

  4. SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data

    NARCIS (Netherlands)

    Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; Haan, Mark de; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K.; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A.

    2015-01-01

    There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required,

  5. SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data.

    Science.gov (United States)

    Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; de Haan, Mark; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A

    2015-01-01

    There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon-delimited format) to a target coding system (uploaded in Excel spreadsheet, OWL (Web Ontology Language) or OBO (Open Biomedical Ontologies) format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects. Database URL: http://molgenis.org/sorta or as an open source download from
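The n-gram matching step described above can be sketched as follows. This is not SORTA's actual implementation (which uses Lucene); it is a toy character-bigram Dice similarity that shortlists candidate code labels for a free-text value, with invented labels and input.

```python
# Sketch of n-gram based shortlisting in the spirit of SORTA (hypothetical data):
# score each candidate label against a raw value by character-bigram overlap.

def bigrams(s):
    s = " " + s.lower() + " "          # pad so word edges contribute bigrams
    return {s[i:i + 2] for i in range(len(s) - 1)}

def dice(a, b):
    """Dice coefficient over character bigrams, in [0, 1]."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

def shortlist(value, codes, k=3):
    """Rank coding-system labels by similarity to a raw data value."""
    return sorted(codes, key=lambda c: dice(value, c), reverse=True)[:k]

codes = ["walking", "running", "cycling", "swimming"]
# misspelled free text still surfaces the right candidate at rank 1
print(shortlist("i go runing daily", codes, k=2))
```

N-gram matching is tolerant of the typos and local spelling variants common in free-text clinical values, which exact string matching would miss.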

  6. Statistical mechanics of ontology based annotations

    CERN Document Server

    Hoyle, David C

    2016-01-01

    We present a statistical mechanical theory of the process of annotating an object with terms selected from an ontology. The term selection process is formulated as an ideal lattice gas model, but in a highly structured inhomogeneous field. The model enables us to explain patterns recently observed in real-world annotation data sets, in terms of the underlying graph structure of the ontology. By relating the external field strengths to the information content of each node in the ontology graph, the statistical mechanical model also allows us to propose a number of practical metrics for assessing the quality of both the ontology and the annotations that arise from its use. Using the statistical mechanical formalism we also study an ensemble of ontologies of differing size and complexity; an analysis not readily performed using real data alone. Focusing on regular tree ontology graphs we uncover a rich set of scaling laws describing the growth in the optimal ontology size as the number of objects being annotate...
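The abstract above ties field strengths to the information content of ontology nodes. A standard, minimal illustration of node information content on a regular tree (not the paper's lattice-gas model): IC(t) = -log p(t), where p(t) is the fraction of annotations at or below term t. The tree size and counts are invented.

```python
# Information content of nodes in a small regular tree ontology (toy example).
import math

def build_regular_tree(branching, depth):
    """Child lists for a regular tree, nodes numbered breadth-first from root 0."""
    children = {0: []}
    next_id, frontier = 1, [0]
    for _ in range(depth):
        new_frontier = []
        for node in frontier:
            for _ in range(branching):
                children[node].append(next_id)
                children[next_id] = []
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return children

def information_content(children, counts):
    """IC(t) = -log p(t), with annotation counts propagated up the tree."""
    total = {}
    def subtree(n):
        total[n] = counts.get(n, 0) + sum(subtree(c) for c in children[n])
        return total[n]
    root_total = subtree(0)
    return {n: -math.log(total[n] / root_total) for n in total if total[n] > 0}

tree = build_regular_tree(branching=2, depth=2)   # 7 nodes: root, 2 inner, 4 leaves
counts = {3: 4, 4: 2, 5: 1, 6: 1}                 # annotations placed on the leaves
ic = information_content(tree, counts)
print(ic[0], ic[3])   # the root carries zero information; rarer terms carry more
```

Annotating with a frequently used term near the root conveys little information, while a rarely used leaf term conveys a lot, which is the intuition behind IC-based quality metrics.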

  7. OntoELAN: An Ontology-based Linguistic Multimedia Annotator

    CERN Document Server

    Chebotko, Artem; Lu, Shiyong; Fotouhi, Farshad; Aristar, Anthony; Brugman, Hennie; Klassmann, Alexander; Sloetjes, Han; Russel, Albert; Wittenburg, Peter

    2009-01-01

    Despite its scientific, political, and practical value, comprehensive information about human languages, in all their variety and complexity, is not readily obtainable and searchable. One reason is that many language data are collected as audio and video recordings which imposes a challenge to document indexing and retrieval. Annotation of multimedia data provides an opportunity for making the semantics explicit and facilitates the searching of multimedia documents. We have developed OntoELAN, an ontology-based linguistic multimedia annotator that features: (1) support for loading and displaying ontologies specified in OWL; (2) creation of a language profile, which allows a user to choose a subset of terms from an ontology and conveniently rename them if needed; (3) creation of ontological tiers, which can be annotated with profile terms and, therefore, corresponding ontological terms; and (4) saving annotations in the XML format as Multimedia Ontology class instances and, linked to them, class instances of o...

  8. Ontology-Based Prediction and Prioritization of Gene Functional Annotations.

    Science.gov (United States)

    Chicco, Davide; Masseroli, Marco

    2016-01-01

    Genes and their protein products are essential molecular units of a living organism. The knowledge of their functions is key for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. The association of a gene or protein with its functions, described by controlled terms of biomolecular terminologies or ontologies, is named gene functional annotation. Many valuable gene annotations expressed through terminologies and ontologies are available. Nevertheless, they might include some erroneous information, since only a subset of annotations are reviewed by curators. Furthermore, they are incomplete by definition, given the rapidly evolving pace of biomolecular knowledge. In this scenario, computational methods that are able to quicken the annotation curation process and reliably suggest new annotations are very important. Here, we first propose a computational pipeline that uses different semantic and machine learning methods to predict novel ontology-based gene functional annotations; then, we introduce a new semantic prioritization rule to categorize the predicted annotations by their likelihood of being correct. Our tests and validations proved the effectiveness of our pipeline and prioritization of predicted annotations, since many of the predicted annotations it selected as most likely were later confirmed.

  9. Collaborative Semantic Annotation of Images: Ontology-Based Model

    Directory of Open Access Journals (Sweden)

    Damien E. ZOMAHOUN

    2015-12-01

    Full Text Available In the quest for models that could help to represent the meaning of images, some approaches have used contextual knowledge by building semantic hierarchies. Others have resorted to the integration of image analysis improvement knowledge and image interpretation using ontologies. Images are often annotated with a set of keywords (or ontologies), whose relevance remains highly subjective and related to only one interpretation (one annotator). However, an image can have many associated semantics because annotators can interpret it differently. The purpose of this paper is to propose a collaborative annotation system that brings out the meaning of images from the different interpretations of annotators. The work carried out in this paper leads to a semantic model of an image, i.e. the different meanings that a picture may have. This method relies on the different tools of the Semantic Web, especially ontologies.

  10. Ontology-Based Semantic Annotation for Problem Set Archives in the Web

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Aiming at the difficulty of extracting semantic information from problems in problem set archives, we propose a new method of ontology-based semantic annotation for problem set archives, which utilizes a programming-knowledge domain ontology to add semantic annotations to problems on the Web. The system we developed adds a semantic annotation for each problem in the form of Extensible Markup Language (XML). Our method overcomes the difficulty of extracting semantics from problem set archives, and its efficiency is demonstrated through a case study. Having semantic annotations of problems, a student can efficiently locate the problems that logically correspond to his or her knowledge.

  11. Bridge Ontology: A Multi-Ontologies-Based Approach for Semantic Annotation

    Institute of Scientific and Technical Information of China (English)

    WANG Peng; XU Bao-wen; LU Jian-jiang; LI Yan-hui; JIANG Jian-hua

    2004-01-01

    Representing the relationships between ontologies is the key problem of semantic annotation based on multiple ontologies. Traditional approaches could only denote simple concept subsumption relations between ontologies. Through analyzing and classifying the relationships between ontologies, the idea of a bridge ontology is proposed, which has the powerful capability of expressing the complex relationships between concepts, and between relations, across multiple ontologies. Meanwhile, a new approach employing the bridge ontology is proposed to deal with the multi-ontology-based semantic annotation problem. The bridge ontology is a peculiar ontology, which can be created and maintained conveniently, and is effective in multi-ontology-based semantic annotation. The approach using a bridge ontology has the advantages of low cost, scalability and robustness in the Web environment, and avoids unnecessary ontology extension and integration.

  12. Ontology-Based Annotation of Multimedia Language Data for the Semantic Web

    CERN Document Server

    Chebotko, Artem; Fotouhi, Farshad; Aristar, Anthony

    2009-01-01

    There is an increasing interest and effort in preserving and documenting endangered languages. Language data are valuable only when they are well-cataloged, indexed and searchable. Many language data, particularly those of lesser-spoken languages, are collected as audio and video recordings. While multimedia data provide more channels and dimensions to describe a language's function, and give a better presentation of the cultural system associated with the language of that community, they are not text-based or structured (they are in binary format), and their semantics is implicit in their content. The content is thus easy for a human being to understand, but difficult for computers to interpret. Hence, there is a great need for a powerful and user-friendly system to annotate multimedia data with text-based, well-structured and searchable metadata. This chapter describes an ontology-based multimedia annotation tool, OntoELAN, that enables annotation of language multimedia data with a linguistic ontology.

  13. Uncertainty modeling for ontology-based mammography annotation with intelligent BI-RADS scoring.

    Science.gov (United States)

    Bulu, Hakan; Alpkocak, Adil; Balci, Pinar

    2013-05-01

    This paper presents an ontology-based annotation system and BI-RADS (Breast Imaging Reporting and Data System) score reasoning with Semantic Web technologies in mammography. The annotation system is based on the Mammography Annotation Ontology (MAO), on which the BI-RADS score reasoning works. However, ontologies are based on crisp logic and cannot handle uncertainty. Consequently, we propose a Bayesian-based approach to model uncertainty in the mammography ontology and make reasoning over BI-RADS scores possible with SQWRL (Semantic Query-enhanced Web Rule Language). First, we give general information about our system and present details of the mammography annotation ontology, its main concepts and relationships. Then, we express uncertainty in mammography and present approaches to handle uncertainty issues. The system is evaluated with two manually annotated datasets: DEMS (Dokuz Eylul University Mammography Set) and DDSM (Digital Database for Screening Mammography). We report the results of our experiments in terms of accuracy, sensitivity, precision and uncertainty-level measures. Copyright © 2013 Elsevier Ltd. All rights reserved.
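The Bayesian idea described above can be illustrated with a tiny naive-Bayes update over BI-RADS categories given annotated findings. All priors, likelihoods, and finding names below are invented for the example; they are not taken from the MAO system or from clinical data.

```python
# Toy Bayesian update over BI-RADS categories (all numbers are hypothetical).

PRIOR = {"BI-RADS 2": 0.5, "BI-RADS 4": 0.3, "BI-RADS 5": 0.2}

# P(finding | category), made-up values for illustration only
LIKELIHOOD = {
    "spiculated_margin": {"BI-RADS 2": 0.05, "BI-RADS 4": 0.40, "BI-RADS 5": 0.80},
    "round_shape":       {"BI-RADS 2": 0.70, "BI-RADS 4": 0.30, "BI-RADS 5": 0.10},
}

def posterior(findings):
    """Naive-Bayes update, assuming conditional independence of findings."""
    scores = dict(PRIOR)
    for f in findings:
        for cat in scores:
            scores[cat] *= LIKELIHOOD[f][cat]
    z = sum(scores.values())              # normalise to a probability distribution
    return {cat: p / z for cat, p in scores.items()}

post = posterior(["spiculated_margin"])
print(max(post, key=post.get), post)
```

Representing the score as a distribution rather than a single crisp category is what lets the system quantify the uncertainty that a plain ontology cannot express.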

  14. DEVA: An extensible ontology-based annotation model for visual document collections

    Science.gov (United States)

    Jelmini, Carlo; Marchand-Maillet, Stephane

    2003-01-01

    The description of visual documents is a fundamental aspect of any efficient information management system, but the process of manually annotating large collections of documents is tedious and far from being perfect. The need for a generic and extensible annotation model therefore arises. In this paper, we present DEVA, an open, generic and expressive multimedia annotation framework. DEVA is an extension of the Dublin Core specification. The model can represent the semantic content of any visual document. It is described in the ontology language DAML+OIL and can easily be extended with external specialized ontologies, adapting the vocabulary to the given application domain. In parallel, we present the Magritte annotation tool, an early prototype that validates the DEVA features. Magritte allows users to manually annotate image collections. It is designed with a modular and extensible architecture, which enables the user to dynamically adapt the user interface to specialized ontologies merged into DEVA.

  15. BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology

    Directory of Open Access Journals (Sweden)

    Gregorio Sergio E

    2009-05-01

    Full Text Available Abstract Motivation Ontology development and the annotation of biological data using ontologies are time-consuming exercises that currently require input from expert curators. Open, collaborative platforms for biological data annotation enable the wider scientific community to become involved in developing and maintaining such resources. However, this openness raises concerns regarding the quality and correctness of the information added to these knowledge bases. The combination of a collaborative web-based platform with logic-based approaches and Semantic Web technology can be used to address some of these challenges and concerns. Results We have developed the BOWiki, a web-based system that includes a biological core ontology. The core ontology provides background knowledge about biological types and relations. Against this background, an automated reasoner assesses the consistency of new information added to the knowledge base. The system provides a platform for research communities to integrate information and annotate data collaboratively. Availability The BOWiki and supplementary material are available at http://www.bowiki.net/. The source code is available under the GNU GPL from http://onto.eva.mpg.de/trac/BoWiki.

  16. CulTO: An Ontology-Based Annotation Tool for Data Curation in Cultural Heritage

    Science.gov (United States)

    Garozzo, R.; Murabito, F.; Santagati, C.; Pino, C.; Spampinato, C.

    2017-08-01

    This paper proposes CulTO, a software tool relying on a computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, for supporting cultural heritage experts in their investigations. It is specifically designed to support annotation, automatic indexing, classification and curation of photographic data and text documents of historical buildings. CulTO also serves as a useful tool for Historical Building Information Modeling (H-BIM) by enabling semantic 3D data modeling and further enrichment with non-geometrical information of historical buildings through the inclusion of new concepts about historical documents, images, decay or deformation evidence as well as decorative elements into BIM platforms. CulTO is the result of a joint research effort between the Laboratory of Surveying and Architectural Photogrammetry "Luigi Andreozzi" and the PeRCeiVe Lab (Pattern Recognition and Computer Vision Lab) of the University of Catania,

  17. Multi-source and ontology-based retrieval engine for maize mutant phenotypes.

    Science.gov (United States)

    Green, Jason M; Harnsomburana, Jaturon; Schaeffer, Mary L; Lawrence, Carolyn J; Shyu, Chi-Ren

    2011-01-01

    Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php.
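One way such an engine can exploit ontology structure is query expansion: a query term is expanded with its narrower ontology terms before matching descriptions. The sketch below uses an invented two-level hierarchy and invented mutant descriptions; the real system draws on the full Plant and Gene Ontologies.

```python
# Sketch of ontology-driven query expansion for phenotype retrieval (toy data).

# toy is-a hierarchy: broader term -> narrower terms
CHILDREN = {
    "leaf": ["flag leaf", "bract"],     # bract modelled here as a kind of leaf
    "plant organ": ["leaf", "root"],
}

def expand(term):
    """Return the query term plus all narrower ontology terms."""
    terms = [term]
    for t in terms:                     # the list grows as narrower terms are added
        terms.extend(CHILDREN.get(t, []))
    return terms

DESCRIPTIONS = {
    "mu1": "small bract at each node",  # never mentions the word "leaf"
    "mu2": "short root hairs",
    "mu3": "pale leaf color",
}

def search(term):
    wanted = expand(term)
    return sorted(m for m, d in DESCRIPTIONS.items()
                  if any(w in d for w in wanted))

print(search("leaf"))
```

A plain substring search for "leaf" would miss mu1; the expanded query retrieves it because the ontology says a bract is a kind of leaf.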

  18. Multi-source and ontology-based retrieval engine for maize mutant phenotypes

    Science.gov (United States)

    In the midst of this genomics era, major plant genome databases are collecting massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While basic browsing and sear...

  19. Phenex: ontological annotation of phenotypic diversity.

    Directory of Open Access Journals (Sweden)

    James P Balhoff

    Full Text Available BACKGROUND: Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. METHODOLOGY/PRINCIPAL FINDINGS: Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. CONCLUSIONS/SIGNIFICANCE: Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

  20. Search of phenotype related candidate genes using gene ontology-based semantic similarity and protein interaction information: application to Brugada syndrome.

    Science.gov (United States)

    Massanet, Raimon; Gallardo-Chacon, Joan-Josep; Caminal, Pere; Perera, Alexandre

    2009-01-01

    This work presents a methodology for finding phenotype candidate genes starting from a set of known related genes. This is accomplished by automatically mining and organizing the available scientific literature using Gene Ontology-based semantic similarity. As a case study, Brugada syndrome related genes have been used as input in order to obtain a list of other possible candidate genes related to this disease. The Brugada anomaly produces a typical alteration in the electrocardiogram, and carriers of the disease show an increased probability of sudden death. Results show a set of semantically coherent proteins related to the physiological processes of synaptic transmission and muscle contraction.

  1. Ontology Based Access Control

    Directory of Open Access Journals (Sweden)

    Özgü CAN

    2010-02-01

    Full Text Available As computer technologies become pervasive, the need for access control mechanisms grows. The purpose of access control is to limit the operations that a computer system user can perform. Thus, access control helps prevent activities that could lead to a security breach. For the success of the Semantic Web, which allows machines to share and reuse information by using formal semantics to communicate with other machines, access control mechanisms are needed. An access control mechanism specifies constraints that must be satisfied by the user before performing an operation, in order to provide a secure Semantic Web. In this work, unlike traditional access control mechanisms, an "Ontology Based Access Control" mechanism has been developed by using Semantic Web based policies. In this mechanism, ontologies are used to model the access control knowledge, and domain knowledge is used to create policy ontologies.

  2. Ontology-based application integration

    CERN Document Server

    Paulheim, Heiko

    2011-01-01

    Ontology-based Application Integration introduces UI-level (User Interface Level) application integration and discusses current problems which can be remedied by using ontologies. It shows a novel approach for applying ontologies in system integration. While ontologies have been used for integration of IT systems on the database and on the business logic layer, integration on the user interface layer is a novel field of research. This book also discusses how end users, not only developers, can benefit from semantic technologies. Ontology-based Application Integration presents the development o

  3. Annotation of phenotypic diversity: decoupling data curation and ontology curation using Phenex.

    Science.gov (United States)

    Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J

    2014-01-01

    Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.

  4. Ontology-based geographic information semantic metadata integration

    Science.gov (United States)

    Zhan, Qin; Li, Deren; Zhang, Xia; Xia, Yu

    2009-10-01

    Metadata is important to facilitate data sharing among Geospatial Information Communities in a distributed environment. For a unanimous understanding and standard production of metadata annotations, metadata specifications are documented, such as the Geographic Information Metadata Standard (ISO 19115:2003), the Content Standard for Digital Geospatial Metadata (CSDGM), and so on. Though these specifications provide frameworks for the description of geographic data, two problems hinder sufficient data sharing. One problem is that the specifications lack domain-specific semantics. The other is that they cannot always resolve semantic heterogeneities. To solve the former problem, an ontology-based geographic information metadata extension framework is proposed which can incorporate domain-specific semantics. For the latter problem, a metadata integration mechanism based on the proposed extension is studied. In this paper, integration of metadata is realized through integration of ontologies, so integration of ontologies is also discussed. By ontology-based geographic information semantic metadata integration, sharing of geographic data is realized more efficiently.

  5. Enabling Ontology Based Semantic Queries in Biomedical Database Systems.

    Science.gov (United States)

    Zheng, Shuai; Wang, Fusheng; Lu, James

    2014-03-01

    There is a lack of tools to ease the integration and ontology-based semantic querying of biomedical databases, which are often annotated with ontology concepts. We aim to provide a middle layer between ontology repositories and semantically annotated databases to support semantic queries directly in the databases with expressive standard database query languages. We have developed a semantic query engine that provides semantic reasoning and query processing, and translates the queries into ontology repository operations on NCBO BioPortal. Semantic operators are implemented in the database as user defined functions extended to the database engine, thus semantic queries can be directly specified in standard database query languages such as SQL and XQuery. The system provides caching management to boost query performance. The system is highly adaptable to support different ontologies through easy customizations. We have implemented the system DBOntoLink as open source software, which supports major ontologies hosted at BioPortal. DBOntoLink supports a set of common ontology-based semantic operations and has them fully integrated with the database management system IBM DB2. The system has been deployed and evaluated with an existing biomedical database for managing and querying image annotations and markups (AIM). Our performance study demonstrates the high expressiveness of semantic queries and the high efficiency of the queries.
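The idea of exposing a semantic operator as a user-defined function can be sketched with SQLite's `create_function`. The ontology fragment, table schema, and operator name below are invented; the real DBOntoLink delegates reasoning to NCBO BioPortal and integrates with IBM DB2 rather than SQLite.

```python
# Minimal sketch: register a subsumption check as a SQL user-defined function,
# so semantic queries can be written directly in standard SQL (toy data).
import sqlite3

# toy ontology: term -> parent
PARENT = {"carcinoma": "neoplasm", "adenoma": "neoplasm", "neoplasm": "disease"}

def is_subclass_of(term, ancestor):
    """Semantic operator: true if `term` is `ancestor` or one of its descendants."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False

con = sqlite3.connect(":memory:")
con.create_function("IS_SUBCLASS_OF", 2, lambda t, a: int(is_subclass_of(t, a)))
con.execute("CREATE TABLE annotation (image_id TEXT, term TEXT)")
con.executemany("INSERT INTO annotation VALUES (?, ?)",
                [("img1", "carcinoma"), ("img2", "adenoma"), ("img3", "fracture")])

# semantic query in plain SQL: images annotated with any kind of neoplasm
rows = con.execute(
    "SELECT image_id FROM annotation "
    "WHERE IS_SUBCLASS_OF(term, 'neoplasm') ORDER BY image_id"
).fetchall()
print(rows)
```

The query retrieves images annotated with "carcinoma" and "adenoma" even though neither row literally contains "neoplasm"; the reasoning happens inside the SQL predicate.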

  6. Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows.

    Science.gov (United States)

    Fu, Xiao; Batista-Navarro, Riza; Rak, Rafal; Ananiadou, Sophia

    2015-01-01

    Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often "hidden" within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients. A corpus of 30 full-text papers was formed based on selection criteria informed by the expertise of COPD specialists. We developed an annotation scheme aimed at producing fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. When evaluated against gold standard (i.e., manually validated) annotations, the semi-automatic workflow obtained a micro-averaged F-score of 45.70% (with relaxed matching). Utilising the gold standard data to train new concept recognisers, we demonstrated that our corpus, although still a work in progress, can foster the development of significantly better-performing COPD phenotype extractors. We describe in this work the means by which we aim to eventually support the process of COPD phenotype curation, i.e., by the application of various text mining tools integrated into an annotation workflow. Although the corpus is still under development, our results thus far are encouraging and show great potential in stimulating the development of further automatic COPD phenotype extractors.
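For reference, the micro-averaged F-score reported above pools true positive, false positive and false negative counts across all documents before computing precision and recall. A minimal sketch, with made-up per-document counts:

```python
# Sketch: micro-averaged F-score over several documents. TP/FP/FN counts
# are pooled before computing precision and recall, so large documents
# weigh more than small ones. The counts below are invented.
def micro_f_score(counts):
    """counts: list of (tp, fp, fn) tuples, one per document."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

score = micro_f_score([(8, 2, 4), (5, 3, 5)])  # two hypothetical documents
```

"Relaxed matching" would enter this computation upstream, by counting a predicted annotation as a true positive when it merely overlaps a gold span rather than matching its boundaries exactly.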

  7. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Directory of Open Access Journals (Sweden)

    Anika Oellrich

    Full Text Available Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to a gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess its quality as well as the contribution of individual systems to it. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when they are combined with the NCBO Annotator and cTAKES due to low recall. In conclusion, the performance of the individual systems needs to be improved independently of the text types, and the leveraging strategies used to best take advantage of the individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, the annotations assigned by the four concept recognition systems, and the generated silver standard annotation sets are available from http://purl.org/phenotype/resources.
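A minimal sketch of the silver-standard idea: keep every annotation proposed by at least a minimum number of systems. The system names are reused from the abstract above, but the spans, concept identifiers and voting threshold are invented for illustration; real harmonisation would also have to reconcile partially overlapping spans.

```python
# Sketch: building a silver-standard annotation set by majority vote over
# the outputs of several concept recognition systems.
from collections import Counter

def silver_standard(system_outputs, min_votes=2):
    """Keep each (start, end, concept) annotation proposed by >= min_votes systems."""
    votes = Counter()
    for annotations in system_outputs.values():
        votes.update(set(annotations))  # each system votes at most once
    return {ann for ann, n in votes.items() if n >= min_votes}

outputs = {  # (start, end, concept_id) triples; all values are invented
    "cTAKES":  [(0, 12, "C0011849"), (20, 28, "C0020538")],
    "NCBO":    [(0, 12, "C0011849"), (35, 40, "C0027051")],
    "MetaMap": [(0, 12, "C0011849"), (20, 28, "C0020538")],
}
silver = silver_standard(outputs)
# Only the two annotations reaching the 2-vote threshold survive.
```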

  8. ONTOLOGY BASED WEB PAGE ANNOTATION FOR EFFECTIVE INFORMATION RETRIEVAL

    Directory of Open Access Journals (Sweden)

    S.Kalarani

    2010-11-01

    Full Text Available Today’s World Wide Web holds a large volume of data: billions of documents. It is therefore a time-consuming process to discover effective knowledge from the input data. With today's keyword approach, the amount of time and effort required to find the right information is directly proportional to the amount of information on the web. The web has grown exponentially, and people are forced to spend more and more time searching for the information they are looking for. Lack of personalization, as well as the inability to easily separate commercial from non-commercial searches, is among the other limitations of today's web search technologies. This paper proposes a prototype relation-based search engine, “OntoLook”, which has been designed in a virtual Semantic Web environment, and presents its architecture. The Semantic Web is well recognized as an effective infrastructure to enhance the visibility of knowledge on the Web. The core of the Semantic Web is “ontology”, which is used to explicitly represent our conceptualizations. Ontology engineering in the Semantic Web is primarily supported by languages such as RDF, RDFS and OWL. This paper discusses the requirements of ontology in the context of the Web, compares the above three languages with existing knowledge representation formalisms, and surveys tools for managing and applying ontology. Advantages of using ontology in both knowledge-base-style and database-style applications are demonstrated using one real-world application.

  9. Ontology-Based Classification System Development Methodology

    OpenAIRE

    2015-01-01

    The aim of the article is to analyse and develop an ontology-based classification system methodology that uses decision tree learning with statement propositionalized attributes. Classical decision tree learning algorithms, as well as decision tree learning with taxonomy and propositionalized attributes, have been observed. Thus, domain ontology can be extracted from the data sets and can be used for data classification with the help of a decision tree. The use of ontology methods in decision tree-based classification systems has been researched.

  10. Ontology Based Feature Driven Development Life Cycle

    Directory of Open Access Journals (Sweden)

    Farheen Siddiqui

    2012-01-01

    Full Text Available The upcoming technology support for the Semantic Web promises fresh directions for the software engineering community. The Semantic Web also has its roots in knowledge engineering, which prompts software engineers to look for applications of ontologies throughout the software engineering lifecycle. The internal components of a semantic web application are "light weight" and may be held to lower quality standards than the externally visible modules; in fact, the internal components are generated from the external (ontological) components. That is why agile development approaches such as feature driven development are suitable for developing an application's internal components. As yet there is no particular procedure that describes the role of ontology in FDD processes. We therefore propose an ontology-based feature driven development for semantic web applications that can be used from application model development to feature design and implementation. Features are precisely defined in the OWL-based domain model, and the transition from the OWL-based domain model to the feature list is directly defined in transformation rules. Moreover, the ontology-based overall model can be easily validated through automated tools. Advantages of ontology-based feature driven development are also discussed.

  11. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research.

    Science.gov (United States)

    Köhler, Sebastian; Doelken, Sandra C; Ruef, Barbara J; Bauer, Sebastian; Washington, Nicole; Westerfield, Monte; Gkoutos, George; Schofield, Paul; Smedley, Damian; Lewis, Suzanna E; Robinson, Peter N; Mungall, Christopher J

    2013-01-01

    Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
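One simple family of semantic similarity measures usable with such a cross-species phenotype ontology compares the ancestor sets of two terms. A minimal sketch using the Jaccard index over a toy hierarchy (the terms below are assumptions, not actual uberpheno classes):

```python
# Sketch: Jaccard semantic similarity of two phenotype terms, computed as
# the overlap of their ancestor sets in a subclass hierarchy.
PARENTS = {  # term -> list of parent terms; toy hierarchy
    "small eye":               ["abnormal eye morphology"],
    "absent eye":              ["abnormal eye morphology"],
    "abnormal eye morphology": ["abnormal phenotype"],
    "abnormal phenotype":      [],
}

def ancestors(term):
    """The term itself plus all of its transitive ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for p in PARENTS[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def jaccard_sim(t1, t2):
    a, b = ancestors(t1), ancestors(t2)
    return len(a & b) / len(a | b)

sim = jaccard_sim("small eye", "absent eye")  # shared ancestry: 2/4 = 0.5
```

Information-content measures such as Resnik similarity follow the same pattern but weight shared ancestors by how rarely they are annotated, which is where the cross-species annotation data described above comes in.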

  12. Subsumption Checking between Concept Queries in Different Ontologies Based on Mutual Instances

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    This paper proposes a checking method based on mutual instances and discusses three key problems in the method: how to deal with mistakes in the mutual instances, how to deal with too many mutual instances, and how to deal with too few mutual instances. It provides checking based on weighted mutual instances considering fault tolerance, gives a way to partition large-scale mutual instances, and proposes a process that greatly reduces the manual annotation work needed to obtain more mutual instances. Intension annotation, which improves the checking method, is also discussed. The method is practical and effective for checking subsumption relations between concept queries in different ontologies based on mutual instances.
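The weighted, fault-tolerant check can be sketched roughly as follows: concept A is taken to subsume concept B if, by weight, almost all of B's mutual instances also fall under A. The weights and the tolerance threshold below are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: subsumption checking between two concept queries from different
# ontologies, based on weighted mutual instances with fault tolerance.
def subsumes(instances_a, instances_b, weights, tolerance=0.1):
    """A subsumes B if the weighted share of B's instances lying outside A
    does not exceed `tolerance` (tolerating some mislabelled instances)."""
    total = sum(weights[i] for i in instances_b)
    outside = sum(weights[i] for i in instances_b if i not in instances_a)
    return outside / total <= tolerance

# Toy data: i4 carries a low weight because its annotation looks dubious.
weights = {"i1": 1.0, "i2": 1.0, "i3": 1.0, "i4": 0.2}
query_b = {"i1", "i2", "i3", "i4"}   # instances answering concept query B
query_a = {"i1", "i2", "i3"}         # instances answering concept query A
result = subsumes(query_a, query_b, weights)
```

Down-weighting suspect instances is what lets the check survive annotation mistakes: here the stray instance i4 contributes too little weight to block the subsumption verdict.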

  13. Ontology-Based Classification System Development Methodology

    Directory of Open Access Journals (Sweden)

    Grabusts Peter

    2015-12-01

    Full Text Available The aim of the article is to analyse and develop an ontology-based classification system methodology that uses decision tree learning with statement propositionalized attributes. Classical decision tree learning algorithms, as well as decision tree learning with taxonomy and propositionalized attributes have been observed. Thus, domain ontology can be extracted from the data sets and can be used for data classification with the help of a decision tree. The use of ontology methods in decision tree-based classification systems has been researched. Using such methodologies, the classification accuracy in some cases can be improved.
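The taxonomy/propositionalization step described above can be sketched as turning each taxonomy-valued attribute into one binary feature per node on its path to the root, so that a standard decision tree learner can split at whichever level of the ontology is most informative. The toy single-parent taxonomy below is an assumption:

```python
# Sketch: propositionalizing a taxonomy-valued attribute into binary
# features for ordinary decision tree learning.
TAXONOMY = {  # node -> parent (None at the root); toy taxonomy
    "sedan": "car",
    "truck": "vehicle",
    "car": "vehicle",
    "vehicle": None,
}

def propositionalize(value):
    """One binary feature per taxonomy node on the path to the root."""
    features = {}
    while value is not None:
        features[f"is_{value}"] = True
        value = TAXONOMY[value]
    return features

features = propositionalize("sedan")
# A 'sedan' record also activates the more general 'car' and 'vehicle'
# tests, so the tree can generalize at any taxonomy level.
```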

  14. Oceanographic ontology-based spatial knowledge query

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The construction of oceanographic ontologies is fundamental to the "digital ocean". Therefore, after introducing the new concept of an oceanographic ontology, an oceanographic ontology-based spatial knowledge query (OOBSKQ) method was proposed and developed. Because the method uses natural language to describe query conditions and the query result is highly integrated knowledge, it can provide users with direct answers while hiding the complicated computation and reasoning processes, and achieves intelligent, automatic oceanographic spatial information query at the level of knowledge and semantics. A case study of a resource and environmental application in a bay has shown the implementation process of the method and its feasibility and usefulness.

  15. Ontology-Based Model Of Firm Competitiveness

    Science.gov (United States)

    Deliyska, Boryana; Stoenchev, Nikolay

    2010-10-01

    Competitiveness is an important characteristic of every business organization (firm, company, corporation, etc.). It is of great significance for the organization's existence and defines the evaluation criteria of business success at the microeconomic level. Each criterion comprises a set of indicators with specific weight coefficients. In this work, an ontology-based model of firm competitiveness is presented as a set of several mutually connected ontologies. It would be useful for knowledge structuring, standardization and sharing among experts and software engineers who develop applications in the domain. The assessment of the competitiveness of various business organizations could then be generated more effectively.

  16. An Ontology Based Personalised Mobile Search Engine

    Directory of Open Access Journals (Sweden)

    Mrs. Rashmi A. Jolhe

    2014-02-01

    Full Text Available As the amount of Web information grows rapidly, search engines must be able to retrieve information according to the user's preferences. In this paper, we propose an Ontology Based Personalised Mobile Search Engine (OBPMSE) that captures users' interests and preferences in the form of concepts by mining search results and their clickthroughs. OBPMSE profiles the user's interests and personalises the search results according to the user's profile. OBPMSE classifies these concepts into content concepts and location concepts. In addition, users' locations (positioned by GPS) are used to supplement the location concepts. The user preferences are organized in an ontology-based, multi-facet user profile, which is used to adapt a personalized ranking function that in turn is used for rank adaptation of future search results. We propose to define personalization effectiveness based on the entropies and use it to balance the weights between the content and location facets. In our design, the client collects and stores the clickthrough data locally to protect privacy, whereas heavy tasks such as concept extraction, training, and re-ranking are performed at the OBPMSE server. OBPMSE provides a client-server architecture and distributes the tasks to individual components to decrease complexity.
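A rough sketch of entropy-based balancing between the two facets. The exact OBPMSE formula is not given in the abstract, so the weighting rule below is an assumption: a facet whose clicks spread over many concepts has higher entropy and, in this sketch, receives proportionally more weight.

```python
# Sketch: balancing content and location facet weights by the Shannon
# entropy of each facet's click distribution. The concepts, click counts
# and the proportional weighting rule are all illustrative assumptions.
import math

def click_entropy(clicks):
    """Shannon entropy (bits) of a click distribution over concepts."""
    total = sum(clicks.values())
    return -sum((c / total) * math.log2(c / total) for c in clicks.values())

def facet_weights(content_clicks, location_clicks):
    hc = click_entropy(content_clicks)
    hl = click_entropy(location_clicks)
    return hc / (hc + hl), hl / (hc + hl)

w_content, w_location = facet_weights(
    {"hotel": 4, "restaurant": 4},  # content interest spread over concepts
    {"Paris": 8},                   # a single location: zero entropy
)
```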

  17. ONTOPARK: ONTOLOGY BASED PAGE RANKING FRAMEWORK USING RESOURCE DESCRIPTION FRAMEWORK

    Directory of Open Access Journals (Sweden)

    S. Yasodha

    2014-01-01

    Full Text Available Traditional search engines like Google and Yahoo fail to rank the relevant information for users' queries. This is because such search engines rely on keywords for searching and fail to consider the semantics of the query. More sophisticated methods that do provide the relevant information for a query are needed. The Semantic Web, which stores metadata as ontology, could be used to solve this problem. A major drawback of Google's PageRank algorithm is that ranking is based not only on the page ranks produced but also on the number of hits to the Web page. This paved the way for illegitimate means of boosting page ranks; as a result, Web pages whose page rank is zero are also ranked in top order. This drawback of the PageRank algorithm motivated us to contribute to the Web community by providing semantic search results. We therefore propose ONTOPARK, an ontology-based framework for ranking Web pages. The proposed framework combines the Vector Space Model of information retrieval with ontology, and constructs semantically annotated Resource Description Framework (RDF) files which form the RDF knowledge base for each query. The proposed framework has been evaluated by two measures, precision and recall. It improves the precision of both single-word and multi-word queries, which suggests that replacing the Web database with a semantic knowledge base will definitely improve the quality of search. The surfing time of surfers will also be minimized.

  18. Ontology Based Qos Driven Web Service Discovery

    Directory of Open Access Journals (Sweden)

    R Suganyakala

    2011-07-01

    Full Text Available In today's scenario, web services have become a grand vision for implementing business process functionalities. With the increase in the number of similar web services, one of the essential challenges is to discover the web service most relevant to the user's specification. The relevance of web service discovery can be improved by augmenting it with semantics expressed through formats like OWL. QoS-based service selection plays a significant role in meeting non-functional user requirements, so QoS and semantics have been used as finer search constraints to discover the most relevant service. In this paper, we describe a QoS framework for ontology-based web service discovery. The QoS factors taken into consideration are execution time, response time, throughput, scalability, reputation, accessibility and availability. The behavior of each web service at various instances is observed over a period of time and its QoS-based performance is analyzed.
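Selection over such QoS factors commonly reduces to a weighted score per candidate service. A minimal sketch, where the weights, the subset of factors, and the inversion of response time (lower is better) are illustrative assumptions rather than the paper's framework:

```python
# Sketch: ranking candidate services by a weighted QoS score.
WEIGHTS = {"response_time": 0.4, "throughput": 0.3, "availability": 0.3}

def qos_score(service):
    """Higher is better; response time is inverted since lower is better."""
    return (WEIGHTS["response_time"] * (1.0 / service["response_time"])
            + WEIGHTS["throughput"] * service["throughput"]
            + WEIGHTS["availability"] * service["availability"])

services = {  # hypothetical measured QoS values per candidate service
    "svcA": {"response_time": 2.0, "throughput": 0.8, "availability": 0.99},
    "svcB": {"response_time": 0.5, "throughput": 0.6, "availability": 0.95},
}
best = max(services, key=lambda s: qos_score(services[s]))
```

In practice the raw measurements would first be normalised to a common scale so that no single factor dominates by virtue of its units.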

  19. Ontology-Based Semantic Cache in AOKB

    Institute of Scientific and Technical Information of China (English)

    郑红; 陆汝钤; 金芝; 胡思康

    2002-01-01

    When querying a large-scale knowledge base, a major technique for improving performance is to preload knowledge to minimize the number of roundtrips to the knowledge base. In this paper, an ontology-based semantic cache is proposed for an agent and ontology-oriented knowledge base (AOKB). In AOKB, an ontology is the collection of relationships between a group of knowledge units (agents and/or other sub-ontologies). When loading some agent A, its relationships with other knowledge units are examined, and those which have a tight semantic tie with A are preloaded at the same time, including agents and sub-ontologies in the same ontology as A. The preloaded agents and ontologies are saved in a semantic cache located in memory. Test results show that up to a 50% reduction in running time is achieved.
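The preloading policy can be sketched as follows: when an agent is loaded, every knowledge unit whose semantic tie to it exceeds a threshold is pulled into the cache along with it. The tie strengths and the threshold below are illustrative assumptions.

```python
# Sketch: semantic-cache preloading by tie strength. Loading agent "A"
# also caches the units tightly tied to it, saving later roundtrips.
TIES = {  # semantic tie strength between knowledge units, in [0, 1]
    ("A", "B"): 0.9,
    ("A", "C"): 0.2,
    ("A", "SubOntology1"): 0.7,
}

def preload_set(agent, threshold=0.5):
    """Units to load alongside `agent`: itself plus tightly tied neighbours."""
    cache = {agent}
    for (x, y), strength in TIES.items():
        if strength >= threshold:
            if x == agent:
                cache.add(y)
            elif y == agent:
                cache.add(x)
    return cache

cache = preload_set("A")  # "C" is only loosely tied and stays on disk
```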

  20. Discovering Diabetes Complications: an Ontology Based Model

    Science.gov (United States)

    Daghistani, Tahani; Shammari, Riyad Al; Razzak, Muhammad Imran

    2015-01-01

    Background: Diabetes is a serious disease that has spread dramatically throughout the world, and diabetes patients are at risk of experiencing complications. Taking advantage of recorded information to build an ontology, as an information technology solution, will help to predict which patients are at risk of certain complications. It is also helpful for searching and presenting a patient's history with regard to different risk factors, and discovering diabetes complications could be useful to prevent or delay them. Method: We designed an ontology-based model, using adult diabetes patients' data, to discover rules linking diabetes with its complications in a disease-to-disease relationship. Result: Various rules between different risk factors of diabetes patients and certain complications were generated; furthermore, new complications (diseases) might be discovered as a new finding of this study. Conclusion: The system can identify patients who are suffering from certain risk factors, such as a high body mass index (obesity), and start a controlling and maintaining plan. PMID:26862251

  1. Ontology Based Metadata Management for National Healthcare Data Dictionary

    Directory of Open Access Journals (Sweden)

    Yasemin Yüksek

    2012-02-01

    Full Text Available Ontology-based metadata relies on ontologies that give formal, content-level semantics to information. In this study, an ontology-based metadata management approach is proposed, intended for the metadata modeling developed for the National Health Data Dictionary (NHDD). The NHDD is used as a reference by all health institutions in Turkey and provides a great contribution in terms of terminology. The proposed ontology-based metadata management was achieved by using a methodology for modeling metadata requirements. This methodology includes determining the metadata beneficiaries, listing the metadata requirements for each beneficiary, identifying the source of the metadata, categorizing the metadata, and building a metamodel.

  2. Semantic annotation of medical images

    Science.gov (United States)

    Seifert, Sascha; Kelm, Michael; Moeller, Manuel; Mukherjee, Saikat; Cavallaro, Alexander; Huber, Martin; Comaniciu, Dorin

    2010-03-01

    Diagnosis and treatment planning for patients can be significantly improved by comparing with clinical images of other patients with similar anatomical and pathological characteristics. This requires the images to be annotated using a common vocabulary from clinical ontologies. Current approaches to such annotation are typically manual, consume extensive clinician time, and cannot be scaled to the large amounts of imaging data in hospitals. On the other hand, automated image analysis, while very scalable, does not leverage standardized semantics and thus cannot be used across specific applications. In our work, we describe an automated and context-sensitive workflow based on an image parsing system complemented by an ontology-based context-sensitive annotation tool. A unique characteristic of our framework is that it brings together the diverse paradigms of machine-learning-based image analysis and ontology-based modeling for accurate and scalable semantic image annotation.

  3. Aber-OWL: a framework for ontology-based data access in biology

    KAUST Repository

    Hoehndorf, Robert

    2015-01-28

    Background: Many ontologies have been developed in biology and these ontologies increasingly contain large volumes of formalized knowledge commonly expressed in the Web Ontology Language (OWL). Computational access to the knowledge contained within these ontologies relies on the use of automated reasoning. Results: We have developed the Aber-OWL infrastructure that provides reasoning services for bio-ontologies. Aber-OWL consists of an ontology repository, a set of web services and web interfaces that enable ontology-based semantic access to biological data and literature. Aber-OWL is freely available at http://aber-owl.net. Conclusions: Aber-OWL provides a framework for automatically accessing information that is annotated with ontologies or contains terms used to label classes in ontologies. When using Aber-OWL, access to ontologies and data annotated with them is not merely based on class names or identifiers but rather on the knowledge the ontologies contain and the inferences that can be drawn from it.

  4. Gene ontology based transfer learning for protein subcellular localization

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-02-01

    Full Text Available Abstract Background Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of the data may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources to exploit multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenate the GO terms into a flat binary vector or apply majority-vote-based ensemble learning for protein subcellular localization, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology. Results In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, so that the time and space computational complexities are greatly reduced compared to complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for
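The final merging step, a weighted linear combination of precomputed kernel matrices, can be sketched as follows. The tiny 2x2 matrices and the weights are toy assumptions; in GO-TLM the weights would come from the non-parametric cross validation described above.

```python
# Sketch: linearly merging several precomputed kernel matrices with
# explicit weights into one single kernel.
def combine_kernels(kernels, weights):
    """Weighted sum of same-shaped kernel matrices (lists of lists)."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for k, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]

k_go = [[1.0, 0.4], [0.4, 1.0]]   # a GO-similarity kernel (toy values)
k_seq = [[1.0, 0.2], [0.2, 1.0]]  # a sequence spectrum kernel (toy values)
merged = combine_kernels([k_go, k_seq], [0.75, 0.25])
# merged[0][1] -> 0.75*0.4 + 0.25*0.2, i.e. approximately 0.35
```

A nonnegative weighted sum of valid kernels is itself a valid kernel, which is why the merged matrix can be handed directly to a kernel classifier such as an SVM.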

  5. Ontology-based knowledge discovery in pharmacogenomics.

    Science.gov (United States)

    Coulet, Adrien; Smaïl-Tabbone, Malika; Napoli, Amedeo; Devignes, Marie-Dominique

    2011-01-01

    One current challenge in biomedicine is to analyze large amounts of complex biological data to extract domain knowledge. This work focuses on the use of knowledge-based techniques such as knowledge discovery (KD) and knowledge representation (KR) in pharmacogenomics, where knowledge units represent genotype-phenotype relationships in the context of a given treatment. One objective is to design a knowledge base (KB, here also referred to as an ontology) and then to use it in the KD process itself. A method is proposed for dealing with two main tasks: (1) building a KB from heterogeneous data related to genotype, phenotype, and treatment, and (2) applying KD techniques to knowledge assertions for extracting genotype-phenotype relationships. An application was carried out on a clinical trial concerned with the variability of drug response to montelukast treatment. Genotype-genotype and genotype-phenotype associations were retrieved together with new associations, allowing the extension of the initial KB. This experiment shows the potential of KR and KD processes, especially for designing the KB, checking KB consistency, and reasoning for problem solving.

  6. An Ontology-Based Service Matching Strategy in Grid Environments

    Institute of Scientific and Technical Information of China (English)

    YIN Nan; SHEN De-rong; YU Ge; KOU Yue; NIE Tie-zheng; CAO Yu

    2004-01-01

    An efficient ontology-based service searching scheme is put forward in this paper by introducing semantic information into grid systems. The ideas of ontology and OWL (Web Ontology Language) are applied to establish a uniform abstract concept model and standardization for grid services. We propose a general framework for an ontology-based service discovery sub-system, which includes an ontology storage module, a context-based domain selection module and a specific service matching module. Implementation policies are also presented in this paper.

  7. Ontology-Based e-Assessment for Accounting Education

    Science.gov (United States)

    Litherland, Kate; Carmichael, Patrick; Martínez-García, Agustina

    2013-01-01

    This summary reports on a pilot of a novel, ontology-based e-assessment system in accounting. The system, OeLe, uses emerging semantic technologies to offer an online assessment environment capable of marking students' free text answers to questions of a conceptual nature. It does this by matching their response with a "concept map" or…

  9. Six scenarios of exploiting an ontology based, mobilized learning environment

    NARCIS (Netherlands)

    Kismihók, G.; Szabó, I.; Vas, R.

    2012-01-01

    In this article, six different exploitation possibilities of an educational ontology based, mobilized learning management system are presented. The focal point of this system is the educational ontology model. The first version of this educational ontology model serves as a foundation for curriculum

  10. Ontology-based content analysis of US patent applications from 2001-2010.

    Science.gov (United States)

    Weber, Lutz; Böhme, Timo; Irmer, Matthias

    2013-01-01

    Ontology-based semantic text analysis methods make it possible to automatically extract knowledge relationships and data from text documents. In this review, we have applied these technologies to the systematic analysis of pharmaceutical patents. Hierarchical concepts from the knowledge domains of chemical compounds, diseases and proteins were used to annotate full-text US patent applications that deal with pharmacological activities of chemical compounds and were filed in the years 2001-2010. Compounds claimed in these applications have been classified into their respective compound classes to review the distribution of scaffold types or general compound classes, such as natural products, in a time-dependent manner. Similarly, the target proteins and claimed utility of the compounds have been classified and the most relevant extracted. The method presented allows the discovery of the main areas of innovation as well as emerging fields of patenting activity, providing a broad statistical basis for competitor analysis and decision-making efforts.

  11. An Ontology-Based Framework for Geographic Data Integration

    Science.gov (United States)

    Vidal, Vânia M. P.; Sacramento, Eveline R.; de Macêdo, José Antonio Fernandes; Casanova, Marco Antonio

    Ontologies have been extensively used to model domain-specific knowledge. Recent research has applied ontologies to enhance the discovery and retrieval of geographic data in Spatial Data Infrastructures (SDIs). However, in those approaches it is assumed that all the data required for answering a query can be obtained from a single data source. In this work, we propose an ontology-based framework for the integration of geographic data. In our approach, a query posed on a domain ontology is rewritten into sub-queries submitted over multiples data sources, and the query result is obtained by the proper combination of data resulting from these sub-queries. We illustrate how our framework allows the combination of data from different sources, thus overcoming some limitations of other ontology-based approaches. Our approach is illustrated by an example from the domain of aeronautical flights.

  12. An Ontology-Based Representation Architecture of Unstructured Information

    Institute of Scientific and Technical Information of China (English)

    GU Jin-guang; CHEN He-ping; CHEN Xin-meng

    2004-01-01

    Integrating the respective advantages of XML Schema and ontology, this paper puts forward a semantic information processing architecture, OBSA, to solve the problems of heterogeneity of information sources and uncertainty of semantics. It introduces an F-Logic-based semantic information presentation mechanism, presents the design of an ontology-based semantic representation language and a mapping algorithm converting ontology to XML DTD/Schema, and an adapter framework for accessing distributed and heterogeneous information.

  13. An Ontology-Based Resource Selection Service on Science Cloud

    Science.gov (United States)

    Yoo, Hyunjeong; Hur, Cinyoung; Kim, Seoyoung; Kim, Yoonhee

    Cloud computing requires scalable and cooperative sharing of resources across organizations by dynamically configuring a virtual organization according to users' requirements. An ontology-based representation of a Cloud computing environment can conceptualize common attributes among Cloud resources and describe the relations among them semantically. However, mutual compatibility among organizations remains limited because no established method yet applies ontologies to the Cloud.

  14. Towards ontology-based decision support systems for complex ultrasound diagnosis in obstetrics and gynecology.

    Science.gov (United States)

    Maurice, P; Dhombres, F; Blondiaux, E; Friszer, S; Guilbaud, L; Lelong, N; Khoshnood, B; Charlet, J; Perrot, N; Jauniaux, E; Jurkovic, D; Jouannic, J-M

    2017-05-01

    We have developed a new knowledge-based intelligent system for obstetrics and gynecology ultrasound imaging, built on an ontology and a reference image collection. This study evaluates whether the new system supports accurate annotation of ultrasound images, using the early ultrasound diagnosis of ectopic pregnancy as a model clinical issue. The ectopic pregnancy ontology was derived from medical texts (4260 ultrasound reports of ectopic pregnancy from a specialist center in the UK and 2795 PubMed abstracts indexed with the MeSH term "Pregnancy, Ectopic"), and the reference image collection was built from a selection of images from 106 publications. We conducted a retrospective analysis of the signs in 35 scans of ectopic pregnancy by six observers using the new system. The resulting ectopic pregnancy ontology consisted of 1395 terms, and 80 images were collected for the reference collection. The observers used the system to provide a total of 1486 sign annotations. The precision, recall and F-measure for the annotations were 0.83, 0.62 and 0.71, respectively. The global proportion of agreement was 40.35% (95% CI [38.64-42.05]). The ontology-based intelligent system provides accurate annotations of ultrasound images, suggesting that it may benefit non-expert operators. The precision rate is appropriate for accurate input to a computer-based clinical decision support system and could be used to support medical imaging diagnosis of complex conditions in obstetrics and gynecology. Copyright © 2017. Published by Elsevier Masson SAS.
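The precision, recall and F-measure figures reported in this abstract follow the standard definitions, which can be computed from true-positive, false-positive and false-negative annotation counts. The counts below are invented to land near the reported operating point; they are not from the study.

```python
# Precision, recall and F-measure for annotation evaluation, computed
# from hypothetical TP/FP/FN counts (numbers chosen for illustration,
# not taken from the study itself).

def prf(tp, fp, fn):
    precision = tp / (tp + fp)        # fraction of produced annotations that are correct
    recall = tp / (tp + fn)           # fraction of expected annotations that were produced
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f_measure

p, r, f = prf(tp=100, fp=20, fn=60)
# Roughly the operating point reported above: p ~0.83, r ~0.62, F ~0.71.
```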

  15. Bridging the phenotypic and genetic data useful for integrated breeding through a data annotation using the Crop Ontology developed by the crop communities of practice

    Directory of Open Access Journals (Sweden)

    Rosemary eShrestha

    2012-08-01

    Full Text Available The Crop Ontology (CO) of the Generation Challenge Program (GCP) (http://cropontology.org/) is developed for the Integrated Breeding Platform (https://www.integratedbreeding.net/) by several centers of the Consultative Group on International Agricultural Research (CGIAR): Bioversity, CIMMYT, CIP, ICRISAT, IITA, and IRRI. Integrated breeding requires that breeders access genotypic and phenotypic data related to a given trait. The Crop Ontology provides validated trait names used by the crop communities of practice for harmonizing the annotation of phenotypic and genotypic data, thus supporting data accessibility and discovery through web queries. The trait information is complemented by descriptions of the measurement methods and scales, and by images. The trait dictionaries used to produce the Integrated Breeding (IB) fieldbooks are synchronized with the Crop Ontology terms for automatic annotation of the phenotypic data measured in the field. The IB fieldbook provides breeders with direct access to the CO for additional descriptive information on the traits. Ontologies and trait dictionaries are online for cassava, chickpea, common bean, groundnut, maize, Musa, potato, rice, sorghum and wheat. Online curation and annotation tools (http://cropontology.org) facilitate direct maintenance of the trait information and production of trait dictionaries by the crop communities. An important feature is the cross-referencing of CO terms with the crop database trait IDs and with their synonyms in the Plant Ontology and Trait Ontology. Web links between cross-referenced terms in CO provide online access to data annotated with similar ontological terms, particularly the genetic data in Gramene (Cornell University) or the evaluation and climatic data in the Global Repository of evaluation trials of the Climate Change, Agriculture and Food Security programme (CCAFS). Cross-referencing and annotation will be further applied in the Integrated Breeding Platform.

  16. Bridging the phenotypic and genetic data useful for integrated breeding through a data annotation using the Crop Ontology developed by the crop communities of practice

    Science.gov (United States)

    Shrestha, Rosemary; Matteis, Luca; Skofic, Milko; Portugal, Arllet; McLaren, Graham; Hyman, Glenn; Arnaud, Elizabeth

    2012-01-01

    The Crop Ontology (CO) of the Generation Challenge Program (GCP) (http://cropontology.org/) is developed for the Integrated Breeding Platform (IBP) (http://www.integratedbreeding.net/) by several centers of the Consultative Group on International Agricultural Research (CGIAR): Bioversity, CIMMYT, CIP, ICRISAT, IITA, and IRRI. Integrated breeding requires that breeders access genotypic and phenotypic data related to a given trait. The CO provides validated trait names used by the crop communities of practice (CoP) for harmonizing the annotation of phenotypic and genotypic data, thus supporting data accessibility and discovery through web queries. The trait information is complemented by descriptions of the measurement methods and scales, and by images. The trait dictionaries used to produce the Integrated Breeding (IB) fieldbooks are synchronized with the CO terms for automatic annotation of the phenotypic data measured in the field. The IB fieldbook provides breeders with direct access to the CO for additional descriptive information on the traits. Ontologies and trait dictionaries are online for cassava, chickpea, common bean, groundnut, maize, Musa, potato, rice, sorghum, and wheat. Online curation and annotation tools (http://cropontology.org) facilitate direct maintenance of the trait information and production of trait dictionaries by the crop communities. An important feature is the cross-referencing of CO terms with the crop database trait IDs and with their synonyms in the Plant Ontology (PO) and Trait Ontology (TO). Web links between cross-referenced terms in CO provide online access to data annotated with similar ontological terms, particularly the genetic data in Gramene (Cornell University) or the evaluation and climatic data in the Global Repository of evaluation trials of the Climate Change, Agriculture and Food Security programme (CCAFS). Cross-referencing and annotation will be further applied in the IBP. PMID:22934074
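The trait-dictionary synchronization described above amounts to looking up each fieldbook trait name in a dictionary and attaching the matching ontology term ID. A minimal sketch, with an invented trait dictionary (the term IDs below are made up and do not come from the Crop Ontology):

```python
# Hypothetical trait dictionary mapping fieldbook trait names to
# Crop Ontology term IDs (the IDs are fabricated for illustration).
TRAIT_DICTIONARY = {
    "Plant height": "CO_320:0000018",
    "Grain yield": "CO_320:0000013",
}

def annotate_observations(rows, dictionary):
    """Attach an ontology term ID to each measured trait value."""
    annotated = []
    for row in rows:
        term = dictionary.get(row["trait"])  # None when no CO term is mapped
        annotated.append({**row, "ontology_term": term})
    return annotated

obs = annotate_observations(
    [{"plot": 1, "trait": "Plant height", "value": 92.5}],
    TRAIT_DICTIONARY,
)
```

Because annotation happens at data-capture time, later web queries can retrieve phenotypic records by ontology term rather than by free-text trait name.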

  17. Ontology-based specification, identification and analysis of perioperative risks.

    Science.gov (United States)

    Uciteli, Alexandr; Neumann, Juliane; Tahar, Kais; Saleh, Kutaiba; Stucke, Stephan; Faulbrück-Röhr, Sebastian; Kaeding, André; Specht, Martin; Schmidt, Tobias; Neumuth, Thomas; Besting, Andreas; Stegemann, Dominik; Portheine, Frank; Herre, Heinrich

    2017-09-06

    Medical personnel in hospitals often work under great physical and mental strain. In medical decision-making, errors can never be completely ruled out. Several studies have shown that between 50 and 60% of adverse events could have been avoided through better organization, more attention or more effective security procedures. Critical situations arise especially during interdisciplinary collaboration and the use of complex medical technology, for example during surgical interventions and in perioperative settings (the period before, during and after a surgical intervention). In this paper, we present an ontology and an ontology-based software system that can identify risks across medical processes and support the avoidance of errors, particularly in the perioperative setting. We developed a practicable definition of the risk notion that is easily understandable by medical staff and usable by software tools. Based on this definition, we developed a Risk Identification Ontology (RIO) and used it for the specification and identification of perioperative risks. An agent system was developed that gathers risk-relevant data throughout the perioperative treatment process from various sources and provides it for risk identification and analysis in a centralized fashion. The results of such an analysis are provided to the medical personnel in the form of context-sensitive hints and alerts. For the identification of the ontologically specified risks, we developed an ontology-based software module called the Ontology-based Risk Detector (OntoRiDe). About 20 risks relating to cochlear implantation (CI) have already been implemented. Comprehensive testing has indicated the correctness of the data acquisition, risk identification and analysis components, as well as the web-based visualization of results.

  18. Ontology Based Resolution of Semantic Conflicts in Information Integration

    Institute of Scientific and Technical Information of China (English)

    LU Han; LI Qing-zhong

    2004-01-01

    A semantic conflict arises when heterogeneous systems use different ways to express the same real-world entity, preventing information integration from achieving semantic coherence. Since ontologies help to solve semantic problems, this area has become a hot topic in information integration. In this paper, we introduce semantic conflicts into the information integration of heterogeneous applications, discuss the origins and categories of such conflicts, and present an ontology-based schema mapping approach to eliminate them.

  19. ONTOLOGY BASED SEMANTIC KNOWLEDGE REPRESENTATION FOR SOFTWARE RISK MANAGEMENT

    Directory of Open Access Journals (Sweden)

    C.R.Rene Robin

    2010-10-01

    Full Text Available Domain-specific knowledge representation is achieved through the use of ontologies. An ontology model of software risk management is an effective approach for communication between people in the teaching and learning community, for communication and interoperation among various knowledge-oriented applications, and for the sharing and reuse of software. However, the lack of formal representation tools for domain modeling results in taking liberties with conceptualization. This paper describes an ontology-based semantic knowledge representation mechanism, and the proposed architecture has been successfully implemented for the domain of software risk management.

  20. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. For more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and the genes used in vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (degree, eigenvector, closeness, and betweenness) were calculated to identify highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this network, a sub-network was identified consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of the retrieved interaction types are indirect, in that the two genes participate in the specified interaction process in a required but indirect manner. Our centrality analysis of
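Two of the four centrality metrics named in this abstract (degree and closeness) are simple enough to sketch in pure Python on a toy graph. The edges below reuse the vaccine gene names mentioned above but are hypothetical; they illustrate the computation, not the actual E. coli network.

```python
from collections import deque

# A toy undirected gene-interaction graph; the edges are invented
# for illustration and do not reflect the study's real network.
EDGES = [("carA", "carB"), ("carA", "fimH"), ("fimH", "fepA"), ("fepA", "vat")]

def adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree_centrality(adj):
    # Degree normalized by the maximum possible degree (n - 1).
    n = len(adj) - 1
    return {node: len(nbrs) / n for node, nbrs in adj.items()}

def closeness_centrality(adj, node):
    # BFS shortest-path distances from `node`, then (n - 1) / sum(dist).
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(dist) - 1) / sum(d for d in dist.values() if d)

adj = adjacency(EDGES)
ranking = sorted(adj, key=lambda g: degree_centrality(adj)[g], reverse=True)
```

Ranking genes by such metrics is how the study surfaces the most connected vaccine-associated genes; eigenvector and betweenness centrality follow the same pattern but need more machinery.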

  1. An Ontology Based Approach to Implement the Online Recommendation System

    Directory of Open Access Journals (Sweden)

    Vijayakumar Mohanraj

    2011-01-01

    Full Text Available Problem statement: Every web user has a different intent when accessing information on a website. The primary goal of a recommendation system is to anticipate the user's intent and recommend the web pages that contain the expected information. Effective recommendation of web pages involves two important challenges: accurately identifying the user's intent and predicting the imminent navigation pattern in such a way that the required content is provided while users browse the predicted navigation. Approach: We present an ontology-based approach to implementing a recommendation system that applies innovative web usage mining on log data to discover all possible imminent navigation patterns of the current user, and resolves uncertainties in the discovered navigation patterns by applying an ontological concept-based similarity comparison and scoring algorithm. Results: Results show that the novel web usage mining method and the ontological concept scoring algorithm, based on a website domain ontological profile, help the recommendation system to predict and present the most relevant navigation patterns to users. Conclusion: Our recommendation system confirms that an ontology-based approach can ensure excellent accuracy in predicting and capturing the future navigation patterns of web users.

  2. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources.

    Science.gov (United States)

    Jonquet, Clement; Lependu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F; Musen, Mark A; Shah, Nigam H

    2011-09-01

    The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will help accelerate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index, a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood."
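The class-hierarchy search behavior this abstract alludes to can be sketched as query expansion: a search for a general class also retrieves elements annotated with its subclasses. The tiny hierarchy and annotation index below are invented for illustration and are not the NCBO data.

```python
# Sketch of hierarchy-aware search: a query for a general ontology
# class also retrieves elements annotated with any of its subclasses.
# The hierarchy and index are hypothetical.

PARENT = {  # child class -> parent class
    "melanoma": "skin neoplasm",
    "skin neoplasm": "neoplasm",
}

INDEX = {  # data element -> ontology class it was annotated with
    "record-1": "melanoma",
    "record-2": "neoplasm",
}

def descendants_or_self(term, parent):
    """Collect `term` and every class reachable below it in the hierarchy."""
    hits = {term}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in hits and child not in hits:
                hits.add(child)
                changed = True
    return hits

def search(term):
    expansion = descendants_or_self(term, PARENT)
    return sorted(e for e, cls in INDEX.items() if cls in expansion)
```

A query for the broad class thus surfaces records annotated only with narrower terms, which is the "semantics under the hood" the user never sees.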

  3. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources

    Science.gov (United States)

    Jonquet, Clement; LePendu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F.; Musen, Mark A.; Shah, Nigam H.

    2011-01-01

    The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will help accelerate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index, a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood." PMID:21918645

  4. Toward an Ontology-Based Framework for Clinical Research Databases

    Science.gov (United States)

    Kong, Y. Megan; Dahlke, Carl; Xiang, Qun; Qian, Yu; Karp, David; Scheuermann, Richard H.

    2010-01-01

    Clinical research includes a wide range of study designs from focused observational studies to complex interventional studies with multiple study arms, treatment and assessment events, and specimen procurement procedures. Participant characteristics from case report forms need to be integrated with molecular characteristics from mechanistic experiments on procured specimens. In order to capture and manage this diverse array of data, we have developed the Ontology-Based eXtensible conceptual model (OBX) to serve as a framework for clinical research data in the Immunology Database and Analysis Portal (ImmPort). By designing OBX around the logical structure of the Basic Formal Ontology (BFO) and the Ontology for Biomedical Investigations (OBI), we have found that a relatively simple conceptual model can represent the relatively complex domain of clinical research. In addition, the common framework provided by BFO makes it straightforward to develop data dictionaries based on reference and application ontologies from the OBO Foundry. PMID:20460173

  5. A PSL Ontology-based Shop Floor Dynamical Scheduler Design

    Institute of Scientific and Technical Information of China (English)

    WANG Wei-da; XU He; PENG Gao-liang; LIU Wen-jian; Khalil Alipour

    2008-01-01

    Due to the complexity, uncertainty and dynamics of the modern manufacturing environment, a flexible and robust shop floor scheduler is essential to achieve production goals. A design framework for a shop floor dynamical scheduler is presented in this paper, and the workflow and function modules of the scheduler are discussed in detail. A multi-step adaptive scheduling strategy and a process specification language, which is an ontology-based representation of the process plan, are utilized in the proposed scheduler. The scheduler acquires the dispatching rule from the knowledge base and uses a built-in on-line simulator to evaluate the obtained rule. These technologies enable the scheduler to improve its fine-tuning ability and to effectively transfer process information to other heterogeneous information systems on a shop floor. The effectiveness of the suggested structure is demonstrated via its application in the scheduling system of a manufacturing enterprise.

  6. An Ontology-Based Framework for Modeling User Behavior

    DEFF Research Database (Denmark)

    Razmerita, Liana

    2011-01-01

    This paper focuses on the role of user modeling and semantically enhanced representations for personalization. It presents a generic Ontology-based User Modeling framework (OntobUMf), its components, and its associated user modeling processes. This framework models the behavior of the users and classifies its users according to their behavior. The user ontology is the backbone of OntobUMf and has been designed according to the Information Management System Learning Information Package (IMS LIP). The user ontology includes a Behavior concept that extends the IMS LIP specification and defines ... The results of this research may contribute to the development of other frameworks for modeling user behavior, other semantically enhanced user modeling frameworks, or other semantically enhanced information systems.

  7. Ontology-Based Information Extraction for Business Intelligence

    Science.gov (United States)

    Saggion, Horacio; Funk, Adam; Maynard, Diana; Bontcheva, Kalina

    Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.

  8. Semiotic Triangle Revisited for the Purposes of Ontology-based Terminology Management

    OpenAIRE

    Kudashev, Igor; Kudasheva, Irina

    2010-01-01

    In this paper, we examine the limitations of the traditional semiotic triangle from the point of view of ontology-based, multipurpose terminology management and suggest an alternative model based on the concept of terminological lexeme. The new model is being tested in the TermFactory project aimed at creating a platform and a workflow for distributed collaborative ontology-based terminology work.

  9. A Semantic-Oriented Approach for Organizing and Developing Annotation for E-Learning

    Science.gov (United States)

    Brut, Mihaela M.; Sedes, Florence; Dumitrescu, Stefan D.

    2011-01-01

    This paper presents a solution to extend the IEEE LOM standard with ontology-based semantic annotations for efficient use of learning objects outside Learning Management Systems. The data model corresponding to this approach is first presented. The proposed indexing technique for this model development in order to acquire a better annotation of…

  10. GeneYenta: a phenotype-based rare disease case matching tool based on online dating algorithms for the acceleration of exome interpretation.

    Science.gov (United States)

    Gottlieb, Michael M; Arenillas, David J; Maithripala, Savanie; Maurer, Zachary D; Tarailo Graovac, Maja; Armstrong, Linlea; Patel, Millan; van Karnebeek, Clara; Wasserman, Wyeth W

    2015-04-01

    Advances in next-generation sequencing (NGS) technologies have helped reveal causal variants for genetic diseases. In order to establish causality, it is often necessary to compare genomes of unrelated individuals with similar disease phenotypes to identify common disrupted genes. When working with cases of rare genetic disorders, finding similar individuals can be extremely difficult. We introduce a web tool, GeneYenta, which facilitates the matchmaking process, allowing clinicians to coordinate detailed comparisons for phenotypically similar cases. Importantly, the system is focused on phenotype annotation, with explicit limitations on highly confidential data that create barriers to participation. The procedure for matching patient phenotypes, inspired by online dating services, uses an ontology-based semantic case matching algorithm with attribute weighting. We evaluate the capacity of the system using a curated reference data set and 19 clinician-entered cases, comparing four matching algorithms. We find that the inclusion of clinician weights can augment phenotype matching.
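The attribute-weighted matching idea can be sketched as a weighted overlap between two sets of phenotype terms, where each term carries a clinician-assigned importance weight. The terms, weights, and the specific similarity formula below are hypothetical; GeneYenta's actual algorithm also exploits the ontology hierarchy, which this sketch omits.

```python
# Sketch of case matching with attribute weighting: each case is a
# dict mapping phenotype term -> clinician-assigned weight, and the
# similarity score is the weighted overlap over the total weight.
# Terms, weights, and formula are illustrative, not GeneYenta's own.

def weighted_similarity(case_a, case_b):
    shared = set(case_a) & set(case_b)
    overlap = sum(case_a[t] + case_b[t] for t in shared)
    total = sum(case_a.values()) + sum(case_b.values())
    return overlap / total if total else 0.0

patient = {"seizures": 3.0, "ataxia": 2.0, "microcephaly": 1.0}
candidate = {"seizures": 3.0, "microcephaly": 2.0}

score = weighted_similarity(patient, candidate)
# Heavily weighted shared terms (here, seizures) dominate the score.
```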

  11. Ontology-based geospatial data query and integration

    Science.gov (United States)

    Zhao, T.; Zhang, C.; Wei, M.; Peng, Z.-R.

    2008-01-01

    Geospatial data sharing is an increasingly important subject as large amounts of data are produced by a variety of sources, stored in incompatible formats, and accessible through different GIS applications. Past efforts to enable sharing have produced standardized data formats such as GML and data access protocols such as the Web Feature Service (WFS). While these standards help client applications gain access to heterogeneous data stored in different formats from diverse sources, the usability of the access is limited by the lack of data semantics encoded in the WFS feature types. Past research has used ontology languages to describe the semantics of geospatial data, but ontology-based queries cannot be applied directly to legacy data stored in databases or shapefiles, or to feature data in WFS services. This paper presents a method to enable ontology queries on spatial data available from WFS services and on data stored in databases. We do not create ontology instances explicitly and thus avoid the problems of data replication. Instead, user queries are rewritten into WFS getFeature requests and SQL queries against the database. The method also has the benefit of being able to utilize existing tools for databases, WFS, and GML while enabling queries based on ontology semantics. © 2008 Springer-Verlag Berlin Heidelberg.
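The SQL half of the rewriting step described above can be sketched with a mapping table that links each ontology class and property to a legacy table and column. The mapping, schema names, and generated SQL below are hypothetical, for illustration only.

```python
# Sketch of rewriting an ontology-level query into SQL: a mapping
# links each ontology class/property to a legacy table/column.
# The mapping and schema names are invented for illustration.

MAPPING = {
    "River": {
        "table": "hydro_lines",
        "props": {"name": "rv_name", "length": "len_km"},
    },
}

def rewrite(concept, props, mapping):
    """Turn a (concept, properties) ontology query into a SELECT statement."""
    m = mapping[concept]
    columns = ", ".join(m["props"][p] for p in props)
    return f"SELECT {columns} FROM {m['table']}"

sql = rewrite("River", ["name", "length"], MAPPING)
# sql == "SELECT rv_name, len_km FROM hydro_lines"
```

Because only the query is rewritten, the legacy data never has to be materialized as ontology instances, which is the replication problem the paper avoids.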

  12. OJADEAC: An Ontology Based Access Control Model for JADE Platform

    Directory of Open Access Journals (Sweden)

    Ban Sharief Mustafa

    2014-06-01

    Full Text Available The Java Agent Development Framework (JADE) is a software framework that eases the development of multi-agent applications in compliance with the Foundation for Intelligent Physical Agents (FIPA) specifications. JADE proposes new infrastructure solutions to support the development of useful and convenient distributed applications. Security is one of the most important issues in implementing and deploying such applications. The JADE-S security add-on is one of the most popular security solutions for the JADE platform. It provides several security services, including authentication, authorization, signature and encryption. The authorization service grants the authority to perform an action based on a set of permission objects attached to every authenticated user. This service has several drawbacks when implemented in scalable, distributed, context-aware applications. In this paper, an ontology-based access control model called OJADEAC is proposed for the JADE platform, combining Semantic Web technologies with a context-aware policy mechanism to overcome the shortcomings of this service. The access control model is represented by a semantic ontology and a set of two-level semantic rules representing platform and application-specific policy rules. The OJADEAC model is distributed, intelligent, dynamic and context-aware, and uses a reasoning engine to infer access decisions based on ontology knowledge.

  13. Ontology based Knowledge Management for Administrative Processes of University

    Directory of Open Access Journals (Sweden)

    Anand Kumar

    2015-07-01

    Full Text Available Knowledge management is a challenging task, especially in administrative processes with typical workflows such as those of higher educational institutions and universities. We have proposed a system, aSPOCMS (an Agent-based Semantic Web for Paperless Office Content Management System), that aims at providing a paperless environment for the typical workflows of universities; this requires ontology-based knowledge management to manage the files and documents of various departments and sections of a university. In the Semantic Web, an ontology describes the concepts, the relationships among the concepts, and the properties within a domain. It enables automatic inference and interoperability between applications, which is an appropriate vision for knowledge management. In this paper we discuss how Semantic Web technology can be utilized in higher educational institutions for the knowledge representation of various resources and for handling administrative processes. This requires the exploitation of knowledge of various university resources, such as departments, schools, sections, files and employees, by aSPOCMS, which is built as an agent-based system using the ontology for communication between agents and users, and for knowledge representation and management.

  14. SPATIAL DATA INTEGRATION USING ONTOLOGY-BASED APPROACH

    Directory of Open Access Journals (Sweden)

    S. Hasani

    2015-12-01

    Full Text Available In today's world, the need for spatial data in various organizations has become so crucial that many of them have begun to produce such data themselves. In some circumstances, obtaining integrated data in real time requires a sustainable mechanism for real-time integration. A case in point is disaster management, which requires obtaining real-time data from various sources of information. One of the problematic challenges in such situations is the high degree of heterogeneity between the data of different organizations. To address this issue, we introduce an ontology-based method that provides sharing and integration capabilities for existing databases. In addition to resolving semantic heterogeneity, our proposed method also provides better access to information. The approach consists of three steps. In the first step, the objects in a relational database are identified, the semantic relationships between them are modelled, and the ontology of each database is created. In the second step, the ontology is attached to the database, and the relationship of each ontology class is inserted into a newly created column in the database tables. The last step consists of a platform based on a service-oriented architecture that allows the integration of data using the concept of ontology mapping. The proposed approach, in addition to being fast and low-cost, makes the process of data integration easy; the data remain unchanged, and existing legacy applications thus continue to work.

  15. VuWiki: An Ontology-Based Semantic Wiki for Vulnerability Assessments

    Science.gov (United States)

    Khazai, Bijan; Kunz-Plapp, Tina; Büscher, Christian; Wegner, Antje

    2014-05-01

    The concept of vulnerability, as well as its implementation in vulnerability assessments, is used in various disciplines and contexts ranging from disaster management and reduction to ecology, public health or climate change and adaptation, and a corresponding multitude of ideas about how to conceptualize and measure vulnerability exists. Three decades of research in vulnerability have generated a complex and growing body of knowledge that challenges newcomers, practitioners and even experienced researchers. To provide a structured representation of the knowledge field "vulnerability assessment", we have set up an ontology-based semantic wiki for reviewing and representing vulnerability assessments: VuWiki, www.vuwiki.org. Based on a survey of 55 vulnerability assessment studies, we first developed an ontology as an explicit reference system for describing vulnerability assessments. We developed the ontology in a theoretically controlled manner based on general systems theory and guided by principles for ontology development in the field of earth and environment (Raskin and Pan 2005). Four key questions form the first level "branches" or categories of the developed ontology: (1) Vulnerability of what? (2) Vulnerability to what? (3) What reference framework was used in the vulnerability assessment?, and (4) What methodological approach was used in the vulnerability assessment? These questions correspond to the basic, abstract structure of the knowledge domain of vulnerability assessments and have been deduced from theories and concepts of various disciplines. The ontology was then implemented in a semantic wiki which allows for the classification and annotation of vulnerability assessments. As a semantic wiki, VuWiki does not aim at "synthesizing" a holistic and overarching model of vulnerability. Instead, it provides both scientists and practitioners with a uniform ontology as a reference system and offers easy and structured access to the knowledge field of

  16. The MECA project : Ontology-based data portability for space missions

    NARCIS (Netherlands)

    Breebaart, L.; Bos, A.; Grant, T.; Neerincx, M.; Smets, N.; Lindenberg, J.; Soler, A.O.; Brauer, U.; Wolff, M.

    2009-01-01

    This article describes the authors' experiences with a pragmatic, ontology-based approach to data portability and knowledge sharing, as used in the first Mission Execution Crew Assistant (MECA) Proof-of-concept demonstrator software. © 2009 IEEE.

  17. CoryneRegNet: An ontology-based data warehouse of corynebacterial transcription factors and regulatory networks

    Directory of Open Access Journals (Sweden)

    Czaja Lisa F

    2006-02-01

    Background: The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. These data, along with literature-derived knowledge on the regulation of gene expression, have opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. Description: CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration of data derived from experimental studies or literature knowledge, CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in the SOS and stress response of corynebacteria. Conclusion: CoryneRegNet is an ontology-based data warehouse that allows pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.

  18. Fast Gene Ontology based clustering for microarray experiments

    OpenAIRE

    Ovaska Kristian; Laakso Marko; Hautaniemi Sampsa

    2008-01-01

    Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fa...

  19. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research [v1; ref status: indexed, http://f1000r.es/p5

    Directory of Open Access Journals (Sweden)

    Sebastian Köhler

    2013-02-01

    Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline in our continuous integration system, ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
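
    The semantic similarity analyses mentioned above compare terms by their positions in the ontology. The following sketch (not from the paper; the tiny is-a DAG and term names are invented) shows one simple metric: the Jaccard overlap of ancestor sets.

```python
# Illustrative sketch: comparing two phenotype terms via shared
# ontology ancestors, one simple basis for cross-species semantic
# similarity. The toy ontology below is a made-up is-a DAG.

TOY_ONTOLOGY = {  # child -> parents
    "abnormal_heart": ["abnormal_organ"],
    "abnormal_fin": ["abnormal_organ"],
    "abnormal_organ": ["phenotype"],
    "phenotype": [],
}

def ancestors(term, onto):
    """Return the term plus all of its transitive ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in onto[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def term_similarity(t1, t2, onto):
    """Jaccard overlap of ancestor sets."""
    a1, a2 = ancestors(t1, onto), ancestors(t2, onto)
    return len(a1 & a2) / len(a1 | a2)

print(term_similarity("abnormal_heart", "abnormal_fin", TOY_ONTOLOGY))
```

    Real pipelines typically weight shared ancestors by information content rather than counting them, but the traversal logic is the same.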

  20. Annotated English

    CERN Document Server

    Hernandez-Orallo, Jose

    2010-01-01

    This document presents Annotated English, a system of diacritical symbols which turns English pronunciation into a precise and unambiguous process. The annotations are defined and located in such a way that the original English text is not altered (not even a letter), thus allowing for a consistent reading and learning of the English language with and without annotations. The annotations are based on a set of general rules that keep the frequency of annotations from being dramatically high. This lets the reader easily associate annotations with exceptions, and makes it possible to shape, internalise and consolidate some rules for the English language which otherwise are weakened by the enormous amount of exceptions in English pronunciation. The advantages of this annotation system are manifold. Any existing text can be annotated without a significant increase in size. This means that we can get an annotated version of any document or book with the same number of pages and font size. Since no letter is affected, the ...

  1. Comparison of concept recognizers for building the Open Biomedical Annotator

    Science.gov (United States)

    Shah, Nigam H; Bhatia, Nipun; Jonquet, Clement; Rubin, Daniel; Chiang, Annie P; Musen, Mark A

    2009-01-01

    The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers – NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvements for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data. PMID:19761568
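
    The core of such a comparison is a dictionary matcher plus precision/recall scoring. Here is a minimal sketch in that spirit (not Mgrep or MetaMap; the two-entry dictionary and the gold annotations are invented for illustration).

```python
# Minimal dictionary-based concept recognizer with precision/recall
# scoring. DICTIONARY maps invented term labels to invented concept IDs.
import re

DICTIONARY = {"melanoma": "DOID:1909", "breast cancer": "DOID:1612"}

def recognize(text):
    """Return the set of concept IDs whose labels occur as whole words."""
    found = set()
    lowered = text.lower()
    for label, concept_id in DICTIONARY.items():
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(concept_id)
    return found

def precision_recall(predicted, gold):
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

pred = recognize("Gene expression profiles in melanoma samples.")
print(precision_recall(pred, {"DOID:1909", "DOID:1612"}))
```

    Production recognizers add tokenization, normalization, and longest-match handling, which is where speed and scalability differences such as those reported here arise.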

  2. On the ontology based representation of cell lines.

    Directory of Open Access Journals (Sweden)

    Matthias Ganzinger

    Cell lines are frequently used as highly standardized and reproducible in vitro models for biomedical analyses and assays. Cell lines are distributed by cell banks that operate databases describing their products. However, the description of the cell lines' properties is not standardized across different cell banks. Existing cell line-related ontologies mostly focus on the description of the cell lines' names, but do not cover aspects like the origin or optimal growth conditions. The objective of this work is to develop an ontology that allows for a more comprehensive description of cell lines and their metadata, which should cover the data elements provided by cell banks. This will provide the basis for the standardized annotation of cell lines and corresponding assays in biomedical research. In addition, the ontology will be the foundation for automated evaluation of such assays and their respective protocols in the future. To accomplish this, a broad range of cell bank databases as well as existing ontologies were analyzed in a comprehensive manner. We identified existing ontologies capable of covering different aspects of the cell line domain. However, not all data fields derived from the cell banks' databases could be mapped to existing ontologies. As a result, we created a new ontology called the cell culture ontology (CCONT), integrating existing ontologies where possible. CCONT provides classes from the areas of cell line identification, origin, cell line properties, propagation and tests performed.

  3. TU-CD-BRB-07: Identification of Associations Between Radiologist-Annotated Imaging Features and Genomic Alterations in Breast Invasive Carcinoma, a TCGA Phenotype Research Group Study

    Energy Technology Data Exchange (ETDEWEB)

    Rao, A; Net, J [University of Miami, Miami, Florida (United States); Brandt, K [Mayo Clinic, Rochester, Minnesota (United States); Huang, E [National Cancer Institute, NIH, Bethesda, MD (United States); Freymann, J; Kirby, J [Leidos Biomedical Research Inc., Frederick, MD (United States); Burnside, E [University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin (United States); Morris, E; Sutton, E [Memorial Sloan Kettering Cancer Center, New York, NY (United States); Bonaccio, E [Roswell Park Cancer Institute, Buffalo, NY (United States); Giger, M; Jaffe, C [Univ Chicago, Chicago, IL (United States); Ganott, M; Zuley, M [University of Pittsburgh Medical Center - Magee Womens Hospital, Pittsburgh, Pennsylvania (United States); Le-Petross, H [MD Anderson Cancer Center, Houston, TX (United States); Dogan, B [UT MDACC, Houston, TX (United States); Whitman, G [UTMDACC, Houston, TX (United States)

    2015-06-15

    Purpose: To determine associations between radiologist-annotated MRI features and genomic measurements in breast invasive carcinoma (BRCA) from the Cancer Genome Atlas (TCGA). Methods: 98 TCGA patients with BRCA were assessed by a panel of radiologists (TCGA Breast Phenotype Research Group) based on a variety of mass and non-mass features according to the Breast Imaging Reporting and Data System (BI-RADS). Batch-corrected gene expression data were obtained from the TCGA Data Portal. The Kruskal-Wallis test was used to assess correlations between categorical image features and tumor-derived genomic features (such as gene pathway activity, copy number and mutation characteristics). Image-derived features were also correlated with estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2/neu) status. Multiple hypothesis correction was done using Benjamini-Hochberg FDR. Associations at an FDR of 0.1 were selected for interpretation. Results: ER status was associated with rim enhancement and peritumoral edema. PR status was associated with internal enhancement. Several components of the PI3K/Akt pathway were associated with rim enhancement as well as heterogeneity. In addition, several components of cell cycle regulation and cell division were associated with imaging characteristics. TP53 and GATA3 mutations were associated with lesion size. MRI features associated with TP53 mutation status were rim enhancement and peritumoral edema. Rim enhancement was associated with activity of RB1, PIK3R1, MAP3K1, AKT1, PI3K, and PIK3CA. Margin status was associated with HIF1A/ARNT, Ras/GTP/PI3K, KRAS, and GADD45A. Axillary lymphadenopathy was associated with RB1 and BCL2L1. Peritumoral edema was associated with Aurora A/GADD45A, BCL2L1, CCNE1, and FOXA1. Heterogeneous internal nonmass enhancement was associated with EGFR, PI3K, AKT1, HGF/MET, and EGFR/Erbb4/neuregulin 1. Diffuse nonmass enhancement was associated with HGF/MET/MUC20/SHIP
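
    The multiple-testing step used above can be sketched in a few lines. The p-values below are invented; in the study they would come from the Kruskal-Wallis tests between image features and genomic measurements.

```python
# Benjamini-Hochberg step-up procedure: reject the k smallest p-values,
# where k is the largest rank with p_(k) <= (k/m) * q, controlling the
# false discovery rate at level q.

def benjamini_hochberg(pvalues, q=0.1):
    """Return sorted indices of hypotheses rejected at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k_max = rank
    return sorted(order[:k_max])

pvals = [0.001, 0.2, 0.03, 0.5, 0.04]
print(benjamini_hochberg(pvals, q=0.1))  # -> [0, 2, 4]
```

    Note the step-up character: a p-value may be rejected even if it exceeds its own threshold, as long as some larger p-value passes its threshold.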

  4. ELSE: An Ontology-Based System Integrating Semantic Search and E-Learning Technologies

    Science.gov (United States)

    Barbagallo, A.; Formica, A.

    2017-01-01

    ELSE (E-Learning for the Semantic ECM) is an ontology-based system which integrates semantic search methodologies and e-learning technologies. It has been developed within a project of the CME (Continuing Medical Education) program--ECM (Educazione Continua nella Medicina) for Italian participants. ELSE allows the creation of e-learning courses…

  5. AN ONTOLOGY-BASED TOURISM RECOMMENDER SYSTEM BASED ON SPREADING ACTIVATION MODEL

    Directory of Open Access Journals (Sweden)

    Z. Bahramian

    2015-12-01

    A tourist has time and budget limitations; hence, he needs to select points of interest (POIs) optimally. Since the available information about POIs is overwhelming, it is difficult for a tourist to select the most appropriate ones considering his preferences. In this paper, a new travel recommender system is proposed to overcome the information overload problem. A recommender system (RS) evaluates the overwhelming number of POIs and provides personalized recommendations to users based on their preferences. A content-based recommendation system is proposed, which uses information about the user's preferences and the POIs and calculates a degree of similarity between them. It selects the POIs that have the highest similarity with the user's preferences. The proposed content-based recommender system is enhanced using ontological information about the tourism domain to represent both the user profile and the recommendable POIs. The proposed ontology-based recommendation process is performed in three steps: an ontology-based content analyzer, an ontology-based profile learner, and an ontology-based filtering component. User feedback adapts the user's preferences using a Spreading Activation (SA) strategy. Results show that the proposed recommender system is effective and improves the overall performance of traditional content-based recommender systems.
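
    The spreading-activation idea can be sketched as follows: activation assigned to concepts the user liked flows to related concepts in the ontology graph, decayed at each hop. The graph, decay factor, and concept names below are illustrative assumptions, not the paper's model.

```python
# Toy spreading activation over a small tourism-concept graph.
# Starting from a liked concept, activation propagates to neighbors,
# multiplied by a decay factor at each step.

GRAPH = {  # concept -> related concepts
    "museum": ["history", "architecture"],
    "history": ["monument"],
    "architecture": ["monument"],
    "monument": [],
}

def spread(seeds, graph, decay=0.5, steps=2):
    """Propagate activation from seed concepts through the graph."""
    activation = dict(seeds)
    for _ in range(steps):
        nxt = dict(activation)
        for node, value in activation.items():
            for neighbor in graph.get(node, []):
                nxt[neighbor] = nxt.get(neighbor, 0.0) + value * decay
        activation = nxt
    return activation

result = spread({"museum": 1.0}, GRAPH)
print(result)
```

    Concepts reached only indirectly ("monument" here) end up with lower activation, which is how indirect interests get weaker recommendation weight.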

  6. Ontology-based Malaria Parasite Stage and Species Identification from Peripheral Blood Smear Images

    NARCIS (Netherlands)

    Makkapati, V.; Rao, R.

    2011-01-01

    The diagnosis and treatment of malaria infection requires detecting the presence of malaria parasites in the patient as well as identification of the parasite species. We present an image processing-based approach to detect parasites in microscope images of blood smears and an ontology-based classification

  7. Ontology-based high-level context inference for human behavior identification

    NARCIS (Netherlands)

    Villalonga, Claudia; Razzaq, Muhammad Asif; Ali Khan, Wajahat; Pomares, Hector; Rojas, Ignacio; Lee, Sungyoung; Banos Legran, Oresti

    2016-01-01

    Recent years have witnessed huge progress in the automatic identification of individual primitives of human behavior, such as activities or locations. However, the complex nature of human behavior demands more abstract contextual information for its analysis. This work presents an ontology-based

  8. A Novel Mobile Video Community Discovery Scheme Using Ontology-Based Semantical Interest Capture

    Directory of Open Access Journals (Sweden)

    Ruiling Zhang

    2016-01-01

    Leveraging network virtualization technologies, community-based video systems rely on the measurement of common interests to define steady relationships between community members, which promotes video-sharing performance and improves the scalability of the community structure. In this paper, we propose a novel mobile Video Community discovery scheme using Ontology-based Semantical Interest capture (VCOSI). An ontology-based semantic extension approach is proposed, which describes video content and measures video similarity according to video keyword selection methods. In order to reduce the computational load of video similarity, VCOSI designs a prefix-filtering-based estimation algorithm to decrease the energy consumption of mobile nodes. VCOSI further proposes a member relationship estimation method to construct scalable and resilient node communities, which promotes the video-sharing capacity of video systems with flexible and economical community maintenance. Extensive tests show that VCOSI obtains better performance than other state-of-the-art solutions.
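
    Prefix filtering is a standard way to cut down pairwise similarity computation: under a fixed global token order, two keyword sets can reach a Jaccard similarity of at least t only if their prefixes share a token. The sketch below (keyword sets and the alphabetical global order are illustrative, not VCOSI's actual algorithm) shows the filter next to the exact measure.

```python
import math

# Prefix filter for Jaccard similarity: if the prefixes of two token
# sets are disjoint, the pair cannot reach threshold t and the exact
# similarity computation can be skipped.

def prefix(tokens, t):
    ordered = sorted(tokens)  # global order; frequency order in practice
    keep = len(ordered) - math.ceil(t * len(ordered)) + 1
    return set(ordered[:keep])

def may_match(a, b, t):
    """Cheap filter: True if a and b could have Jaccard >= t."""
    return bool(prefix(a, t) & prefix(b, t))

def jaccard(a, b):
    return len(a & b) / len(a | b)

a = {"goal", "match", "soccer"}
b = {"goal", "soccer", "replay"}
c = {"cat", "kitten", "purr"}
print(may_match(a, c, 0.6), jaccard(a, b))
```

    The filter never discards a true match (it may keep false candidates), so it preserves correctness while saving the energy cost of exact comparisons on mobile nodes.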

  9. ONTOLOGY BASED MEANINGFUL SEARCH USING SEMANTIC WEB AND NATURAL LANGUAGE PROCESSING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    K. Palaniammal

    2013-10-01

    The semantic web extends the current World Wide Web by adding facilities for the machine-understood description of meaning. The ontology-based search model is used to enhance the efficiency and accuracy of information retrieval. Ontology is the core technology for the semantic web and provides a mechanism for representing formal and shared domain descriptions. In this paper, we propose ontology-based meaningful search using semantic web and Natural Language Processing (NLP) techniques in the educational domain. First we build the educational ontology; then we present the semantic search system. The search model consists of three parts: embedded spell checking, finding synonyms using the WordNet API, and querying the ontology using the SPARQL language. The results are sensitive to both spell checking and synonymous context. This approach provides more accurate results and the complete details for the selected field in a single page.
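
    The final step, querying the ontology with SPARQL, boils down to matching triple patterns with variables against the ontology's triples. As a dependency-free sketch of that matching (the educational triples and the `?course` variable are invented; a real system would use a SPARQL engine), consider:

```python
# Matching a single SPARQL-like triple pattern against an in-memory
# triple store. Pattern components starting with "?" are variables.

TRIPLES = [
    ("algebra", "partOf", "mathematics"),
    ("geometry", "partOf", "mathematics"),
    ("mathematics", "taughtIn", "grade9"),
]

def query(pattern, triples):
    """Return variable bindings for each triple matching the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for p, value in zip(pattern, triple):
            if p.startswith("?"):
                binding[p] = value      # bind variable to this component
            elif p != value:
                break                   # constant mismatch: no match
        else:
            results.append(binding)
    return results

print(query(("?course", "partOf", "mathematics"), TRIPLES))
```

    This corresponds to the SPARQL query `SELECT ?course WHERE { ?course :partOf :mathematics }`; full SPARQL additionally joins multiple patterns on shared variables.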

  10. An Ontology Based Reuse Algorithm towards Process Planning in Software Development

    Directory of Open Access Journals (Sweden)

    Shilpa Sharma

    2011-09-01

    The process planning task for specified design provisions in software development can be significantly improved by referencing a knowledge reuse scheme. Reuse is considered one of the most promising techniques for improving software quality and productivity. Reuse during software development depends greatly on the existing design knowledge in the meta-model, a "read only" repository of information. We have proposed an ontology-based reuse algorithm for process planning in software development. According to the common conceptual base facilitated by the ontology and the characteristics of the knowledge, the concepts and entities are represented in the meta-model and endeavor prospects. The relations between these prospects and their linkage knowledge are used to construct an ontology-based reuse algorithm. In addition, our experiment illustrates the realization of process planning in software development by applying this algorithm. Subsequently, its benefits are delineated.

  11. A Generalized Framework for Ontology-Based Information Retrieval Application to a public-transportation system

    OpenAIRE

    2014-01-01

    In this paper we present a generic framework for ontology-based information retrieval. We focus on the recognition of semantic information extracted from data sources and the mapping of this knowledge into an ontology. In order to achieve more scalability, we propose an approach for semantic indexing based on an entity retrieval model. In addition, we have used an ontology of the public-transportation domain in order to validate these proposals. Finally, we evaluated our system using ontology mapping and ...

  12. TOWARDS AN ONTOLOGY-BASED MULTI-AGENT MEDICAL INFORMATION SYSTEM BASED ON THE WEB

    Institute of Scientific and Technical Information of China (English)

    张全海; 施鹏飞

    2002-01-01

    This paper describes an ontology-based multi-agent knowledge model (MAKM), a type of multi-agent system (MAS), which uses a semantic network to describe agents and help locate relevant agents distributed in a workgroup. In MAKM, an agent is the entity that implements distributed task processing and accesses information or knowledge. The Knowledge Query and Manipulation Language (KQML) is adopted to realize communication among agents. Using the MAKM model, different knowledge and information in the medical domain can be organized and utilized efficiently when a collaborative task is implemented on the web.

  13. An Ontology-Based Framework for Semi-Automatic Schema Integration

    Institute of Scientific and Technical Information of China (English)

    Zille Huma; Muhammad Jaffar-Ur Rehman; Nadeem Iftikhar

    2005-01-01

    Currently, schema integration frameworks use approaches such as rule-based strategies and machine learning. This paper presents an ontology-based wrapper-mediator framework that uses both rule-based and machine-learning strategies at the same time. The proposed framework uses global and local ontologies for resolving syntactic and semantic heterogeneity, and XML for interoperability. The concepts in the candidate schemas are merged on the basis of a similarity coefficient, which is calculated using the defined rules and the prior mappings stored in the case-base.

  14. Modeling Project Management Competences: An Ontology-Based Solution for Competency-Based Learning

    Science.gov (United States)

    Bodea, Constanţa-Nicoleta; Dascălu, Maria-Iuliana

    Due to growing demand for skilled workers, education should value outcomes and address students' real performance in life. A learning process turns out to be good when the degree of transformation made possible through that process is high, or when the degree of competences increases. The current paper identifies e-learning as a suitable activity for competence development. The authors also argue that a proper competence modeling solution would increase the efficiency of competence-based learning. Consequently, an ontology-based solution is presented for the project management domain.

  15. Ontology-based time information representation of vaccine adverse events in VAERS for temporal analysis

    Directory of Open Access Journals (Sweden)

    Tao Cui

    2012-12-01

    Background: The U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) provides a valuable data source for post-vaccination adverse event analyses. The structured data in the system has been widely used, but the information in the write-up narratives is rarely included in these kinds of analyses. In fact, the unstructured nature of the narratives makes the data embedded in them difficult to use for further studies. Results: We developed an ontology-based approach to represent the data in the narratives in a "machine-understandable" way, so that it can be easily queried and further analyzed. Our focus is the time aspect of the data, for time-trending analysis. The Time Event Ontology (TEO), Ontology of Adverse Events (OAE), and Vaccine Ontology (VO) are leveraged for the semantic representation of this purpose. A VAERS case report is presented as a use case for the ontological representations. The advantages of using our ontology-based Semantic Web representation and data analysis are emphasized. Conclusions: We believe that representing both the structured data and the data from write-up narratives in an integrated, unified, and "machine-understandable" way can improve research on vaccine safety analyses, causality assessments, and retrospective studies.

  16. [Implementation of ontology-based clinical decision support system for management of interactions between antihypertensive drugs and diet].

    Science.gov (United States)

    Park, Jeong Eun; Kim, Hwa Sun; Chang, Min Jung; Hong, Hae Sook

    2014-06-01

    The influence of dietary composition on blood pressure is an important subject in healthcare. Interactions between antihypertensive drugs and diet (IBADD) are the most important factor in the management of hypertension. It is therefore essential to support healthcare providers' decision-making role in active and continuous interaction control in hypertension management. The aim of this study was to implement an ontology-based clinical decision support system (CDSS) for IBADD management (IBADDM). We considered the concepts of antihypertensive drugs and foods, and focused on the interchangeability between the database and the CDSS when providing tailored information. An ontology-based CDSS for IBADDM was implemented in eight phases: (1) determining the domain and scope of the ontology, (2) reviewing existing ontologies, (3) extracting and defining the concepts, (4) assigning relationships between concepts, (5) creating a conceptual map with CmapTools, (6) selecting an upper ontology, (7) formally representing the ontology with Protégé (ver. 4.3), (8) implementing an ontology-based CDSS as a JAVA prototype application. We extracted 5,926 concepts and 15 properties, and formally represented them using Protégé. An ontology-based CDSS for IBADDM was implemented and the evaluation score was 4.60 out of 5. We endeavored to map functions of a CDSS and implement an ontology-based CDSS for IBADDM.
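
    At runtime, such a CDSS reduces to looking up interaction rules over the drug and food classes defined in the ontology. A minimal sketch of that lookup (the rule table entries are illustrative placeholders, not clinical guidance and not the study's actual rule base):

```python
# Toy drug-food interaction check of the kind an IBADD CDSS performs
# once drug and food classes have been resolved via the ontology.

INTERACTIONS = {
    ("ACE_inhibitor", "high_potassium_food"): "risk of hyperkalemia",
    ("calcium_channel_blocker", "grapefruit"): "increased drug levels",
}

def check_interaction(drug_class, food_class):
    """Return a warning string, or None if no rule matches."""
    return INTERACTIONS.get((drug_class, food_class))

print(check_interaction("calcium_channel_blocker", "grapefruit"))
```

    The value of the ontology is upstream of this lookup: a specific drug or dish is classified into the right class (with inheritance along the hierarchy) before the rule table is consulted.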

  17. Enhancing Users' Participation in Business Process Modeling through Ontology-Based Training

    Science.gov (United States)

    Macris, A.; Malamateniou, F.; Vassilacopoulos, G.

    Successful business process design requires the active participation of users who are familiar with organizational activities and business process modelling concepts. Hence, there is a need to provide users with reusable, flexible, agile and adaptable training material in order to enable them to instil their knowledge and expertise in business process design and automation activities. Knowledge reusability is of paramount importance in designing training material on process modelling, since it enables users to participate actively in process design/redesign activities stimulated by the changing business environment. This paper presents a prototype approach for the design and use of training material that provides significant advantages to both the designer (knowledge-content reusability and semantic-web enabling) and the user (semantic search, knowledge navigation and knowledge dissemination). The approach is based on externalizing domain knowledge in the form of ontology-based knowledge networks (i.e. training scenarios serving specific training needs) so that it is made reusable.

  18. Ontology-Based Clinical Decision Support System for Predicting High-Risk Pregnant Woman

    Directory of Open Access Journals (Sweden)

    Umar Manzoor

    2015-12-01

    According to the Pakistan Medical and Dental Council (PMDC), Pakistan is facing a shortage of approximately 182,000 medical doctors. Due to this shortage, a large number of lives are in danger, especially those of pregnant women. A large number of pregnant women die every year due to pregnancy complications, and usually the reason behind their deaths is that the complications are not handled in time. In this paper, we propose an ontology-based clinical decision support system that diagnoses high-risk pregnant women and refers them to qualified medical doctors for timely treatment. The ontology of the proposed system is built automatically and enhanced afterward using doctors' feedback. The proposed framework has been tested on a large number of test cases; experimental results are satisfactory and support the implementation of the solution.

  19. Analysis of Deviations in an Agent and Ontology-Based Dialogue Management System

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Algorithms for detecting deviations from a dialogue topic in an agent- and ontology-based dialogue management system (AODMS) are proposed. In AODMS, agents and ontologies are introduced to represent domain knowledge, and general algorithms that model dialogue phenomena in different domains can be realized, since complex relationships between knowledge in different domains can be described by ontologies. An evaluation of the dialogue management system with deviation-judging algorithms on 736 utterances shows that AODMS is able to talk about the given topic consistently and answer 86.6% of the utterances, while only 72.1% of the utterances can be responded to correctly without the deviation-judging module.

  20. An Ontology-Based Approach for Semantic Conflict Resolution in Database Integration

    Institute of Scientific and Technical Information of China (English)

    Qiang Liu; Tao Huang; Shao-Hua Liu; Hua Zhong

    2007-01-01

    An important task in database integration is to resolve data conflicts on both the schema level and the semantic level; the latter is especially difficult. Some existing ontology-based approaches have been criticized for their lack of domain generality and semantic richness. With the aim of overcoming these limitations, this paper introduces a systematic approach for detecting and resolving various semantic conflicts in heterogeneous databases, which includes two important parts: a semantic conflict representation model based on our classification framework of semantic conflicts, and a methodology for detecting and resolving semantic conflicts based on this model. The system has been developed, and experimental evaluations indicate that this approach can resolve many of the semantic conflicts effectively while remaining independent of domains and integration patterns.

  1. Ontology-Based Big Dimension Modeling in Data Warehouse Schema Design

    DEFF Research Database (Denmark)

    Xiufeng, Liu; Iftikhar, Nadeem

    2013-01-01

    During data warehouse schema design, designers often encounter how to model big dimensions that typically contain a large number of attributes and records. To investigate effective approaches for modeling big dimensions is necessary in order to achieve better query performance, with respect...... partitioning, vertical partitioning and their hybrid. We formalize the design methods and propose an algorithm that describes the modeling process from an OWL ontology to a data warehouse schema. In addition, this paper also presents an effective ontology-based tool to automate the modeling process. The tool...... can automatically generate the data warehouse schema from the ontology of describing the terms and business semantics for the big dimension. In case of any change in the requirements, we only need to modify the ontology, and re-generate the schema using the tool. This paper also evaluates the proposed...
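
    Of the partitioning strategies mentioned in the abstract, vertical partitioning is easy to illustrate: frequently queried attributes of the big dimension stay in the main table, while rarely used ones move to an extension table joined on the surrogate key. The rows and the hot/cold attribute split below are invented for illustration, not taken from the paper.

```python
# Toy vertical partitioning of a big customer dimension: split each row
# into a "hot" part (frequently queried) and a "cold" part (rarely
# queried), both keyed by the surrogate key cust_key.

ROWS = [
    {"cust_key": 1, "name": "Ann", "city": "Aarhus", "fax": "n/a", "notes": ""},
    {"cust_key": 2, "name": "Bo", "city": "Odense", "fax": "123", "notes": "vip"},
]
HOT = ["cust_key", "name", "city"]    # frequently used attributes
COLD = ["cust_key", "fax", "notes"]   # rarely used attributes

def vertical_partition(rows, hot, cold):
    """Project the dimension rows onto a main table and an extension table."""
    main = [{k: r[k] for k in hot} for r in rows]
    ext = [{k: r[k] for k in cold} for r in rows]
    return main, ext

main, ext = vertical_partition(ROWS, HOT, COLD)
```

    In the ontology-driven approach the paper describes, the hot/cold grouping would be derived from the OWL ontology's description of the dimension's terms rather than hard-coded as here.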

  2. Constructing global view with an ontology-based method for information sharing in the virtual organization

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The problem of constructing a global view of heterogeneous information sources for information sharing is becoming more and more important due to the availability of multiple information sources within the virtual organization. A global view is defined to provide a unified representation of the information in the different sources by analyzing the concept schemas associated with them and resolving possible semantic heterogeneity. An ontology-based method for global view construction is proposed. In the method, (1) based on formal ontologies, the concept of semantic affinity is introduced to assess the level of semantic relationship between information classes from different information sources; (2) information classes are classified by semantic affinity levels using clustering procedures so that their different representations can be analyzed for unification; (3) the global view is constructed starting from selected clusters by unifying the representation of their elements. An application example of using the method in a joint aerial-defense organization is illustrated, and the result shows that the proposed method is feasible.

  3. Internal Data Market Services: An Ontology-Based Architecture and Its Evaluation

    Directory of Open Access Journals (Sweden)

    Fons Wijnhoven

    2003-01-01

    On information markets, many suppliers and buyers of information goods exchange values. Some of these goods are data, whose value is created in buyer interactions with data sources. These interactions are enabled by data market services (DMS). DMS give access to one or several data sources. The major problems with the creation of information value in these contexts are (1) the quality of information retrievals and related queries, and (2) the complexity of matching information needs and supplies when different semantics are used by source systems and information buyers. This study reports on a prototype DMS (called CIRBA), which employs an ontology-based information retrieval system to solve semantic problems for a DMS. The DMS quality is tested in an experiment to assess its quality from a user perspective against a traditional data warehouse (with SQL) solution. The CIRBA solution gave substantially higher user satisfaction than the data warehouse alternative.

  4. Ontology-Based Device Descriptions and Device Repository for Building Automation Devices

    Directory of Open Access Journals (Sweden)

    Dibowski Henrik

    2011-01-01

    Device descriptions play an important role in the design and commissioning of modern building automation systems and help reduce design time and costs. However, all established device descriptions are specialized for certain purposes and suffer from several weaknesses. This hinders further design automation, which is strongly needed for increasingly complex building automation systems. To overcome these problems, this paper presents novel Ontology-based Device Descriptions (ODDs) along with a layered ontology architecture, a specific ontology view approach with virtual properties, a generic access interface, a triple-store-based database backend, and a generic search mask GUI with an underlying query generation algorithm. It enables a formal, unified, and extensible specification of building automation devices, ensures their comparability, and facilitates computer-enabled retrieval, selection, and interoperability evaluation, which is essential for automated design. The scalability of the approach to several tens of thousands of devices is demonstrated.

  5. An Ontology-based Context Aware System for Selective Dissemination of Information in a Digital Library

    CERN Document Server

    De Giusti, Marisa R; Vosou, Agustín; Martínez, Juan P

    2010-01-01

    Users of institutional repositories and digital libraries are known for their need for very specific information about one or more subjects. Characterizing user profiles and offering users new documents and resources is one of the main challenges of today's libraries. In this paper, a Selective Dissemination of Information service is described, which proposes an ontology-based context-aware system for identifying a user's context (research subjects, work team, areas of interest). This system enables librarians to broaden user profiles beyond the information that users have entered by hand (such as institution, age, and language). The system requires a context retrieval layer to capture user information and behavior, and an inference engine to support context inference from many information sources (selected documents and users' queries).

  6. Ontology based Intrusion Detection System in Wireless Sensor Network for Active Attacks

    Directory of Open Access Journals (Sweden)

    Naheed Akhter

    2016-06-01

    WSNs are vulnerable to attacks and demand special attention to the development of mechanisms for securing them against various threats that could affect the overall infrastructure. WSNs are open to miscellaneous classes of attacks, and security breaches are intolerable in WSNs. Threats like untrusted data transmissions and deployment in open and unfavorable environments are still open research issues. Security is an essential and complex requirement in WSNs. These issues raise the need to develop a security mechanism for wireless sensor networks that categorizes the different attacks based on their relevance. A detailed survey of active attacks is presented, based on the nature and attributes of those attacks. An ontology-based mechanism is developed and tested for active attacks in WSNs.

  7. An Ontology Based Crawler for Retrieving Information Distributed on the Web

    Directory of Open Access Journals (Sweden)

    Wael A. Gab–Allah

    2016-06-01

    One of the principal motivations for the creation of the Web was to retrieve information in a fast and easy way. Thus, building systems for retrieving distributed information is crucially important. This paper introduces an ontology-based focused crawling system that exhibits high recall and high precision. The power of the system is two-fold. First, it is focused, thanks to the underlying ontology-based retrieval subsystem. Second, it operates in two phases, one to increase recall and the other to increase precision. We have implemented the proposed system using the Python language and the WordNet taxonomy. The results obtained by the system are given at the end of the paper and show clearly that it outperforms general-purpose crawling systems built on approaches such as breadth-first search.
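
The two-phase idea can be sketched minimally as follows, in Python to match the paper's implementation language, though the ontology, pages, and thresholds here are invented stand-ins rather than the authors' WordNet-based subsystem: phase one keeps any page mentioning an ontology term (boosting recall), and phase two keeps only pages dense in such terms (boosting precision).

```python
# Hypothetical two-phase relevance filter for a focused crawler.
# Ontology, pages, and the 0.3 threshold are invented for illustration.

ONTOLOGY = {"vehicle": ["car", "truck", "bicycle"]}  # concept -> narrower terms

def expand(ontology):
    """Flatten concepts and their narrower terms into one vocabulary."""
    vocab = set(ontology)
    for terms in ontology.values():
        vocab.update(terms)
    return vocab

def relevance(text, vocab):
    """Fraction of words in the page that belong to the vocabulary."""
    words = text.lower().split()
    return sum(w in vocab for w in words) / len(words) if words else 0.0

def two_phase_filter(pages, ontology, precision_threshold=0.3):
    vocab = expand(ontology)
    # Phase 1 (recall): keep any page mentioning at least one term.
    candidates = [p for p in pages if relevance(p, vocab) > 0]
    # Phase 2 (precision): keep only pages dense in ontology terms.
    return [p for p in candidates if relevance(p, vocab) >= precision_threshold]

pages = ["car truck repair shop", "weather report today", "one bicycle here today"]
print(two_phase_filter(pages, ONTOLOGY))
```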

  8. Ontology-based Integration of Web Navigation for Dynamic User Profiling

    Directory of Open Access Journals (Sweden)

    Anett HOPPE

    2015-01-01

    The development of technology for handling information on a Big Data scale is a buzzing topic of current research. Indeed, improved techniques for knowledge discovery are crucial for the scientific and economic exploitation of large-scale raw data. In research collaboration with an industrial actor, we explore the applicability of ontology-based knowledge extraction and representation for today's biggest source of large-scale data, the Web. The goal is to develop a profiling application, based on the implicit information that every user leaves while navigating online, in order to identify and model preferences and interests in a detailed user profile. This includes the identification of current tendencies as well as the prediction of possible future interests, as far as they are deducible from the collected browsing information and integrated expert domain knowledge. This article gives an overview of the current state of the research, the developments made, and the insights gained.

  9. Real-time context aware reasoning in on-board intelligent traffic systems: An Architecture for Ontology-based Reasoning using Finite State Machines

    NARCIS (Netherlands)

    Stoter, Arjan; Dalmolen, Simon; Drenth, Eduard; Cornelisse, Erik; Mulder, Wico

    2011-01-01

    In-vehicle information management is vital in intelligent traffic systems. In this paper we motivate an architecture for ontology-based context-aware reasoning for in-vehicle information management. An ontology is essential for system standardization and communication, and ontology-based reasoning

  10. Ontology-based semantic query for clinical trials

    Institute of Scientific and Technical Information of China (English)

    黄必清; 王涛; 朱鹏; 薛霄; 吴芸

    2012-01-01

    The extensive medical terminology used to describe clinical trials complicates keyword searches for locating resources. This paper presents an ontology-based semantic query system for clinical trial descriptions. The system uses the Web Ontology Language (OWL) to create an ontology based on ICD-10 and ICMJE (the ontology includes a clinical trial class and a disease class), retrieves clinical trial data from ClinicalTrials.gov, annotates the data as instances of the clinical trial class, and creates relationships between clinical trials and diseases to enable structured semantic clinical trial queries using SPARQL. Through this method, users can use disease and property keywords to express structured semantic queries and locate the required resources. The query conditions generated by this method describe the user's needs more accurately than traditional keyword-matching queries.
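
The structured-query step can be illustrated without a triple store: the sketch below holds hypothetical trial/disease triples in memory and performs the same join a SPARQL basic graph pattern would. All URIs, trial IDs, and disease codes are invented examples, not data from ClinicalTrials.gov.

```python
# Hypothetical clinical-trial triples, matched the way a SPARQL basic
# graph pattern would match them. All identifiers below are invented.

TRIPLES = [
    ("trial:NCT001", "ex:studiesDisease", "icd10:A82"),   # rabies
    ("trial:NCT001", "ex:phase",          "Phase 3"),
    ("trial:NCT002", "ex:studiesDisease", "icd10:J10"),   # influenza
    ("trial:NCT002", "ex:phase",          "Phase 2"),
]

def query(triples, disease, phase):
    """Return trials studying `disease` in `phase` -- the join that the
    pattern { ?t ex:studiesDisease D . ?t ex:phase P } would perform."""
    by_disease = {s for s, p, o in triples
                  if p == "ex:studiesDisease" and o == disease}
    return sorted(s for s, p, o in triples
                  if p == "ex:phase" and o == phase and s in by_disease)

print(query(TRIPLES, "icd10:A82", "Phase 3"))
```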

  11. Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining.

    Science.gov (United States)

    Hur, Junguk; Ozgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2012-12-20

    Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Over 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since
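
The centrality-based prioritization mentioned above can be sketched with plain degree centrality over a toy edge list; the gene names and interactions below are invented for illustration and are not the study's actual network.

```python
# Toy gene-interaction network ranked by degree centrality.
# Gene names and edges are invented examples, not the study's data.

from collections import Counter

EDGES = [("IL6", "TNF"), ("IL6", "IL1B"), ("TNF", "IL1B"), ("IL6", "CRP")]

def degree_centrality(edges):
    """Degree of each node divided by (n - 1), the usual normalization."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(deg)
    return {g: d / (n - 1) for g, d in deg.items()}

ranked = sorted(degree_centrality(EDGES).items(), key=lambda kv: -kv[1])
print(ranked[0])  # the most central (highest-priority) gene
```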

  12. Web Approach for Ontology-Based Classification, Integration, and Interdisciplinary Usage of Geoscience Metadata

    Directory of Open Access Journals (Sweden)

    B Ritschel

    2012-10-01

    Full Text Available The Semantic Web is a W3C approach that integrates the different sources of semantics within documents and services using ontology-based techniques. The main objective of this approach in the geoscience domain is the improvement of understanding, integration, and usage of Earth and space science related web content in terms of data, information, and knowledge for machines and people. The modeling and representation of semantic attributes and relations within and among documents can be realized by human readable concept maps and machine readable OWL documents. The objectives for the usage of the Semantic Web approach in the GFZ data center ISDC project are the design of an extended classification of metadata documents for product types related to instruments, platforms, and projects as well as the integration of different types of metadata related to data product providers, users, and data centers. Sources of content and semantics for the description of Earth and space science product types and related classes are standardized metadata documents (e.g., DIF documents, publications, grey literature, and Web pages. Other sources are information provided by users, such as tagging data and social navigation information. The integration of controlled vocabularies as well as folksonomies plays an important role in the design of well formed ontologies.

  13. Resident Space Object Characterization and Behavior Understanding via Machine Learning and Ontology-based Bayesian Networks

    Science.gov (United States)

    Furfaro, R.; Linares, R.; Gaylor, D.; Jah, M.; Walls, R.

    2016-09-01

    In this paper, we present an end-to-end approach that employs machine learning techniques and Ontology-based Bayesian Networks (BN) to characterize the behavior of resident space objects. State-of-the-Art machine learning architectures (e.g. Extreme Learning Machines, Convolutional Deep Networks) are trained on physical models to learn the Resident Space Object (RSO) features in the vectorized energy and momentum states and parameters. The mapping from measurements to vectorized energy and momentum states and parameters enables behavior characterization via clustering in the features space and subsequent RSO classification. Additionally, Space Object Behavioral Ontologies (SOBO) are employed to define and capture the domain knowledge-base (KB) and BNs are constructed from the SOBO in a semi-automatic fashion to execute probabilistic reasoning over conclusions drawn from trained classifiers and/or directly from processed data. Such an approach enables integrating machine learning classifiers and probabilistic reasoning to support higher-level decision making for space domain awareness applications. The innovation here is to use these methods (which have enjoyed great success in other domains) in synergy so that it enables a "from data to discovery" paradigm by facilitating the linkage and fusion of large and disparate sources of information via a Big Data Science and Analytics framework.

  14. Ontology-based classification of remote sensing images using spectral rules

    Science.gov (United States)

    Andrés, Samuel; Arvor, Damien; Mougenot, Isabelle; Libourel, Thérèse; Durieux, Laurent

    2017-05-01

    Earth Observation data is of great interest for a wide spectrum of scientific domain applications. Enhanced access to remote sensing images for "domain" experts thus represents a great advance, since it allows users to interpret remote sensing images based on their domain expert knowledge. However, such an advantage can also turn into a major limitation if this knowledge is not formalized and is thus difficult to share with and be understood by other users. In this context, knowledge representation techniques such as ontologies should play a major role in the future of remote sensing applications. We implemented an ontology-based prototype to automatically classify Landsat images based on explicit spectral rules. The ontology is designed in a very modular way in order to achieve a generic and versatile representation of concepts we consider of utmost importance in remote sensing. The prototype was tested on four subsets of Landsat images, and the results confirmed the potential of ontologies to formalize expert knowledge and classify remote sensing images.
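
A minimal example of what a per-pixel spectral rule looks like, assuming common textbook NDVI thresholds and class names rather than the paper's actual rule set:

```python
# Illustrative spectral rule: classify a pixel from its red and
# near-infrared reflectances (values in [0, 1]). Thresholds are
# textbook values, not the paper's rules.

def ndvi(red, nir):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red) if (nir + red) else 0.0

def classify(red, nir):
    """Rule-based labeling from a single vegetation index."""
    v = ndvi(red, nir)
    if v > 0.4:
        return "vegetation"
    if v < 0.0:
        return "water"
    return "bare soil"

print(classify(red=0.05, nir=0.45))  # strongly vegetated pixel
```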

  15. Ontology-based analysis of multi-scale modeling of geographical features

    Institute of Scientific and Technical Information of China (English)

    WANG; Yanhui; LI; Xiaojuan; GONG; Huili

    2006-01-01

    As multi-scale databases based on scale series of map data are built, conceptual models are needed to define proper multi-scale representation formulas and to extract model entities and the relationships among them. However, the results of multi-scale conceptual abstraction schemas may differ according to which cognition, abstraction, and application views are utilized, which presents an obvious obstacle to the reuse and sharing of spatial data. To facilitate the design of unified, common, and objective abstract schema views for multi-scale spatial databases, this paper proposes an ontology-based analysis method for the multi-scale modeling of geographical features. It includes a three-layer ontology model, which serves as the framework for a common multi-scale abstraction schema; an explanation of formulary abstractions accompanied by definitions of entities and their relationships at the same scale, as well as at different scales, which are meant to provide strong feasibility, expansibility, and speciality functions; and a case in point involving multi-scale representations of road features, to verify the method's feasibility.

  16. Ontology-based Deep Learning for Human Behavior Prediction with Explanations in Health Social Networks.

    Science.gov (United States)

    Phan, Nhathai; Dou, Dejing; Wang, Hao; Kil, David; Piniewski, Brigitte

    2017-04-01

    Human behavior modeling is a key component in application domains such as healthcare and social behavior research. In addition to accurate prediction, having the capacity to understand the roles of human behavior determinants and to provide explanations for the predicted behaviors is also important. Having this capacity increases trust in the systems and the likelihood that the systems actually will be adopted, thus driving engagement and loyalty. However, most prediction models do not provide explanations for the behaviors they predict. In this paper, we study the research problem, human behavior prediction with explanations, for healthcare intervention systems in health social networks. We propose an ontology-based deep learning model (ORBM+) for human behavior prediction over undirected and nodes-attributed graphs. We first propose a bottom-up algorithm to learn the user representation from health ontologies. Then the user representation is utilized to incorporate self-motivation, social influences, and environmental events together in a human behavior prediction model, which extends a well-known deep learning method, the Restricted Boltzmann Machine. ORBM+ not only predicts human behaviors accurately, but also, it generates explanations for each predicted behavior. Experiments conducted on both real and synthetic health social networks have shown the tremendous effectiveness of our approach compared with conventional methods.

  17. Development of a Framework for Ontology Based Sentiment Analysis on Social Media

    Directory of Open Access Journals (Sweden)

    Kadir Tutar

    2015-10-01

    Developing internet technology, trending social media applications, and Web 2.0 have changed the internet usage habits of internet users. As a result, internet users have started to share their feelings and thoughts on social media from anywhere at any time. With the increase in social media usage, the amount of valuable feedback data has grown as well, so the collection, interpretation, and evaluation of this data have become important. Sentiment analysis and natural language processing methods have been applied to text-based data for evaluation and opinion mining to meet this need. In this study, a new ontology-based sentiment analysis method has been developed in order to enhance the accuracy of the results obtained by current sentiment analysis methods. This newly developed method requires modeling the domain-specific information in the ontology prior to the analysis procedure. Through this approach, more accurate and higher-quality results can be obtained compared to classic sentiment analysis methods. Another important and innovative feature of the developed infrastructure is its support for Turkish-language sentiment analysis.
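
The core idea, scoring sentiment per ontology concept (aspect) rather than per document, can be sketched as below. The ontology, lexicon, and example sentence are invented English stand-ins; the actual system models a Turkish domain ontology.

```python
# Hypothetical aspect-level sentiment sketch: each sentiment word is
# credited to the nearest preceding ontology term. All data invented.

ONTOLOGY = {"battery": "Hardware", "screen": "Hardware", "app": "Software"}
LEXICON = {"great": 1, "terrible": -1, "slow": -1}

def aspect_sentiment(text):
    """Return a per-aspect sentiment score for one sentence."""
    scores = {}
    current = None
    for word in text.lower().split():
        if word in ONTOLOGY:
            current = word                     # new aspect in focus
        elif word in LEXICON and current is not None:
            scores[current] = scores.get(current, 0) + LEXICON[word]
    return scores

print(aspect_sentiment("battery is great but the app is slow"))
```

A document-level method would average these opinions into one score; keeping them attached to ontology concepts is what the abstract claims improves accuracy.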

  18. Ontology-based automatic identification of public health-related Turkish tweets.

    Science.gov (United States)

    Küçük, Emine Ela; Yapar, Kürşad; Küçük, Dilek; Küçük, Doğan

    2017-02-04

    Social media analysis, such as the analysis of tweets, is a promising research topic for tracking public health concerns including epidemics. In this paper, we present an ontology-based approach to automatically identify public health-related Turkish tweets. The system is based on a public health ontology that we have constructed through a semi-automated procedure. The ontology concepts are expanded through a linguistically motivated relaxation scheme as the last stage of ontology development, before being integrated into our system to increase its coverage. The resulting lexical resource, which includes the terms corresponding to the ontology concepts, is used to filter the Twitter stream so that a plausible tweet subset, consisting mostly of public health-related tweets, can be obtained. Experiments are carried out on two million genuine tweets, and promising precision rates are obtained. Also implemented in the course of the current study is a Web-based interface for tracking the results of this identification system, to be used by the relevant public health staff. Hence, the current social media analysis study makes both technical and practical contributions to the significant domain of public health.
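
The filtering stage can be reduced to a set intersection between tweet tokens and the ontology-derived lexicon, sketched here with an invented English lexicon in place of the Turkish public health terms:

```python
# Hypothetical lexicon-based stream filter. Terms and tweets are invented
# English stand-ins for the Turkish ontology terms used in the paper.

LEXICON = {"flu", "influenza", "fever", "vaccination"}

def is_health_related(tweet):
    """Keep a tweet if any of its (lightly normalized) tokens is a
    lexicon term."""
    words = {w.strip(".,!?").lower() for w in tweet.split()}
    return bool(words & LEXICON)

stream = ["Got my flu vaccination today!", "Great match last night"]
print([t for t in stream if is_health_related(t)])
```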

  19. Ontology-Based High-Level Context Inference for Human Behavior Identification

    Directory of Open Access Journals (Sweden)

    Claudia Villalonga

    2016-09-01

    Recent years have witnessed huge progress in the automatic identification of individual primitives of human behavior, such as activities or locations. However, the complex nature of human behavior demands more abstract contextual information for its analysis. This work presents an ontology-based method that combines low-level primitives of behavior, namely activity, locations, and emotions, unprecedented to date, to intelligently derive more meaningful high-level context information. The paper contributes a new open ontology describing both low-level and high-level context information, as well as their relationships. Furthermore, a framework building on the developed ontology and reasoning models is presented and evaluated. The proposed method proves robust in identifying high-level contexts even in the event of erroneously detected low-level contexts. Although reasonable inference times are obtained for a relevant set of users and instances, additional work is required to scale to long-term scenarios with a large number of users.
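
A toy version of the inference step, assuming hand-written rules that map triples of low-level contexts to a high-level label. The rules and labels are invented; the actual system uses ontology reasoning rather than exact-match lookup.

```python
# Hypothetical rules combining low-level contexts (activity, location,
# emotion) into a high-level context label. All rules invented.

RULES = [
    # (activity, location, emotion) -> high-level context
    (("running", "park", "happy"), "healthy exercise"),
    (("sitting", "office", "stressed"), "work overload"),
]

def infer(activity, location, emotion):
    """Return the first high-level context whose rule matches exactly."""
    for (a, l, e), high in RULES:
        if (a, l, e) == (activity, location, emotion):
            return high
    return "unknown context"

print(infer("sitting", "office", "stressed"))
```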

  20. An ontology-based semantic configuration approach to constructing Data as a Service for enterprises

    Science.gov (United States)

    Cai, Hongming; Xie, Cheng; Jiang, Lihong; Fang, Lu; Huang, Chenxi

    2016-03-01

    To align business strategies with IT systems, enterprises must rapidly implement new applications based on existing information with complex associations in order to adapt to the continually changing external business environment. Thus, Data as a Service (DaaS) has become an enabling technology for enterprises through information integration and the configuration of existing distributed enterprise systems and heterogeneous data sources. However, business modelling, system configuration, and model alignment face challenges at the design and execution stages. To provide a comprehensive solution that facilitates data-centric application design in highly complex and large-scale situations, a configurable ontology-based service integrated platform (COSIP) is proposed to support business modelling, system configuration, and execution management. First, a meta-resource model is constructed and used to describe and encapsulate information resources by way of multi-view business modelling. Then, based on ontologies, three semantic configuration patterns, namely composite resource configuration, business scene configuration, and runtime environment configuration, are designed to systematically connect business goals with executable applications. Finally, a software architecture based on model-view-controller (MVC) is provided and used to assemble components for software implementation. The result of the case study demonstrates that the proposed approach provides a flexible method for implementing data-centric applications.

  1. A framework for unifying ontology-based semantic similarity measures: a study in the biomedical domain.

    Science.gov (United States)

    Harispe, Sébastien; Sánchez, David; Ranwez, Sylvie; Janaqi, Stefan; Montmain, Jacky

    2014-04-01

    Ontologies are widely adopted in the biomedical domain to characterize various resources (e.g. diseases, drugs, scientific publications) with non-ambiguous meanings. By exploiting the structured knowledge that ontologies provide, a plethora of ad hoc and domain-specific semantic similarity measures have been defined in recent years. Nevertheless, some critical questions remain: which measure should be defined or chosen for a concrete application? Are some of the, a priori different, measures in fact equivalent? In order to shed some light on these questions, we perform an in-depth analysis of existing ontology-based measures to identify the core elements of semantic similarity assessment. As a result, this paper presents a unifying framework that aims to improve the understanding of semantic measures, to highlight their equivalences, and to propose bridges between their theoretical bases. By demonstrating that groups of measures are just particular instantiations of parameterized functions, we unify a large number of state-of-the-art semantic similarity measures through common expressions. The application of the proposed framework and its practical usefulness are underlined by an empirical analysis of hundreds of semantic measures in a biomedical context. Copyright © 2013 Elsevier Inc. All rights reserved.
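
The parameterization claim can be made concrete with two classic information-content measures: Resnik similarity is the IC of the least common ancestor, and Lin similarity is the same quantity normalized by the concepts' own IC, so both are instantiations of one skeleton. The toy IC values and concept names below are invented; the framework in the paper covers many more measures.

```python
# Two classic measures as instantiations of one parameterized function.
# IC values and the LCA table are invented toy data.

IC = {"disease": 1.0, "infectious_disease": 2.5, "flu": 4.0}
LCA = {("flu", "infectious_disease"): "infectious_disease"}  # least common ancestor

def sim(c1, c2, normalize):
    """Shared skeleton: IC of the least common ancestor, optionally
    normalized by the concepts' own IC (Lin); unnormalized is Resnik."""
    ic_lca = IC[LCA[(c1, c2)]]
    if not normalize:
        return ic_lca                       # Resnik-style similarity
    return 2 * ic_lca / (IC[c1] + IC[c2])   # Lin-style similarity

print(sim("flu", "infectious_disease", normalize=False))
print(sim("flu", "infectious_disease", normalize=True))
```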

  2. Home-Explorer: Ontology-Based Physical Artifact Search and Hidden Object Detection System

    Directory of Open Access Journals (Sweden)

    Bin Guo

    2008-01-01

    A new system named Home-Explorer, which searches for and finds physical artifacts in a smart indoor environment, is proposed. The view on which it is based is artifact-centered and uses sensors attached to everyday artifacts (called smart objects) in the real world. This paper makes two main contributions. First, it addresses the robustness of the embedded sensors, which is seldom discussed in previous smart artifact research. Because sensors may sometimes be broken or fail to work under certain conditions, smart objects become hidden ones; however, current systems provide no mechanism to detect and manage objects when this problem occurs. Second, there is no common context infrastructure for building smart artifact systems, which makes it difficult for separately developed applications to interact with each other and hard for them to share and reuse knowledge. Unlike previous systems, Home-Explorer builds on an ontology-based knowledge infrastructure named Sixth-Sense, which makes it easy for the system to interact with other applications or agents also based on this ontology. The hidden object problem is also reflected in our ontology, which enables Home-Explorer to deal with both smart objects and hidden objects. A set of rules for deducing an object's status or location information and for locating hidden objects is described and evaluated.

  3. Automatically multi-paradigm requirements modeling and analyzing: An ontology-based approach

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    There are several purposes for modeling and analyzing the problem domain before starting software requirements analysis. First, it focuses on the problem domain, so that the domain users can be involved easily. Secondly, a comprehensive description of the problem domain helps in obtaining a comprehensive software requirements model. This paper proposes an ontology-based approach for modeling the problem domain. It interacts with the domain users using terminology that they can understand and guides them to provide the relevant information. A multiple-paradigm analysis approach, on the basis of the description of the problem domain, has also been presented. Three criteria, i.e. the rationality of the organization structure, the achievability of the organization goals, and the feasibility of the organization process, have been proposed. The results of the analysis can be used as feedback for guiding the domain users to provide further information on the problem domain. The models of the problem domain can serve as documentation for the pre-requirements analysis phase, and they will also be the basis for further software requirements modeling.

  4. STUDY THE IMPACT OF REQUIREMENTS MANAGEMENT CHARACTERISTICS IN GLOBAL SOFTWARE DEVELOPMENT PROJECTS: AN ONTOLOGY BASED APPROACH

    Directory of Open Access Journals (Sweden)

    S. Arun Kumar

    2011-11-01

    Requirements management is one of the challenging and key tasks in the development of software products in a distributed software development environment. One of the key reasons for the failure of software projects found in our literature survey is poor project management and requirements management activity. The main aims of this paper are to: (1) formulate a framework for successful and efficient requirements management for Global Software Development (GSD) projects; (2) design a mixed organization structure combining traditional and agile approaches for global software development projects; (3) apply ontology-based knowledge management systems to both approaches to address requirements issues such as missing or inconsistent requirements and communication and knowledge management problems, and to improve project management activities in a global software development environment; and (4) propose requirements management metrics to measure and manage the software process during the development of information systems. The major contribution of this paper is to analyze the requirements management issues and challenges associated with global software development projects. Two hypotheses have been formulated and tested through statistical techniques such as correlation and regression analysis, and validated.

  5. Ontology-based, Tissue MicroArray oriented, image centered tissue bank

    Directory of Open Access Journals (Sweden)

    Viti Federica

    2008-04-01

    Background: The Tissue MicroArray technique is becoming increasingly important in pathology for the validation of experimental data from transcriptomic analysis. This approach produces many images which need to be properly managed, if possible with an infrastructure able to support tissue sharing between institutes. Moreover, the available frameworks oriented to Tissue MicroArray provide good storage for clinical patient, sample treatment, and block construction information, but their utility is limited by the lack of data integration with biomolecular information. Results: In this work we propose a Tissue MicroArray web-oriented system to support researchers in managing bio-samples and, through the use of ontologies, to enable tissue sharing aimed at the design of Tissue MicroArray experiments and the evaluation of results. Indeed, our system provides an ontological description both for pre-analysis tissue images and for post-process analysis image results, which is crucial for information exchange. Moreover, working on well-defined terms makes it possible to query web resources for literature articles to integrate both pathology and bioinformatics data. Conclusions: Using this system, users associate an ontology-based description with each image uploaded into the database and also integrate results with the ontological description of biosequences identified in every tissue. Moreover, it is possible to integrate the ontological description provided by the user with a fully compliant Gene Ontology definition, enabling statistical studies about the correlation between the analyzed pathology and the most commonly related biological processes.

  6. The Design and Engineering of Mobile Data Services: Developing an Ontology Based on Business Model Thinking

    Science.gov (United States)

    Al-Debei, Mutaz M.; Fitzgerald, Guy

    This paper addresses the design and engineering problem related to mobile data services. The aim of the research is to inform and advise mobile service design and engineering by examining this issue from a rigorous and holistic perspective. To this end, the paper develops an ontology based on business model thinking. The developed ontology identifies four primary dimensions in designing business models of mobile data services: value proposition, value network, value architecture, and value finance. Within these dimensions, 15 key design concepts are identified, along with their interrelationships and rules in the telecommunication service business model domain, and unambiguous semantics are produced. The developed ontology is of value to academics and practitioners alike, particularly those interested in strategy-oriented IS/IT and business developments in telecommunications. Employing the developed ontology would systemize mobile service engineering functions and make them more manageable, effective, and creative. The research approach to building the mobile service business model ontology essentially follows the design science paradigm. Within this paradigm, we incorporate a number of different research methods, so the employed methodology might be better characterized as a pluralist approach.

  7. Developing an Ontology-Based Cold Chain Logistics Monitoring and Decision System

    Directory of Open Access Journals (Sweden)

    Yujun Wang

    2015-01-01

    Full Text Available Nowadays the cold chain logistics for perishable goods is increasingly complex, while most research works focus on the monitoring of temperature and humidity and seldom on the assessment of, and decision support for, the monitored cold chain quality. In this context, a monitoring and decision system based on wireless sensor networks (WSN) and ontology is proposed in this paper, consisting of a sensing layer, a network layer, and an application layer. An ontology, as a shared concept model, can describe the objective world with its own syntax and provides a common understanding of the specialized knowledge in a domain. On this basis, ontology-based cold chain quality assessment software has been developed, so that assessment and diagnosis of cold chain quality can be achieved and constructive advice and suggestions for its treatment can be provided. The system is demonstrated and validated along a rabies vaccine logistics chain. The results show that the system offers important advantages such as effective regulation, low power consumption, and accurate ontology-based analysis.

  8. An ontology for microbial phenotypes.

    Science.gov (United States)

    Chibucos, Marcus C; Zweifel, Adrienne E; Herrera, Jonathan C; Meza, William; Eslamfam, Shabnam; Uetz, Peter; Siegele, Deborah A; Hu, James C; Giglio, Michelle G

    2014-11-30

    Phenotypic data are routinely used to elucidate gene function in organisms amenable to genetic manipulation. However, previous to this work, there was no generalizable system in place for the structured storage and retrieval of phenotypic information for bacteria. The Ontology of Microbial Phenotypes (OMP) has been created to standardize the capture of such phenotypic information from microbes. OMP has been built on the foundations of the Basic Formal Ontology and the Phenotype and Trait Ontology. Terms have logical definitions that can facilitate computational searching of phenotypes and their associated genes. OMP can be accessed via a wiki page as well as downloaded from SourceForge. Initial annotations with OMP are being made for Escherichia coli using a wiki-based annotation capture system. New OMP terms are being concurrently developed as annotation proceeds. We anticipate that diverse groups studying microbial genetics and associated phenotypes will employ OMP for standardizing microbial phenotype annotation, much as the Gene Ontology has standardized gene product annotation. The resulting OMP resource and associated annotations will facilitate prediction of phenotypes for unknown genes and result in new experimental characterization of phenotypes and functions.

  9. My Corporis Fabrica Embryo: An ontology-based 3D spatio-temporal modeling of human embryo development

    OpenAIRE

    Rabattu, Pierre-Yves; Massé, Benoit; Ulliana, Federico; Rousset, Marie-Christine; Rohmer, Damien; Léon, Jean-Claude; Palombi, Olivier

    2015-01-01

    Background Embryology is a complex morphologic discipline involving a set of entangled mechanisms, sometimes difficult to understand and to visualize. Recent computer-based techniques, ranging from geometrical to physically based modeling, are used to assist the visualization and the simulation of virtual humans in numerous domains such as surgical simulation and learning. On the other hand, the ontology-based approach to knowledge representation is more and more successfully adopted in...

  10. User centered and ontology based information retrieval system for life sciences

    Directory of Open Access Journals (Sweden)

    Sy Mohameth-François

    2012-01-01

    Full Text Available Abstract Background Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings are the basis of the biomedical publication indexing and information retrieval process proposed by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations. Results This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents retrieved and that uses a graphical rendering of query results to favor user interaction. Semantic proximities between ontology concepts and aggregating models are used to assess document adequacy with respect to a query. The selection of documents is displayed in a semantic map that provides graphical indications making explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus by facilitating the weighting of query concepts and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway. Conclusions The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user centred application in which the system enlightens
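
    The scoring idea described above, semantic proximities between ontology concepts aggregated into a per-document adequacy score, can be sketched as follows. The mini-ontology, the Wu-Palmer measure, and the mean-of-best-matches aggregation are illustrative assumptions, not the actual OBIRS models:

```python
# Sketch: ontology-based document scoring in the spirit of OBIRS.
# PARENT is a hypothetical mini-ontology (child -> parent; "entity" is the root).
PARENT = {
    "gene": "entity", "transcription_factor": "gene",
    "process": "entity", "hemopoiesis": "process", "apoptosis": "process",
}

def ancestors(c):
    """Concept c followed by its chain of ancestors up to the root."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2*depth(LCA) / (depth(a) + depth(b))."""
    anc_a, anc_b = ancestors(a), set(ancestors(b))
    lca = next(x for x in anc_a if x in anc_b)     # lowest common ancestor
    depth = lambda c: len(ancestors(c)) - 1        # root has depth 0
    da, db = depth(a), depth(b)
    return 2.0 * depth(lca) / (da + db) if (da + db) else 1.0

def document_score(query_concepts, doc_concepts):
    """Aggregate: each query concept takes its best match among the document's
    annotations, then average -- one simple aggregating model among many."""
    best = [max(wu_palmer(q, d) for d in doc_concepts) for q in query_concepts]
    return sum(best) / len(best)
```

    A document annotated exactly with the query concepts scores 1.0; scores decay as annotations drift to more distant branches of the ontology.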

  11. Ontology based molecular signatures for immune cell types via gene expression analysis

    Science.gov (United States)

    2013-01-01

    Background New technologies are focusing on characterizing cell types to better understand their heterogeneity. With large volumes of cellular data being generated, innovative methods are needed to structure the resulting data analyses. Here, we describe an ‘Ontologically BAsed Molecular Signature’ (OBAMS) method that identifies novel cellular biomarkers and infers biological functions as characteristics of particular cell types. This method finds molecular signatures for immune cell types based on mapping biological samples to the Cell Ontology (CL) and navigating the space of all possible pairwise comparisons between cell types to find genes whose expression is core to a particular cell type’s identity. Results We illustrate this ontological approach by evaluating expression data available from the Immunological Genome project (IGP) to identify unique biomarkers of mature B cell subtypes. We find that using OBAMS, candidate biomarkers can be identified at every stratum of cellular identity, from broad classifications to very granular. Furthermore, we show that the Gene Ontology can be used to cluster cell types by shared biological processes in order to find candidate genes responsible for somatic hypermutation in germinal center B cells. Moreover, through in silico experiments based on this approach, we have identified gene sets representing genes overexpressed in germinal center B cells and genes uniquely expressed in these B cells compared to other B cell types. Conclusions This work demonstrates the utility of incorporating structured ontological knowledge into biological data analysis – providing a new method for defining novel biomarkers and providing an opportunity for new biological insights. PMID:24004649
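
    The core move, pairwise comparisons between cell types to find genes whose expression is specific to one type, can be sketched minimally. The expression table, gene names, and thresholds below are invented for illustration and are not IGP data:

```python
# Sketch: OBAMS-style signature finding. A gene is a candidate signature for a
# cell type if it is high in that type and low in every pairwise comparison
# with the other cell types (hypothetical log-scale expression values).
EXPR = {
    "B_cell":    {"CD19": 9.1, "CD3E": 0.4, "AID": 1.2},
    "T_cell":    {"CD19": 0.3, "CD3E": 8.7, "AID": 0.2},
    "GC_B_cell": {"CD19": 8.8, "CD3E": 0.5, "AID": 9.5},
}

def signature(cell_type, expr, high=5.0, low=2.0):
    """Genes high in cell_type and low in all other cell types."""
    others = [t for t in expr if t != cell_type]
    return sorted(
        g for g, v in expr[cell_type].items()
        if v >= high and all(expr[o][g] <= low for o in others)
    )
```

    In a real analysis the pairwise comparisons would be organized along the Cell Ontology hierarchy, so signatures can be reported at each stratum of cellular identity rather than only at the leaves.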

  12. Towards Cache-Enabled, Order-Aware, Ontology-Based Stream Reasoning Framework

    Energy Technology Data Exchange (ETDEWEB)

    Yan, Rui; Praggastis, Brenda L.; Smith, William P.; McGuinness, Deborah L.

    2016-08-16

    While streaming data have become increasingly more popular in business and research communities, semantic models and processing software for streaming data have not kept pace. Traditional semantic solutions have not addressed transient data streams. Semantic web languages (e.g., RDF, OWL) have typically addressed static data settings and linked data approaches have predominantly addressed static or growing data repositories. Streaming data settings have some fundamental differences; in particular, data are consumed on the fly and data may expire. Stream reasoning, a combination of stream processing and semantic reasoning, has emerged with the vision of providing "smart" processing of streaming data. C-SPARQL is a prominent stream reasoning system that handles semantic (RDF) data streams. Many stream reasoning systems including C-SPARQL use a sliding window and use data arrival time to evict data. For data streams that include expiration times, a simple arrival time scheme is inadequate if the window size does not match the expiration period. In this paper, we propose a cache-enabled, order-aware, ontology-based stream reasoning framework. This framework consumes RDF streams with expiration timestamps assigned by the streaming source. Our framework utilizes both arrival and expiration timestamps in its cache eviction policies. In addition, we introduce the notion of "semantic importance" which aims to address the relevance of data to the expected reasoning, thus enabling the eviction algorithms to be more context- and reasoning-aware when choosing what data to maintain for question answering. We evaluate this framework by implementing three different prototypes and utilizing five metrics. The trade-offs of deploying the proposed framework are also discussed.
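
    A minimal sketch of the expiration-aware eviction idea (an assumed simplification, not the paper's actual implementation): keep the window in a min-heap keyed by expiration timestamp, so eviction follows the stream source's expiry metadata rather than arrival order.

```python
import heapq

class ExpiringWindow:
    """Window over an RDF-like stream that evicts by expiration timestamp
    instead of arrival time (illustrative sketch)."""

    def __init__(self):
        self._heap = []                       # entries: (expiration_ts, triple)

    def insert(self, triple, expiration_ts, now):
        self.evict(now)
        heapq.heappush(self._heap, (expiration_ts, triple))

    def evict(self, now):
        # Pop everything whose expiration timestamp has passed.
        while self._heap and self._heap[0][0] <= now:
            heapq.heappop(self._heap)

    def contents(self, now):
        self.evict(now)
        return [t for _, t in self._heap]
```

    The paper's "semantic importance" notion would add a second criterion here, preferring to retain triples most relevant to the expected reasoning when the cache is under pressure.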

  13. Insight: An ontology-based integrated database and analysis platform for epilepsy self-management research.

    Science.gov (United States)

    Sahoo, Satya S; Ramesh, Priya; Welter, Elisabeth; Bukach, Ashley; Valdez, Joshua; Tatsuoka, Curtis; Bamps, Yvan; Stoll, Shelley; Jobst, Barbara C; Sajatovic, Martha

    2016-10-01

    We present Insight as an integrated database and analysis platform for epilepsy self-management research as part of the national Managing Epilepsy Well Network. Insight is the only available informatics platform for accessing and analyzing integrated data from multiple epilepsy self-management research studies, with several new data management features and user-friendly functionalities. The features of Insight include: (1) use of Common Data Elements defined by members of the research community and an epilepsy domain ontology for data integration and querying, (2) visualization tools to support real-time exploration of data distribution across research studies, and (3) an interactive visual query interface for provenance-enabled research cohort identification. The Insight platform contains data from five completed epilepsy self-management research studies covering various categories of data, including depression, quality of life, seizure frequency, and socioeconomic information. The data represent over 400 participants with 7552 data points. The Insight data exploration and cohort identification query interface has been developed using Ruby on Rails Web technology and the open source Web Ontology Language Application Programming Interface to support ontology-based reasoning. We have developed an efficient ontology management module that automatically updates the ontology mappings each time a new version of the Epilepsy and Seizure Ontology is released. The Insight platform features a Role-based Access Control module to authenticate and effectively manage user access to different research studies. User access to Insight is managed by the Managing Epilepsy Well Network database steering committee, consisting of representatives of all current collaborating centers of the Managing Epilepsy Well Network. New research studies are being continuously added to the Insight database and the size as well as the unique coverage of the dataset allows investigators to conduct

  14. KaBOB: ontology-based semantic integration of biomedical databases.

    Science.gov (United States)

    Livingston, Kevin M; Bada, Michael; Baumgartner, William A; Hunter, Lawrence E

    2015-04-23

    The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources. We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license. KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. 
KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for
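
    One of the named processes, aggregating sets of identifiers that denote the same biomedical concept across data sources, is at heart a union-find over cross-reference assertions. A sketch (the xref pairs below are illustrative, and KaBOB's actual aggregation is considerably richer, distinguishing e.g. genes from gene products):

```python
# Sketch: identifier aggregation via union-find over cross-references.
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

# Hypothetical cross-reference assertions harvested from source databases.
xrefs = [("UniProt:P04637", "HGNC:11998"), ("HGNC:11998", "NCBIGene:7157")]
uf = UnionFind()
for a, b in xrefs:
    uf.union(a, b)
# All three identifiers now share one representative concept node, so a query
# phrased against any of them resolves to the same integrated record.
```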

  15. The ontology-based answers (OBA) service: a connector for embedded usage of ontologies in applications.

    Science.gov (United States)

    Dönitz, Jürgen; Wingender, Edgar

    2012-01-01

    The semantic web depends on the use of ontologies to let electronic systems interpret contextual information. Optimally, the handling and access of ontologies should be completely transparent to the user. As a means to this end, we have developed a service that attempts to bridge the gap between experts in a certain knowledge domain, ontologists, and application developers. The ontology-based answers (OBA) service introduced here can be embedded into custom applications to grant access to the classes of ontologies and their relations as most important structural features as well as to information encoded in the relations between ontology classes. Thus computational biologists can benefit from ontologies without detailed knowledge about the respective ontology. The content of ontologies is mapped to a graph of connected objects which is compatible to the object-oriented programming style in Java. Semantic functions implement knowledge about the complex semantics of an ontology beyond the class hierarchy and "partOf" relations. By using these OBA functions an application can, for example, provide a semantic search function, or (in the examples outlined) map an anatomical structure to the organs it belongs to. The semantic functions relieve the application developer from the necessity of acquiring in-depth knowledge about the semantics and curation guidelines of the used ontologies by implementing the required knowledge. The architecture of the OBA service encapsulates the logic to process ontologies in order to achieve a separation from the application logic. A public server with the current plugins is available and can be used with the provided connector in a custom application in scenarios analogous to the presented use cases. The server and the client are freely available if a project requires the use of custom plugins or non-public ontologies. The OBA service and further documentation is available at http://www.bioinf.med.uni-goettingen.de/projects/oba.
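
    The organ-mapping example in the abstract can be sketched as a transitive walk over "partOf" relations. The mini anatomy and function name below are hypothetical, not the OBA API:

```python
# Sketch: map an anatomical structure to the organs it belongs to by
# transitively following partOf edges (toy anatomy, purely illustrative).
PART_OF = {
    "mitral_valve": ["left_ventricle"],
    "left_ventricle": ["heart"],
    "heart": ["cardiovascular_system"],
}
ORGANS = {"heart"}   # classes asserted to be organs in the ontology

def organs_of(structure):
    seen, stack, found = set(), [structure], set()
    while stack:
        node = stack.pop()
        for parent in PART_OF.get(node, []):
            if parent not in seen:
                seen.add(parent)
                if parent in ORGANS:
                    found.add(parent)
                stack.append(parent)
    return found
```

    Encapsulating such traversals behind a semantic function is exactly the separation OBA aims for: the application asks "which organ?" without knowing the ontology's curation conventions.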

  16. Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

    Science.gov (United States)

    Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

    2017-09-05

    Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 ( www.comparativego.com ). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
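
    At the heart of term-by-term GO analysis is an over-representation test. A self-contained sketch of the standard upper-tail hypergeometric p-value follows; the web tool's exact statistics and corrections may differ:

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k): probability of seeing k or more genes annotated with a GO
    term in a study set of n genes, when K of the N background genes carry
    that term. Small p suggests the term is enriched in the study set."""
    return sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    ) / comb(N, n)
```

    For example, drawing all 5 term-annotated genes out of a background of 20 in a 5-gene study set is highly improbable by chance (p = 1/15504), so the term would be flagged as enriched. Real analyses also apply multiple-testing correction across terms.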

  17. OntoPIN: an ontology-annotated PPI database.

    Science.gov (United States)

    Guzzi, Pietro Hiram; Veltri, Pierangelo; Cannataro, Mario

    2013-09-01

    Protein-protein interaction (PPI) data stored in publicly available databases are queried through simple query interfaces allowing only key-based queries. A typical query on such databases is based on the use of protein identifiers and enables the retrieval of one or more proteins. Nevertheless, a lot of biological information is available, spread across different sources and encoded in different ontologies such as the Gene Ontology. The integration of existing PPI databases and biological information may result in richer query interfaces and could subsequently enable the development of novel algorithms that use biological information. The OntoPIN project demonstrated the effectiveness of a framework for the ontology-based management and querying of protein-protein interaction data. The OntoPIN framework first merges PPI data with annotations extracted from existing ontologies (e.g. Gene Ontology) and stores the annotated data in a database. Then, a semantic-based query interface enables users to query these data by using biological concepts. OntoPIN allows: (a) extending existing PPI databases by using ontologies, (b) key-based querying of annotated data, and (c) a novel query interface based on semantic similarity among annotations.

  18. OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

    Science.gov (United States)

    Huffer, E.; Gleason, J. L.

    2015-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. 
However, as metadata requirements increase and competing standards emerge

  19. Assessing the IADC Space Debris Mitigation Guidelines: A Case for Ontology-based Data Management

    Science.gov (United States)

    Walls, R.; Gaylor, D.; Reddy, V.; Furfaro, R.; Jah, M.

    2016-09-01

    As the population of man-made debris orbiting the Earth increases, so does the risk of damaging collisions. The Inter-Agency Space Debris Coordination Committee (IADC) has issued space debris mitigation guidelines including a key recommendation that before mission's end, spacecraft should move far enough from GEO so as not to be an operational hazard to other objects in active missions. It can be extremely difficult to determine if a spacecraft or operator is in compliance with this guideline, as it requires prediction of future actions based upon many data types. Furthermore, there has been no comprehensive assessment of the adequacy or validity of the IADC recommendations. The EU strives for a Code of Conduct in space, the United Nations-Committee On Peaceful Uses of Outer Space (UN-COPUOS) strives for guidelines to ensure the Long Term Sustainability of Space Activities (LTSSA), the FAA is concerned with Space Traffic Management (STM), etc. If rules, policies, guidelines, and laws are put in place, how can any entity know who and what is adhering to them, when we don't even know how to quantify and assess behavior of space objects? The University of Arizona aims to address this salient issue. As part of its new Space Object Behavioral Sciences (SOBS) initiative, the University of Arizona is developing an ontology-based system to support integration, use, and sharing of space domain data. As a first use-case, we will test the system's ability to assess compliance with the IADC recommendation to move beyond GEO at the end of a mission as well as the adequacy and validity of recommendations. We describe the relevant data types gathered for this use-case, present a prototype ontology, and outline methods for combining semantic analysis with astrodynamics modeling. Without loss of generality, we present this method as an approach that will form the foundation of SOBS and be used to address pressing challenges in Space Situational Awareness (SSA), Orbital Safety

  20. Literature Mining and Ontology based Analysis of Host-Brucella Gene-Gene Interaction Network.

    Science.gov (United States)

    Karadeniz, İlknur; Hur, Junguk; He, Yongqun; Özgür, Arzucan

    2015-01-01

    …The results show that the introduced literature mining and ontology-based modeling approach is effective in retrieving and analyzing host-pathogen gene-gene interaction networks.

  1. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.
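
    The rule-based phenotype prediction described above can be sketched as set-containment rules over a genome's annotated pathways. The rules and pathway names here are invented for illustration and are not IMG's actual rule base:

```python
# Sketch: assert a phenotype when all reactions/pathways required by its
# rule are present in the genome's functional annotation.
RULES = {
    "nitrogen_fixer": {"nitrogenase_complex"},
    "motile": {"flagellar_assembly", "chemotaxis"},
}

def predict_phenotypes(annotated_pathways):
    """Phenotypes whose required pathway set is a subset of the annotation."""
    return {p for p, required in RULES.items() if required <= annotated_pathways}

predicted = predict_phenotypes({"flagellar_assembly", "chemotaxis", "glycolysis"})
# Predicted phenotypes are then compared against experimentally observed ones;
# disagreements point at missing or erroneous functional annotations.
```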

  2. Ontology-based retrieval of bio-medical information based on microarray text corpora

    DEFF Research Database (Denmark)

    Hansen, Kim Allan; Zambach, Sine; Have, Christian Theil

    Microarray technology is often used in gene expression experiments. Information retrieval in the context of microarrays has mainly been concerned with the analysis of the numeric data produced; however, the experiments are often annotated with textual metadata. Although biomedical resources… are exponentially growing, the text corpora are sparse and inconsistent in spite of attempts to standardize the format. Ordinary keyword search may in some cases be insufficient to find relevant information and the potential benefit of using a semantic approach in this context has only been investigated to a limited…

  3. xGENIA: A comprehensive OWL ontology based on the GENIA corpus

    OpenAIRE

    Rak, Rafal; Kurgan, Lukasz; Reformat, Marek

    2007-01-01

    The GENIA ontology is a taxonomy that was developed as a result of manual annotation of a subset of MEDLINE, the GENIA corpus. Both the ontology and corpus have been used as a benchmark to test and develop biological information extraction tools. Recent work shows, however, that there is a demand for a more comprehensive ontology that would go along with the corpus. We propose a complete OWL ontology built on top of the GENIA ontology utilizing the GENIA corpus. The proposed ontology includes...

  4. Maize microarray annotation database

    Directory of Open Access Journals (Sweden)

    Berger Dave K

    2011-10-01

    Full Text Available Abstract Background Microarray technology has matured over the past fifteen years into a cost-effective solution with established data analysis protocols for global gene expression profiling. The Agilent-016047 maize 44 K microarray was custom-designed from EST sequences, but only reporter sequences with EST accession numbers are publicly available. The following information is lacking: (a) reporter-gene model match, (b) number of reporters per gene model, (c) potential for cross-hybridization, (d) sense/antisense orientation of reporters, (e) position of the reporter on the B73 genome sequence (for eQTL studies), and (f) functional annotations of genes represented by reporters. To address this, we developed a strategy to annotate the Agilent-016047 maize microarray, and built a publicly accessible annotation database. Description Genomic annotation of the 42,034 reporters on the Agilent-016047 maize microarray was based on BLASTN results of the 60-mer reporter sequences and their corresponding ESTs against the maize B73 RefGen v2 "Working Gene Set" (WGS) predicted transcripts and the genome sequence. The agreement between the EST, WGS transcript, and gDNA BLASTN results was used to assign the reporters into six genomic annotation groups. These annotation groups were: (i) "annotation by sense gene model" (23,668 reporters); (ii) "annotation by antisense gene model" (4,330); (iii) "annotation by gDNA" without a WGS transcript hit (1,549); (iv) "annotation by EST", in which case the EST from which the reporter was designed, but not the reporter itself, has a WGS transcript hit (3,390); (v) "ambiguous annotation" (2,608); and (vi) "inconclusive annotation" (6,489). Functional annotations of reporters were obtained by BLASTX and Blast2GO analysis of corresponding WGS transcripts against GenBank. The annotations are available in the Maize Microarray Annotation Database http://MaizeArrayAnnot.bi.up.ac.za/, as well as through a GBrowse annotation file that can be uploaded to
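
    The six-way grouping amounts to a decision over which BLASTN hits agree. A hedged sketch follows; the flag names are illustrative, and the real pipeline works from alignment details rather than booleans:

```python
# Sketch: assign a reporter to one of the six annotation groups from the
# agreement of its BLASTN results (boolean simplification, illustrative only).
def annotation_group(reporter_wgs_hit, est_wgs_hit, gdna_hit, sense, ambiguous):
    if ambiguous:
        return "ambiguous annotation"
    if reporter_wgs_hit:                       # reporter itself hits a gene model
        return ("annotation by sense gene model" if sense
                else "annotation by antisense gene model")
    if est_wgs_hit:                            # only the source EST hits a transcript
        return "annotation by EST"
    if gdna_hit:                               # genome hit without a WGS transcript
        return "annotation by gDNA"
    return "inconclusive annotation"
```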

  5. An Ontology-Based GIS for Genomic Data Management of Rumen Microbes.

    Science.gov (United States)

    Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Minuchehr, Zarrin; Nassiri, Mohammad Reza

    2015-03-01

    During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.
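
    Linear referencing treats the genome as a route and each gene as an event with start and end measures along it. A minimal sketch with made-up coordinates (the actual application stores these as ArcGIS event tables on the genome layer):

```python
# Sketch: genes as linear events along a genome "route" (toy coordinates).
events = [
    {"gene": "geneA", "start": 120, "end": 980},
    {"gene": "geneB", "start": 1500, "end": 2300},
]

def genes_at(position):
    """Genes whose event interval covers the given measure along the genome."""
    return [e["gene"] for e in events if e["start"] <= position <= e["end"]]
```

    Editing the event table (attributes, measures) then updates the rendered gene events on the genome layer, which is the interaction the abstract describes.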

  6. SANB-SEB Clustering: A Hybrid Ontology Based Image and Webpage Retrieval for Knowledge Extraction

    Directory of Open Access Journals (Sweden)

    Anna Saro Vijendran

    2014-12-01

    Full Text Available The major goal of data mining is to extract information from a dataset and convert it into a readable format. Web mining is an application of data mining that helps extract information from web pages. In existing systems, personalized images were retrieved using tag-annotation-demand ranking for image retrieval (TAD), in which image uploading, query searching, and page refreshing steps take place. In the proposed work, both images and web pages are retrieved by several techniques. Two major steps are followed: the primary step is the server database upload, in which databases for both images and content are stored using block acquiring page segmentation (BAPS). The subsequent step is to extract the image and content from the respective server database. The resulting database is further processed by semantic annotation based clustering (SANB) for images and semantic based clustering (SEB) for content. The experimental results show that the proposed approach accurately retrieves both the images and the relevant pages.

  7. User Centered and Ontology Based Information Retrieval System for Life Sciences

    CERN Document Server

    Ranwez, Sylvie; Sy, Mohameth-François; Montmain, Jacky; Crampes, Michel

    2010-01-01

    Because of the increasing number of electronic data, designing efficient tools to retrieve and exploit documents is a major challenge. Current search engines suffer from two main drawbacks: there is limited interaction with the list of retrieved documents and no explanation for their adequacy to the query. Users may thus be confused by the selection and have no idea how to adapt their query so that the results match their expectations. This paper describes a request method and an environment based on aggregating models to assess the relevance of documents annotated by concepts of ontology. The selection of documents is then displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive exploration of data corpus.

  8. An ontology-based search engine for digital reconstructions of neuronal morphology.

    Science.gov (United States)

    Polavaram, Sridevi; Ascoli, Giorgio A

    2017-03-23

    Neuronal morphology is extremely diverse across and within animal species, developmental stages, brain regions, and cell types. This diversity is functionally important because neuronal structure strongly affects synaptic integration, spiking dynamics, and network connectivity. Digital reconstructions of axonal and dendritic arbors are thus essential to quantify and model information processing in the nervous system. NeuroMorpho.Org is an established repository containing tens of thousands of digitally reconstructed neurons shared by several hundred laboratories worldwide. Each neuron is annotated with specific metadata based on the published references and additional details provided by data owners. The number of represented metadata concepts has grown over the years in parallel with the increase of available data. Until now, however, the lack of standardized terminologies and of an adequately structured metadata schema limited the effectiveness of user searches. Here we present a new organization of NeuroMorpho.Org metadata grounded on a set of interconnected hierarchies focusing on the main dimensions of animal species, anatomical regions, and cell types. We have comprehensively mapped each metadata term in NeuroMorpho.Org to this formal ontology, explicitly resolving all ambiguities caused by synonymy and homonymy. Leveraging this consistent framework, we introduce OntoSearch, a powerful functionality that seamlessly enables retrieval of morphological data based on expert knowledge and logical inferences through an intuitive string-based user interface with auto-complete capability. In addition to returning the data directly matching the search criteria, OntoSearch also identifies a pool of possible hits by taking into consideration incomplete metadata annotation.

  9. Ontology-based approach for in vivo human connectomics: the medial Brodmann area 6 case study.

    Science.gov (United States)

    Moreau, Tristan; Gibaud, Bernard

    2015-01-01

    Different non-invasive neuroimaging modalities and multi-level analyses of human connectomics datasets yield a great amount of heterogeneous data which are hard to integrate into a unified representation. Biomedical ontologies can provide a suitable integrative framework for domain knowledge as well as a tool to facilitate information retrieval, data sharing and data comparison across scales, modalities and species. In particular, there is an urgent need to fill the gap between neurobiology and in vivo human connectomics in order to better account for the reality highlighted by Magnetic Resonance Imaging (MRI) and relate it to existing brain knowledge. The aim of this study was to create a neuroanatomical ontology, called the "Human Connectomics Ontology" (HCO), in order to represent macroscopic gray matter regions connected by fiber bundles assessed by diffusion tractography and to annotate MRI connectomics datasets acquired in the living human brain. First, a neuroanatomical "view" called NEURO-DL-FMA was extracted from the reference ontology Foundational Model of Anatomy (FMA) in order to construct a gross anatomy ontology of the brain. HCO extends NEURO-DL-FMA by introducing entities (such as "MR_Node" and "MR_Route") and object properties (such as "tracto_connects") pertaining to MR connectivity. The Web Ontology Language Description Logics (OWL DL) formalism was used in order to enable reasoning with common reasoning engines. Moreover, an experimental study was carried out to demonstrate how the HCO can be used to address complex queries concerning in vivo MRI connectomics datasets. Neuroimaging datasets of five healthy subjects were annotated with terms of the HCO, and a multi-level analysis of the connectivity patterns of the right medial Brodmann Area 6 assessed by diffusion tractography was achieved using a set of queries. This approach can facilitate comparison of data across scales, modalities and species.

  10. Ontology-based approach for in vivo human connectomics: the medial Brodmann area 6 case study

    Directory of Open Access Journals (Sweden)

    Tristan eMoreau

    2015-04-01

    Full Text Available Different non-invasive neuroimaging modalities and multi-level analyses of human connectomics datasets yield a great amount of heterogeneous data which are hard to integrate into a unified representation. Biomedical ontologies can provide a suitable integrative framework for domain knowledge as well as a tool to facilitate information retrieval, data sharing and data comparison across scales, modalities and species. In particular, there is an urgent need to fill the gap between neurobiology and in vivo human connectomics in order to better account for the reality highlighted by Magnetic Resonance Imaging (MRI) and relate it to existing brain knowledge. The aim of this study was to create a neuroanatomical ontology, called the Human Connectomics Ontology (HCO), in order to represent macroscopic gray matter regions connected by fiber bundles assessed by diffusion tractography and to annotate MRI connectomics datasets acquired in the living human brain. First, a neuroanatomical view called NEURO-DL-FMA was extracted from the reference ontology Foundational Model of Anatomy (FMA) in order to construct a gross anatomy ontology of the brain. HCO extends NEURO-DL-FMA by introducing entities (such as MR_Node and MR_Route) and object properties (such as tracto_connects) pertaining to MR connectivity. The Web Ontology Language Description Logics (OWL DL) formalism was used in order to enable reasoning with common reasoning engines. Moreover, an experimental study was carried out to demonstrate how the HCO can be used to address complex queries concerning in vivo MRI connectomics datasets. Neuroimaging datasets of five healthy subjects were annotated with terms of the HCO, and a multi-level analysis of the connectivity patterns of the right medial Brodmann Area 6 assessed by diffusion tractography was achieved using a set of queries. This approach can facilitate comparison of data across scales, modalities and species.

  11. Dynamic multimedia annotation tool

    Science.gov (United States)

    Pfund, Thomas; Marchand-Maillet, Stephane

    2001-12-01

    Annotating image collections is crucial for different multimedia applications. Not only does this provide an alternative means of access to visual information, it is a critical step in the evaluation of content-based image retrieval systems. Annotation is a tedious task, so there is a real need for tools that lighten the annotators' work. Such a tool should be flexible and offer customization so as to make annotators as comfortable as possible, and it should automate as many tasks as possible. In this paper, we present a still-image annotation tool that has been developed with the aim of being flexible and adaptive. The principle is to create a set of dynamic web pages that serve as an interface to a SQL database. The keyword set is fixed, and every image receives from concurrent annotators a set of keywords along with time stamps and annotator IDs. Each annotator can move back and forth within the collection and his or her previous annotations, helped by a number of search services and customization options. An administrative section allows the supervisor to control the parameters of the annotation, including the keyword set, given via an XML structure. The architecture of the tool is flexible so as to accommodate further options through its development.

  12. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease.

    Science.gov (United States)

    Sifrim, Alejandro; Van Houdt, Jeroen Kj; Tranchevent, Leon-Charles; Nowakowska, Beata; Sakai, Ryo; Pavlopoulos, Georgios A; Devriendt, Koen; Vermeesch, Joris R; Moreau, Yves; Aerts, Jan

    2012-01-01

    The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.

  13. Research on Ontology-based Automatic Annotation for Deep Web (基于本体的Deep Web自动标注方法研究)

    Institute of Scientific and Technical Information of China (English)

    张玉连; 李帅; 周兴林

    2009-01-01

    Building on the Deep Web annotation method based on query-interface schemas, this paper proposes a Deep Web annotation method based on the visual information of web pages, in which ontology phrases replace the original annotation labels. This replacement ensures the consistency of the annotations, compensates well for many shortcomings of the original method, and effectively improves its precision and recall.

  14. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations.

  15. Gene Ontology based housekeeping gene selection for RNA-seq normalization.

    Science.gov (United States)

    Chen, Chien-Ming; Lu, Yu-Lun; Sio, Chi-Pong; Wu, Guan-Chung; Tzou, Wen-Shyong; Pai, Tun-Wen

    2014-06-01

    RNA-seq analysis provides a powerful tool for revealing relationships between gene expression level and biological function of proteins. In order to identify differentially expressed genes among various RNA-seq datasets obtained from different experimental designs, the first challenging problem is an appropriate normalization method for calibrating multiple experimental datasets. We propose a novel method to help biologists select a set of suitable housekeeping genes for inter-sample normalization. The approach combines user-defined, experimentally related keywords, GO annotations, GO term distance matrices, orthologous housekeeping gene candidates, and stability ranking of housekeeping genes. By identifying the GO terms most distant from the query keywords and selecting housekeeping gene candidates with low coefficients of variation across different spatio-temporal datasets, the proposed method can automatically enumerate a set of functionally irrelevant housekeeping genes for practical normalization. Novel and benchmark RNA-seq test datasets were used to demonstrate that different selections of housekeeping genes have a strong impact on differential gene expression analysis, and comparative results show that our proposed method outperformed other traditional approaches in terms of both sensitivity and specificity. The proposed mechanism for selecting appropriate housekeeping genes for inter-dataset normalization is robust and accurate for differential expression analyses. Copyright © 2014 Elsevier Inc. All rights reserved.
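The stability criterion described above, ranking candidate housekeeping genes by their coefficient of variation across datasets, can be sketched as follows (hypothetical expression values; the published method additionally filters candidates by GO-term distance from the query keywords):

```python
import statistics

# hypothetical normalized expression values: gene -> values across
# several spatio-temporal RNA-seq datasets
expression = {
    "ACTB":  [105.0, 98.0, 102.0, 101.0],
    "GAPDH": [88.0, 91.0, 90.0, 87.0],
    "MYOD1": [5.0, 240.0, 12.0, 150.0],   # tissue-specific, unstable
}

def coefficient_of_variation(values):
    """CV = sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def stable_genes(expr, max_cv=0.1):
    """Rank candidate housekeeping genes by stability (lowest CV first)
    and keep only those below the CV threshold."""
    ranked = sorted(expr, key=lambda g: coefficient_of_variation(expr[g]))
    return [g for g in ranked if coefficient_of_variation(expr[g]) <= max_cv]

print(stable_genes(expression))  # ['GAPDH', 'ACTB']
```

MYOD1 is rejected because its expression varies strongly between datasets, exactly the behaviour that makes a gene unsuitable as a normalization reference.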

  16. Saliva Ontology: An ontology-based framework for a Salivaomics Knowledge Base

    Directory of Open Access Journals (Sweden)

    Smith Barry

    2010-06-01

    Full Text Available Abstract Background The Salivaomics Knowledge Base (SKB) is designed to serve as a computational infrastructure that can permit global exploration and utilization of data and information relevant to salivaomics. SKB is created by aligning (1) the saliva biomarker discovery and validation resources at UCLA with (2) the ontology resources developed by the OBO (Open Biomedical Ontologies) Foundry, including a new Saliva Ontology (SALO). Results We define the Saliva Ontology (SALO; http://www.skb.ucla.edu/SALO/) as a consensus-based controlled vocabulary of terms and relations dedicated to the salivaomics domain and to saliva-related diagnostics following the principles of the OBO (Open Biomedical Ontologies) Foundry. Conclusions The Saliva Ontology is an ongoing exploratory initiative. The ontology will be used to facilitate salivaomics data retrieval and integration across multiple fields of research together with data analysis and data mining. The ontology will be tested through its ability to serve the annotation ('tagging') of a representative corpus of salivaomics research literature that is to be incorporated into the SKB.

  17. Improving the extraction of complex regulatory events from scientific text by using ontology-based inference

    Directory of Open Access Journals (Sweden)

    Kim Jung-jae

    2011-10-01

    Full Text Available Abstract Background The extraction of complex events from biomedical text is a challenging task and requires in-depth semantic analysis. Previous approaches associate lexical and syntactic resources with ontologies for the semantic analysis, but fall short in testing the benefits from the use of domain knowledge. Results We developed a system that deduces implicit events from explicitly expressed events by using inference rules that encode domain knowledge. We evaluated the system with the inference module on three tasks: First, when tested against a corpus with manually annotated events, the inference module of our system contributes 53.2% of correct extractions, but does not cause any incorrect results. Second, the system overall reproduces 33.1% of the transcription regulatory events contained in RegulonDB (up to 85.0% precision) and the inference module is required for 93.8% of the reproduced events. Third, we applied the system with minimum adaptations to the identification of cell activity regulation events, confirming that the inference improves the performance of the system also on this task. Conclusions Our research shows that the inference based on domain knowledge plays a significant role in extracting complex events from text. This approach has great potential in recognizing the complex concepts of such biomedical ontologies as Gene Ontology in the literature.
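Deducing implicit events from explicit ones via domain rules can be sketched as simple forward chaining (hypothetical relation names and facts; this is not the system's actual rule syntax, just the principle):

```python
# hypothetical explicit events as extracted from text
events = {
    ("ArcA", "represses_transcription_of", "sodA"),
    ("sodA", "encodes", "SodA"),
}

def apply_rules(facts):
    """Forward-chain one domain rule: repressing transcription of a gene
    implies repressing expression of the protein the gene encodes."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for x, p, g in list(inferred):
            if p != "represses_transcription_of":
                continue
            for g2, p2, prot in list(inferred):
                if g2 == g and p2 == "encodes":
                    new = (x, "represses_expression_of", prot)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

facts = apply_rules(events)
print(("ArcA", "represses_expression_of", "SodA") in facts)  # True
```

The implicit regulation event is never stated in the text; it only exists because the rule encodes the gene-to-product relationship, which is the kind of contribution the inference module makes in the evaluation above.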

  18. PPISEARCHENGINE: gene ontology-based search for protein-protein interactions.

    Science.gov (United States)

    Park, Byungkyu; Cui, Guangyu; Lee, Hyunjin; Huang, De-Shuang; Han, Kyungsook

    2013-01-01

    This paper presents a new search engine called PPISearchEngine which finds protein-protein interactions (PPIs) using the gene ontology (GO) and the biological relations of proteins. For efficient retrieval of PPIs, each GO term is assigned a prime number and the relation between the terms is represented by the product of prime numbers. This representation is hidden from users but facilitates the search for the interactions of a query protein by unique prime factorisation of the number that represents the query protein. For a query protein, PPISearchEngine considers not only the GO term associated with the query protein but also the GO terms at the lower level than the GO term in the GO hierarchy, and finds all the interactions of the query protein which satisfy the search condition. In contrast, the standard keyword-matching or ID-matching search method cannot find the interactions of a protein unless the interactions involve a protein with explicit annotations. To the best of our knowledge, this search engine is the first method that can process queries like 'for protein p with GO [Formula: see text], find p's interaction partners with GO [Formula: see text]'. PPISearchEngine is freely available to academics at http://search.hpid.org/.
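The prime-number representation is easy to illustrate: each term maps to a prime, a set of annotations becomes the product of its primes, and membership is a divisibility test via unique prime factorisation. A minimal sketch with hypothetical GO IDs (the real engine assigns primes to actual GO terms and their hierarchy):

```python
def gen_primes(n):
    """Generate the first n primes by trial division (for illustration)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

# hypothetical GO terms; real IDs would come from the Gene Ontology
go_terms = ["GO:0003674", "GO:0005515", "GO:0008150", "GO:0009987"]
prime_of = dict(zip(go_terms, gen_primes(len(go_terms))))

def encode(terms):
    """Represent a set of GO annotations as a product of primes."""
    code = 1
    for t in terms:
        code *= prime_of[t]
    return code

def has_term(code, term):
    """Membership test: by unique prime factorisation, a term is in the
    set iff its prime divides the product."""
    return code % prime_of[term] == 0

protein_code = encode(["GO:0005515", "GO:0009987"])
print(has_term(protein_code, "GO:0005515"))  # True
print(has_term(protein_code, "GO:0003674"))  # False
```

Because factorisation into primes is unique, a single integer losslessly encodes the whole annotation set, which is what lets the engine hide the representation from users while still answering term-based queries.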

  19. Orymold: ontology based gene expression data integration and analysis tool applied to rice

    Directory of Open Access Journals (Sweden)

    Segura Jordi

    2009-05-01

    Full Text Available Abstract Background Integration and exploration of data obtained from genome wide monitoring technologies has become a major challenge for many bioinformaticists and biologists due to its heterogeneity and high dimensionality. A widely accepted approach to solve these issues has been the creation and use of controlled vocabularies (ontologies). Ontologies allow for the formalization of domain knowledge, which in turn enables generalization in the creation of querying interfaces as well as in the integration of heterogeneous data, providing both human and machine readable interfaces. Results We designed and implemented a software tool that allows investigators to create their own semantic model of an organism and to use it to dynamically integrate expression data obtained from DNA microarrays and other probe based technologies. The software provides tools to use the semantic model to postulate and validate hypotheses on the spatial and temporal expression and function of genes. In order to illustrate the software's use and features, we used it to build a semantic model of rice (Oryza sativa) and integrated experimental data into it. Conclusion In this paper we describe the development and features of a flexible software application for dynamic gene expression data annotation, integration, and exploration called Orymold. Orymold is freely available for non-commercial users from http://www.oryzon.com/media/orymold.html

  20. An incremental and distributed inference method for large-scale ontologies based on MapReduce paradigm.

    Science.gov (United States)

    Liu, Bo; Huang, Keman; Li, Jianqiang; Zhou, MengChu

    2015-01-01

    With the coming deluge of semantic data, the fast growth of ontology bases has brought significant challenges in performing efficient and scalable reasoning. Traditional centralized reasoning methods are not sufficient to process large ontologies. Distributed reasoning methods are thus required to improve the scalability and performance of inferences. This paper proposes an incremental and distributed inference method for large-scale ontologies by using MapReduce, which realizes high-performance reasoning and runtime searching, especially for incremental knowledge bases. By constructing transfer inference forests and effective assertional triples, the storage is largely reduced and the reasoning process is simplified and accelerated. Finally, a prototype system is implemented on a Hadoop framework and the experimental results validate the usability and effectiveness of the proposed approach.
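The map/reduce style of ontology reasoning the paper builds on can be sketched in plain Python with toy triples and one RDFS-style rule (type inheritance over subclass axioms). The transfer inference forest and the Hadoop deployment are beyond this sketch; the point is only how map keys act as join variables:

```python
from collections import defaultdict

# toy assertional triples
triples = [
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
]

def map_phase(triples):
    """Key each triple by the join variable of rule rdfs9
    (x type C, C subClassOf D => x type D)."""
    for s, p, o in triples:
        if p == "subClassOf":
            yield s, ("super", o)   # join on the subclass
        elif p == "type":
            yield o, ("inst", s)    # join on the class

def reduce_phase(grouped):
    """Emit inferred type triples for each join key."""
    for _key, values in grouped.items():
        supers = [v for tag, v in values if tag == "super"]
        insts = [v for tag, v in values if tag == "inst"]
        for sup in supers:
            for i in insts:
                yield (i, "type", sup)

def infer_closure(triples):
    """Iterate map/reduce rounds until no new triples appear (fixpoint)."""
    known = set(triples)
    while True:
        grouped = defaultdict(list)
        for k, v in map_phase(known):
            grouped[k].append(v)
        new = set(reduce_phase(grouped)) - known
        if not new:
            return known
        known |= new

closure = infer_closure(triples)
print(("rex", "type", "Animal") in closure)  # True
```

Each round corresponds to one MapReduce job; incremental methods like the one proposed aim to avoid re-running the full fixpoint when new triples arrive.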

  1. Ontology-Based Gap Analysis for Technology Selection: A Knowledge Management Framework for the Support of Equipment Purchasing Processes

    Science.gov (United States)

    Macris, Aristomenis M.; Georgakellos, Dimitrios A.

    Technology selection decisions such as equipment purchasing and supplier selection are of strategic importance to companies. These decisions are usually complex and unstructured and thus difficult to capture in a way that is efficiently reusable. Knowledge reusability is of paramount importance since it enables users to participate actively in process design/redesign activities stimulated by the changing technology selection environment. This paper addresses the technology selection problem through an ontology-based approach that captures and makes reusable the equipment purchasing process and assists in (a) identifying the specifications requested by the users' organization, (b) identifying those offered by various candidate vendors' organizations, and (c) performing specifications gap analysis as a prerequisite for effective and efficient technology selection. This approach has practical appeal, operational simplicity, and the potential for both immediate and long-term strategic impact. An example from the iron and steel industry is also presented to illustrate the approach.
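At its core, the specifications gap analysis step reduces to a set difference between requested and offered specifications. A minimal sketch with hypothetical specification labels and vendor names (the paper's ontology additionally captures the purchasing process itself):

```python
# hypothetical specifications requested by the users' organization
requested = {"throughput>=200t/h", "max_temp<=1650C", "ISO9001"}

# hypothetical specifications offered by candidate vendors
vendor_offers = {
    "VendorA": {"throughput>=200t/h", "ISO9001"},
    "VendorB": {"throughput>=200t/h", "max_temp<=1650C", "ISO9001", "CE"},
}

def spec_gap(requested, offered):
    """Requested specifications the vendor does not cover."""
    return sorted(requested - offered)

for vendor, offered in vendor_offers.items():
    print(vendor, spec_gap(requested, offered))
# VendorA ['max_temp<=1650C']
# VendorB []
```

An empty gap means the vendor meets every requested specification; a non-empty gap lists exactly what a purchasing decision must weigh or negotiate.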

  2. OntologyNavigator: WEB 2.0 scalable ontology based CLIR portal to IT scientific corpus for researchers

    CERN Document Server

    Kembellec, Gérald; Sauvaget, Catherine

    2011-01-01

    This work presents the architecture used in the ongoing OntologyNavigator project. It is a research tool to help advanced learners to find adapted IT papers to create scientific bibliographies. The purpose is the use of an IT representation as educational research software for researchers. We use an ontology based on the ACM's Computing Classification System in order to find scientific papers directly related to the new researcher's domain without any formal request. An ontology translation in French is automatically proposed and can be based on Web 2.0 enhanced by a community of users. A visualization and navigation model is proposed to make it more accessible and examples are given to show the interface of the tool. This model offers the possibility of cross language query. Users deeply interact with the translation by providing alternative translation of the node label. Customers also enrich the ontology node labels with implicit descriptors.

  3. An Effective Security Mechanism for M-Commerce Applications Exploiting Ontology Based Access Control Model for Healthcare System

    Directory of Open Access Journals (Sweden)

    S.M. Roychoudri

    2016-09-01

    Full Text Available Health organizations have begun moving services to mobile commerce in recent years to enhance services and quality without investing heavily in IT infrastructure. Medical records are very sensitive and private for any individual, hence an effective security mechanism is required. The challenges of our research work are to maintain privacy for users and to provide a smart and secure environment for accessing the application. This is achieved with the help of personalization. The Internet has paved the way for personalization, a term which refers to the delivery of information that is relevant to an individual or a group of individuals, in a specified format and layout and within specified time intervals. In this paper we propose an Ontology Based Access Control (OBAC) model that can address permitted access control between service providers and users. Sharing of Personal Health Records is highly expected by users for the acceptance of mobile commerce applications in healthcare systems.

  4. Logical Gene Ontology Annotations (GOAL): exploring gene ontology annotations with OWL.

    Science.gov (United States)

    Jupp, Simon; Stevens, Robert; Hoehndorf, Robert

    2012-04-24

    Ontologies such as the Gene Ontology (GO) and their use in annotations make cross species comparisons of genes possible, along with a wide range of other analytical activities. The bio-ontologies community, in particular the Open Biomedical Ontologies (OBO) community, have provided many other ontologies and an increasingly large volume of annotations of gene products that can be exploited in query and analysis. As many annotations with different ontologies centre upon gene products, there is a possibility to explore gene products through multiple ontological perspectives at the same time. Questions could be asked that link a gene product's function, process, cellular location, phenotype and disease. Current tools, such as AmiGO, allow exploration of genes based on their GO annotations, but not through multiple ontological perspectives. In addition, the semantics of these ontologies' representations should, through automated reasoning, afford richer query opportunities over gene product annotations than are currently possible. To enable this multi-perspective, richer querying of gene product annotations, we have created the Logical Gene Ontology, or GOAL ontology, in OWL, which combines the Gene Ontology, Human Disease Ontology and the Mammalian Phenotype Ontology, together with classes that represent the annotations with these ontologies for mouse gene products. Each mouse gene product is represented as a class, with the appropriate relationships to the GO aspects, phenotype and disease with which it has been annotated. We then use defined classes to query these protein classes through automated reasoning, and to build a complex hierarchy of gene products. We have presented this through a Web interface that allows arbitrary queries to be constructed and the results displayed. This standard use of OWL affords a rich interaction with Gene Ontology, Human Disease Ontology and Mammalian Phenotype Ontology annotations for the mouse, to give a fine partitioning of

  5. Annotating Coloured Petri Nets

    DEFF Research Database (Denmark)

    Lindstrøm, Bo; Wells, Lisa Marie

    2002-01-01

    Coloured Petri nets (CP-nets) can be used for several fundamentally different purposes like functional analysis, performance analysis, and visualisation. To be able to use the corresponding tool extensions and libraries it is sometimes necessary to include extra auxiliary information in the CP-net. We present a method which makes it possible to associate auxiliary information, called annotations, with tokens without modifying the colour sets of the CP-net. Annotations are pieces of information that are not essential for determining the behaviour of the system being modelled, but are rather added to support a certain use of the CP-net. We define the semantics of annotations by describing a translation from a CP-net and the corresponding annotation layers to another CP-net where the annotations are an integrated part of the CP-net.

  6. Ontology-based representation and analysis of host-Brucella interactions.

    Science.gov (United States)

    Lin, Yu; Xiang, Zuoshuang; He, Yongqun

    2015-01-01

    Biomedical ontologies are representations of classes of entities in the biomedical domain and how these classes are related in computer- and human-interpretable formats. Ontologies support data standardization and exchange and provide a basis for computer-assisted automated reasoning. IDOBRU is an ontology in the domain of Brucella and brucellosis. Brucella is a Gram-negative intracellular bacterium that causes brucellosis, the most common zoonotic disease in the world. In this study, IDOBRU is used as a platform to model and analyze how the hosts, especially host macrophages, interact with virulent Brucella strains or live attenuated Brucella vaccine strains. Such a study allows us to better integrate and understand intricate Brucella pathogenesis and host immunity mechanisms. Different levels of host-Brucella interactions based on different host cell types and Brucella strains were first defined ontologically. Three important processes of virulent Brucella interacting with host macrophages were represented: Brucella entry into macrophages, intracellular trafficking, and intracellular replication. Two Brucella pathogenesis mechanisms were ontologically represented: the Brucella Type IV secretion system that supports intracellular trafficking and replication, and Brucella erythritol metabolism that participates in Brucella intracellular survival and pathogenesis. The host cell death pathway is critical to the outcome of host-Brucella interactions. For better survival and replication, virulent Brucella prevents macrophage cell death. However, live attenuated B. abortus vaccine strain RB51 induces caspase-2-mediated proinflammatory cell death. Brucella-associated cell death processes are represented in IDOBRU. The gene and protein information of 432 manually annotated Brucella virulence factors was represented using the Ontology of Genes and Genomes (OGG) and Protein Ontology (PRO), respectively. Seven inference rules were defined to capture the knowledge of host

  7. Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Science.gov (United States)

    2011-01-01

    Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were

  8. The Mammalian Phenotype Ontology as a unifying standard for experimental and high-throughput phenotyping data.

    Science.gov (United States)

    Smith, Cynthia L; Eppig, Janan T

    2012-10-01

    The Mammalian Phenotype Ontology (MP) is a structured vocabulary for describing mammalian phenotypes and serves as a critical tool for efficient annotation and comprehensive retrieval of phenotype data. Importantly, the ontology contains broad and specific terms, facilitating annotation of data from initial observations or screens and detailed data from subsequent experimental research. Using the ontology structure, data are retrieved inclusively, i.e., data annotated to chosen terms and to terms subordinate in the hierarchy. Thus, searching for "abnormal craniofacial morphology" also returns annotations to "megacephaly" and "microcephaly," more specific terms in the hierarchy path. The development and refinement of the MP is ongoing, with new terms and modifications to its organization undergoing continuous assessment as users and expert reviewers propose expansions and revisions. A wealth of phenotype data on mouse mutations and variants annotated to the MP already exists in the Mouse Genome Informatics database. These data, along with data curated to the MP by many mouse mutagenesis programs and mouse repositories, provide a platform for comparative analyses and correlative discoveries. The MP provides a standard underpinning to mouse phenotype descriptions for existing and future experimental and large-scale phenotyping projects. In this review we describe the MP as it presently exists, its application to phenotype annotations, the relationship of the MP to other ontologies, and the integration of the MP within large-scale phenotyping projects. Finally we discuss future application of the MP in providing standard descriptors of the phenotype pipeline test results from the International Mouse Phenotype Consortium projects.
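The inclusive retrieval described above, where a query term also returns annotations to its subordinate terms, can be sketched with a toy hierarchy (hypothetical genotypes and a tiny MP-like fragment; the real MP is a large DAG):

```python
from collections import defaultdict

# hypothetical fragment of an MP-like hierarchy (child -> parents)
is_a = {
    "megacephaly": ["abnormal craniofacial morphology"],
    "microcephaly": ["abnormal craniofacial morphology"],
    "abnormal craniofacial morphology": ["abnormal morphology"],
}

# hypothetical genotype -> annotated term
annotations = {
    "Fgfr2<mut>": "megacephaly",
    "Mcph1<mut>": "microcephaly",
    "Shh<mut>": "abnormal morphology",
}

def descendants(term):
    """The term plus everything below it in the hierarchy."""
    children = defaultdict(set)
    for child, parents in is_a.items():
        for p in parents:
            children[p].add(child)
    result, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in result:
            result.add(t)
            stack.extend(children[t])
    return result

def search(term):
    """Inclusive retrieval: annotations to the term or any subordinate term."""
    wanted = descendants(term)
    return sorted(g for g, t in annotations.items() if t in wanted)

print(search("abnormal craniofacial morphology"))
# ['Fgfr2<mut>', 'Mcph1<mut>']
```

Note that the genotype annotated only to the broader term "abnormal morphology" is not returned, since inclusive retrieval walks down the hierarchy, never up.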

  9. Personnalisation de Systèmes OLAP Annotés (Personalization of Annotated OLAP Systems)

    CERN Document Server

    Jerbi, Houssem; Ravat, Franck; Teste, Olivier

    2010-01-01

    This paper deals with personalization of annotated OLAP systems. Data constellation is extended to support annotations and user preferences. Annotations reflect the decision-maker experience whereas user preferences enable users to focus on the most interesting data. User preferences allow annotated contextual recommendations helping the decision-maker during his/her multidimensional navigations.

  10. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    ... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for the 60K chip. Genomic prediction was performed using the Bayes Cπ method and applied to 1,272 Duroc pigs with both genotypic and phenotypic records including residual feed intake (RFI), daily feed intake (DFI), average daily gain (ADG) and back fat (BF). Records were split into a training (968 pigs) and a validation dataset (304 pigs). SNPs were annotated by 14 different ... groups. Genomic prediction has accuracy comparable to an own phenotype, and use of genomic prediction can be cost effective by replacing feed intake measurement. Use of genomic annotation of SNPs and QTL information had no largely significant impact on predictive accuracy for the current traits but may ...

  11. Textpresso: an ontology-based information retrieval and extraction system for biological literature.

    Directory of Open Access Journals (Sweden)

    Hans-Michael Müller

    2004-11-01

    Full Text Available We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. Textpresso is a useful curation tool, as well as search engine for researchers, and can readily be extended to other

  12. Textpresso: an ontology-based information retrieval and extraction system for biological literature.

    Science.gov (United States)

    Müller, Hans-Michael; Kenny, Eimear E; Sternberg, Paul W

    2004-11-01

    We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. Textpresso is a useful curation tool, as well as search engine for researchers, and can readily be extended to other organism
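The category-tagged sentence search described above can be illustrated with a toy sketch: each sentence carries a set of ontology categories, and a query asks for sentences that contain a given combination of categories, optionally with a keyword. The sentences, category names, and gene symbols below are invented examples, not Textpresso's actual markup.

```python
# (sentence text, set of ontology categories marked up in it)
sentences = [
    ("lin-12 interacts with sel-8 in the vulva", {"gene", "association"}),
    ("the embryo develops rapidly", {"biological process"}),
    ("daf-2 regulates daf-16", {"gene", "regulation"}),
]

def search(required_categories, keyword=None):
    """Sentences whose category set includes all required tags
    (and, optionally, a literal keyword)."""
    hits = []
    for text, cats in sentences:
        if required_categories <= cats and (keyword is None or keyword in text):
            hits.append(text)
    return hits

print(search({"gene", "regulation"}))      # → ['daf-2 regulates daf-16']
print(search({"gene"}, keyword="lin-12"))  # → ['lin-12 interacts with sel-8 in the vulva']
```

Combining a category tag ("regulation") with named genes is the kind of semantic query the abstract credits with a 3-fold gain in search efficiency over plain keywords.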

  13. The CLEF corpus: semantic annotation of clinical text.

    Science.gov (United States)

    Roberts, Angus; Gaizauskas, Robert; Hepple, Mark; Davis, Neil; Demetriou, George; Guo, Yikun; Kola, Jay; Roberts, Ian; Setzer, Andrea; Tapuria, Archana; Wheeldin, Bill

    2007-10-11

    The Clinical E-Science Framework (CLEF) project is building a framework for the capture, integration and presentation of clinical information: for clinical research, evidence-based health care and genotype-meets-phenotype informatics. A significant portion of the information required by such a framework originates as text, even in EHR-savvy organizations. CLEF uses Information Extraction (IE) to make this unstructured information available. An important part of IE is the identification of semantic entities and relationships. Typical approaches require human annotated documents to provide both evaluation standards and material for system development. CLEF has a corpus of clinical narratives, histopathology reports and imaging reports from 20 thousand patients. We describe the selection of a subset of this corpus for manual annotation of clinical entities and relationships. We describe an annotation methodology and report encouraging initial results of inter-annotator agreement. Comparisons are made between different text sub-genres, and between annotators with different skills.

  14. An Introduction to Genome Annotation.

    Science.gov (United States)

    Campbell, Michael S; Yandell, Mark

    2015-12-17

    Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.

  15. Semantic annotation of mutable data.

    Science.gov (United States)

    Morris, Robert A; Dou, Lei; Hanken, James; Kelly, Maureen; Lowery, David B; Ludäscher, Bertram; Macklin, James A; Morris, Paul J

    2013-01-01

    Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema.

  16. Study on Digital Library’s Ontology-based Service Mode

    Institute of Scientific and Technical Information of China (English)

    王秀利

    2014-01-01

    This paper explains ontological thinking from the aspects of the connotation, functions, and construction of ontologies, and introduces in detail the construction of an ontology-based knowledge service mode for digital libraries, with the aim of helping users find the knowledge they need promptly and accurately within the massive information of a digital library.

  17. Algal functional annotation tool

    Energy Technology Data Exchange (ETDEWEB)

    2012-07-12

    BACKGROUND: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. DESCRIPTION: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes

  18. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

    Science.gov (United States)

    Mazandu, Gaston K; Mulder, Nicola J

    2013-09-25

    The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
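The IC-based similarity that tools like this precompute can be sketched with the classic Resnik measure: IC(t) = -log p(t), where p(t) is the fraction of annotations falling at t or below, and the similarity of two terms is the IC of their most informative common ancestor. The tiny hierarchy and annotation counts below are hypothetical, not GOA data, and Resnik is only one of the measures such a tool would offer.

```python
import math

# term -> set of ancestors, inclusive of the term itself (toy hierarchy).
ancestors = {
    "root": {"root"},
    "binding": {"root", "binding"},
    "dna_binding": {"root", "binding", "dna_binding"},
    "rna_binding": {"root", "binding", "rna_binding"},
}

# Hypothetical annotation counts at each term or its descendants.
annotation_counts = {"root": 100, "binding": 40, "dna_binding": 10, "rna_binding": 5}
total = annotation_counts["root"]

def ic(term):
    """Information content: rarer terms carry more information."""
    return -math.log(annotation_counts[term] / total)

def resnik(t1, t2):
    """IC of the most informative common ancestor of the two terms."""
    common = ancestors[t1] & ancestors[t2]
    return max(ic(t) for t in common)

print(round(resnik("dna_binding", "rna_binding"), 3))  # 0.916, i.e. IC("binding")
```

Precomputing `ic` for every term, as the abstract describes, is what makes pairwise queries cheap at request time.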

  19. Algal functional annotation tool

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, D. [UCLA; Casero, D. [UCLA; Cokus, S. J. [UCLA; Merchant, S. S. [UCLA; Pellegrini, M. [UCLA

    2012-07-01

    The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.
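The term-enrichment computation behind analysis suites of this kind is typically a hypergeometric tail test: given a user's gene list, is an annotation term over-represented relative to the genome? A minimal stdlib-only sketch with made-up counts (the tool's actual statistics and corrections are not specified here):

```python
from math import comb

N = 10000   # genes in the genome (hypothetical)
K = 200     # genome genes annotated with the term
n = 50      # genes in the user's list
k = 8       # list genes annotated with the term

# P(X >= k) for X ~ Hypergeometric(N, K, n): chance of seeing at least
# k annotated genes in a random list of size n.
p_value = sum(
    comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
) / comb(N, n)

print(f"term enrichment p-value: {p_value:.2e}")
```

With an expected count of n·K/N = 1 annotated gene per list, observing 8 yields a very small p-value, which is how a large gene list gets summarized into a few significant functional themes.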

  20. An ontology-based approach to patient follow-up assessment for continuous and personalized chronic disease management.

    Science.gov (United States)

    Zhang, Yi-Fan; Gou, Ling; Zhou, Tian-Shu; Lin, De-Nan; Zheng, Jing; Li, Ye; Li, Jing-Song

    2017-08-01

    Chronic diseases are complex and persistent clinical conditions that require close collaboration among patients and health care providers in the implementation of long-term and integrated care programs. However, current solutions focus mainly on intensive interventions at hospitals rather than on continuous and personalized chronic disease management. This study aims to fill this gap by providing computerized clinical decision support during follow-up assessments of chronically ill patients at home. We proposed an ontology-based framework to integrate patient data, medical domain knowledge, and patient assessment criteria for chronic disease patient follow-up assessments. A clinical decision support system was developed to implement this framework for automatic selection and adaptation of standard assessment protocols to suit patient personal conditions. We evaluated our method in the case study of type 2 diabetic patient follow-up assessments. The proposed framework was instantiated using real data from 115,477 follow-up assessment records of 36,162 type 2 diabetic patients. Standard evaluation criteria were automatically selected and adapted to the particularities of each patient. Assessment results were generated as a general typing of patient overall condition and detailed scoring for each criterion, providing important indicators to the case manager about possible inappropriate judgments, in addition to raising patient awareness of their disease control outcomes. Using historical data as the gold standard, our system achieved an accuracy of 99.93% and a completeness of 95.00%. This study contributes to improving the accessibility, efficiency and quality of current patient follow-up services. It also provides a generic approach to knowledge sharing and reuse for patient-centered chronic disease management. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. My Corporis Fabrica Embryo: An ontology-based 3D spatio-temporal modeling of human embryo development.

    Science.gov (United States)

    Rabattu, Pierre-Yves; Massé, Benoit; Ulliana, Federico; Rousset, Marie-Christine; Rohmer, Damien; Léon, Jean-Claude; Palombi, Olivier

    2015-01-01

    Embryology is a complex morphologic discipline involving a set of entangled mechanisms that are sometimes difficult to understand and to visualize. Recent computer-based techniques, ranging from geometrical to physically based modeling, are used to assist the visualization and simulation of virtual humans in numerous domains such as surgical simulation and learning. On the other hand, the ontology-based approach to knowledge representation is more and more successfully adopted in the life-science domains to formalize biological entities and phenomena, thanks to a declarative approach for expressing and reasoning over symbolic information. 3D models and ontologies are two complementary ways to describe biological entities that remain largely separated. Indeed, while many ontologies providing a unified formalization of anatomy and embryology exist, they remain only descriptive and make the access to anatomical content of complex 3D embryology models and simulations difficult. In this work, we present a novel ontology describing human embryo development through deforming 3D models. Beyond describing how organs and structures are composed, our ontology integrates a procedural description of their 3D representations, temporal deformations, and relations with respect to their development. We also created inference rules to express complex connections between entities. The result is a unified description of both the knowledge of organ deformation and the organs' 3D representations, enabling dynamic visualization of embryo deformation during the Carnegie stages. Through a simplified ontology, containing representative entities that are linked to spatial position and temporal process information, we illustrate the added value of such a declarative approach for interactive simulation and visualization of 3D embryos. Combining ontologies and 3D models enables a declarative description of different embryological models that capture the complexity of human

  2. Scaling Out and Evaluation of OBSecAn, an Automated Section Annotator for Semi-Structured Clinical Documents, on a Large VA Clinical Corpus.

    Science.gov (United States)

    Tran, Le-Thuy T; Divita, Guy; Redd, Andrew; Carter, Marjorie E; Samore, Matthew; Gundlapalli, Adi V

    2015-01-01

    "Identifying and labeling" (annotating) sections improves the effectiveness of extracting information stored in the free text of clinical documents. OBSecAn, an automated ontology-based section annotator, was developed to identify and label sections of semi-structured clinical documents from the Department of Veterans Affairs (VA). In the first step, the algorithm reads and parses the document to obtain and store information regarding sections in a structure that supports the hierarchy of sections. The second stage detects and makes corrections to errors in the parsed structure. The third stage produces the section annotation output using the final parsed tree. In this study, we present the OBSecAn method, scale it to a million-document corpus, and evaluate its performance in identifying family history sections. We identify high-yield sections for this use case from note titles such as primary care and demonstrate a median rate of 99% in correctly identifying a family history section.
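The first stage described above, parsing a semi-structured note into labeled sections, can be sketched by recognizing header lines and grouping the text under them. The header pattern and the note text are invented examples; OBSecAn's actual parser and error-correction stages are far richer.

```python
import re

# A toy semi-structured clinical note (invented content).
note = """CHIEF COMPLAINT: chest pain
FAMILY HISTORY: father with CAD
MEDICATIONS: aspirin 81 mg daily"""

# Assumed header convention: an ALL-CAPS label followed by a colon.
header = re.compile(r"^([A-Z][A-Z ]+):\s*(.*)$")

sections = {}
current = None
for line in note.splitlines():
    m = header.match(line)
    if m:
        current = m.group(1).strip()
        sections[current] = [m.group(2)]
    elif current:
        # Continuation lines belong to the most recent section.
        sections[current].append(line)

print("FAMILY HISTORY" in sections)  # True
```

Once sections carry labels, a downstream task such as the family-history use case can restrict extraction to the relevant section instead of scanning the whole note.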

  3. Cheating. An Annotated Bibliography.

    Science.gov (United States)

    Wildemuth, Barbara M., Comp.

    This 89-item, annotated bibliography was compiled to provide access to research and discussions of cheating and, specifically, cheating on tests. It is not limited to any educational level, nor is it confined to any specific curriculum area. Two databases were searched by computer, and a library search was conducted. A computer search of the…

  4. Collaborative Movie Annotation

    Science.gov (United States)

    Zad, Damon Daylamani; Agius, Harry

    In this paper, we focus on metadata for self-created movies like those found on YouTube and Google Video, the durations of which are increasing in line with falling upload restrictions. While simple tags may have been sufficient for most purposes for traditionally very short video footage that contains a relatively small amount of semantic content, this is not the case for movies of longer duration, which embody more intricate semantics. Creating metadata is a time-consuming process that takes a great deal of individual effort; however, this effort can be greatly reduced by harnessing the power of Web 2.0 communities to create, update and maintain it. Consequently, we consider the annotation of movies within Web 2.0 environments, such that users create and share that metadata collaboratively, and propose an architecture for collaborative movie annotation. This architecture arises from the results of an empirical experiment in which metadata creation tools, YouTube and an MPEG-7 modelling tool, were used by users to create movie metadata. The next section discusses related work in the areas of collaborative retrieval and tagging. Then, we describe the experiments that were undertaken on a sample of 50 users. Next, the results are presented, which provide some insight into how users interact with existing tools and systems for annotating movies. Based on these results, the paper then develops an architecture for collaborative movie annotation.

  5. Annotated Bibliography. First Edition.

    Science.gov (United States)

    Haring, Norris G.

    An annotated bibliography which presents approximately 300 references from 1951 to 1973 on the education of severely/profoundly handicapped persons. Citations are grouped alphabetically by author's name within the following categories: characteristics and treatment, gross motor development, sensory and motor development, physical therapy for the…

  6. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...

  7. Annotated bibliography traceability

    NARCIS (Netherlands)

    Narain, G.

    2006-01-01

    This annotated bibliography contains summaries of articles and chapters of books, which are relevant to traceability. After each summary there is a part about the relevancy of the paper for the LEI project. The aim of the LEI-project is to gain insight in several aspects of traceability in order to

  8. Annotation of Ehux ESTs

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-06-12

    22 percent of ESTs do not align with scaffolds. The EST pipeline assembles 17,126 consensi from the non-aligned ESTs. The annotation pipeline predicts 8,564 ORFs on the consensi. Domain analysis of the ORFs reveals missing genes. Cluster analysis reveals missing genes. Expression analysis reveals potentially strain-specific genes.

  9. Annotation: The Savant Syndrome

    Science.gov (United States)

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  10. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  11. Ontology-based Semantic Relevance Discovery in Agricultural Data

    Institute of Scientific and Technical Information of China (English)

    徐晓文; 陈维斌; 李海波

    2012-01-01

    This paper proposes an ontology-based semantic relevance discovery model. By parsing a constructed agricultural domain ontology, the model computes the correlation between concepts from the depth and breadth of their semantic paths in the ontology, and uses the results to enrich a semantic knowledge base. Compared with traditional methods, experiments on agricultural data show that the model's results correspond better with domain relevance. Based on this model, a semantic retrieval system for the tea domain was designed; its implementation verifies the feasibility and practicability of the proposed relevance discovery model.
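A depth-based relatedness computation of the kind described above can be illustrated with the standard Wu-Palmer measure, used here only as a stand-in for the paper's own depth-and-breadth formula; the tiny tea hierarchy is invented.

```python
# Toy agricultural is-a hierarchy: concept -> parent.
parent = {
    "green tea": "tea",
    "black tea": "tea",
    "tea": "crop",
    "crop": "thing",
}

def path_to_root(c):
    """The concept followed by its chain of ancestors."""
    path = [c]
    while c in parent:
        c = parent[c]
        path.append(c)
    return path

def depth(c):
    return len(path_to_root(c))

def wu_palmer(c1, c2):
    """Relatedness from the depth of the lowest common subsumer."""
    a1, a2 = path_to_root(c1), path_to_root(c2)
    lcs = next(a for a in a1 if a in a2)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))

print(wu_palmer("green tea", "black tea"))  # 0.75: share the deep "tea" subsumer
```

Concepts that meet deep in the hierarchy score high, while concepts that meet only near the root score low, which is the intuition behind ranking retrieval results by ontology path.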

  12. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...... like “London” (Location/Organization) or “cup” (Container/Content). The goal of this dissertation is to assess whether metonymic sense underspecification justifies incorporating a third sense into our sense inventories, thereby treating the underspecified sense as independent from the literal...

  13. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotators. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty

  14. Phylogenetic molecular function annotation

    Science.gov (United States)

    Engelhardt, Barbara E.; Jordan, Michael I.; Repo, Susanna T.; Brenner, Steven E.

    2009-07-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called "phylogenomics") is an effective means to predict protein molecular function. These methods incorporate functional evidence from all functionally characterized members of a family, using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.
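The core intuition of phylogenomic annotation, weighting functional evidence from characterized relatives by their evolutionary distance to the query protein, can be caricatured in a few lines. This is an invented toy decay scheme, not SIFTER's statistical graphical model; the proteins, functions, and distances are hypothetical.

```python
import math

# (characterized protein, its known function, branch distance to the query)
evidence = [
    ("protA", "kinase", 0.2),
    ("protB", "kinase", 0.5),
    ("protC", "phosphatase", 1.5),
]

def function_scores(evidence, decay=1.0):
    """Accumulate distance-discounted evidence per function,
    then normalize to a probability-like distribution."""
    scores = {}
    for _, fn, dist in evidence:
        scores[fn] = scores.get(fn, 0.0) + math.exp(-decay * dist)
    total = sum(scores.values())
    return {fn: s / total for fn, s in scores.items()}

scores = function_scores(evidence)
print(max(scores, key=scores.get))  # kinase
```

Close relatives dominate the score, so a distant homolog with a different function does not override nearby, consistent evidence, which is the failure mode of naive best-hit homology transfer that phylogenetic methods address.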

  15. An Ontology-based Time Semantic Specification and Verification Approach for Web Service

    Institute of Scientific and Technical Information of China (English)

    刘如娟; 陈俊杰; 王立军; 谢红薇

    2009-01-01

    Non-functional property specification and verification in Web service (WS) flows has become indispensable. To address this, an ontology-based approach is presented for specifying and verifying the consistency of time constraints in WS flows. Time OWL-S was built to comprehensively express the basic time information of a WS flow. Meanwhile, the Petri Nets Ontology was enriched with time semantic information so that it can describe WS semantics based on OWL. The mapping between Time OWL-S and the extended Petri Nets Ontology is also defined. Through an extension of the OWL-S API, the annotated OWL-S model is transformed into PNML, a standard interchange format for Petri nets. A verification system illustrates the correctness and feasibility of the verification process.
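The flavor of the time-consistency check such a verification performs can be sketched very simply: each service step carries a [min, max] duration, sequential composition sums the bounds, and the composed interval is checked against a flow deadline. The steps and deadline are invented, and real time-constraint Petri net verification handles far richer structure (choice, concurrency, firing intervals).

```python
# (step name, minimum duration, maximum duration) -- hypothetical flow.
steps = [("lookup", 1, 3), ("book", 2, 5), ("pay", 1, 2)]
deadline = 12  # assumed overall time constraint on the flow

# Sequential composition: bounds add up.
lo = sum(a for _, a, _ in steps)
hi = sum(b for _, _, b in steps)

print((lo, hi), "meets deadline:", hi <= deadline)
```

If the worst-case bound `hi` exceeded the deadline, the flow's time constraints would be inconsistent, which is the kind of property the mapped Petri net model is checked for.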

  16. Annotation in Digital Scholarly Editions

    NARCIS (Netherlands)

    Boot, P.; Haentjens Dekker, R.

    2016-01-01

    Annotation in digital scholarly editions (of historical documents, literary works, letters, etc.) has long been recognized as an important desideratum, but has also proven to be an elusive ideal. In so far as annotation functionality is available, it is usually developed for a single edition and

  17. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material

  19. Collective dynamics of social annotation

    CERN Document Server

    Cattuto, Ciro; Baldassarri, Andrea; Schehr, G; Loreto, Vittorio

    2009-01-01

    The enormous increase of popularity and use of the WWW has led in recent years to important changes in the ways people communicate. An interesting example of this fact is provided by the now very popular social annotation systems, through which users annotate resources (such as web pages or digital photographs) with text keywords dubbed tags. Understanding the rich emerging structures resulting from the uncoordinated actions of users calls for an interdisciplinary effort. In particular, concepts borrowed from statistical physics, such as random walks, and the complex networks framework can effectively contribute to the mathematical modeling of social annotation systems. Here we show that the process of social annotation can be seen as a collective but uncoordinated exploration of an underlying semantic space, pictured as a graph, through a series of random walks. This modeling framework reproduces several aspects, so far unexplained, of social annotation, among which the peculiar growth of the size of the...

  20. GIFtS: annotation landscape analysis with GeneCards

    Directory of Open Access Journals (Sweden)

    Dalah Irina

    2009-10-01

    Full Text Available Abstract Background Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more. Results We present the GeneCards Inferred Functionality Score (GIFtS), which allows a quantitative assessment of a gene's annotation status by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes.
The degree of accumulated knowledge for a
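The weighting idea behind such an annotation score can be illustrated with a toy calculation. The category weights and scoring rule below are hypothetical, not the GeneCards algorithm:

```python
# Illustrative sketch of a GIFtS-style score: a weighted fraction of
# annotation categories that a gene has, scaled to 0-100. The category
# list and weights are invented for illustration.

WEIGHTS = {          # hypothetical category weights
    "go_terms": 3.0,
    "pathways": 2.0,
    "interactions": 2.0,
    "phenotypes": 2.0,
    "publications": 1.0,
}

def gifts_score(annotations):
    """annotations: dict category -> count of items for this gene."""
    total = sum(WEIGHTS.values())
    have = sum(w for cat, w in WEIGHTS.items() if annotations.get(cat, 0) > 0)
    return round(100.0 * have / total, 1)

rich = gifts_score({"go_terms": 40, "pathways": 5, "interactions": 12,
                    "phenotypes": 3, "publications": 200})
poor = gifts_score({"publications": 1})
```

A well-annotated gene scores at the top of the scale, a sparsely annotated one near the bottom, which is the bimodal contrast the abstract describes.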

  1. A Framework for Ontology-Based Service Discovery and Composition

    Institute of Scientific and Technical Information of China (English)

    李伟平; 褚伟杰; 高福亮; 刘利; 童缙

    2009-01-01

    Currently a large number of web services, as well as other kinds of services such as EJBs, COM components, and even Java classes, are publicly available. Facilitating SOA-based system development by leveraging these kinds of services has become a challenge. A framework for a service repository, ontology-based service discovery, and service composition is put forward. The service repository can maintain web services, EJBs, and Java classes, with functions such as service registration, publishing, discovery, matching, versioning, and monitoring. The details of service description are analyzed, and a domain ontology for procurement, selling, and inventory is given. Based on the domain ontology and the service repository, a semantically enhanced service composition algorithm is discussed.

  2. Dictionary and Gene Ontology Based Similarity for Named Entity Relationship Protein-protein Interaction Prediction from Biotext Corpus

    Directory of Open Access Journals (Sweden)

    Smt K. Prabavathy

    2014-12-01

    Full Text Available Protein-protein interactions (PPIs) play a significant role in many biological systems: they are involved in complex formation and in many pathways that carry out biological processes. Accurate identification of the set of interacting proteins can shed new light on the functional role of various proteins in the complex environment of the cell. The ability to construct biologically meaningful gene networks, and to identify the exact relationships within them, is critical for present-day systems biology. Earlier research studied the power of gene modules to shed light on the functioning of complex biological systems, but most modules in those networks showed little link to meaningful biological function, because the methods did not accurately calculate the semantic relationships between entities. To overcome these problems and improve PPI prediction from biotext corpora, a new method is proposed: Dictionary-Based Text and Gene Ontology (DBTGO), which directly incorporates Gene Ontology (GO) annotation into the construction of gene modules and uses dictionary-based text mining to extract information from biomedical text. DBTGO integrates gene-gene pairwise similarity values with protein-protein interaction relationships obtained from gene expression, in order to improve biotext information retrieval. The analysis was carried out on the Biotext Project corpus at UC Berkeley. Testing the DBTGO algorithm shows that it improves PPI relationship identification over previously suggested methods in terms of precision, recall, F-measure and Normalized Discounted Cumulative Gain (NDCG), and it can facilitate comprehensive and in-depth analysis of high-throughput experimental data at the gene network level.
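    The kind of ontology-based semantic similarity this approach relies on can be sketched minimally as Jaccard overlap of GO ancestor sets. The GO fragment below is a toy, and the paper's actual similarity measures are not reproduced here:

    ```python
    # Minimal sketch of GO-based semantic similarity: two terms are
    # similar to the extent that their ancestor sets in the GO DAG
    # overlap. The term names and DAG fragment are invented.

    GO_PARENTS = {                       # toy fragment of the GO DAG
        "GO:kinase": ["GO:catalytic"],
        "GO:phosphatase": ["GO:catalytic"],
        "GO:catalytic": ["GO:molecular_function"],
        "GO:molecular_function": [],
    }

    def ancestors(term):
        """Return the term plus all of its transitive parents."""
        seen, stack = set(), [term]
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(GO_PARENTS.get(t, []))
        return seen

    def go_similarity(term_a, term_b):
        """Jaccard overlap of the two ancestor sets."""
        a, b = ancestors(term_a), ancestors(term_b)
        return len(a & b) / len(a | b)

    sim = go_similarity("GO:kinase", "GO:phosphatase")
    ```

    Sibling terms share their upper ancestry and score 0.5 here; unrelated terms would share only the root and score lower.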

  3. Gene Ontology annotations and resources.

    Science.gov (United States)

    Blake, J A; Dolan, M; Drabkin, H; Hill, D P; Li, Ni; Sitnikov, D; Bridges, S; Burgess, S; Buza, T; McCarthy, F; Peddinti, D; Pillai, L; Carbon, S; Dietze, H; Ireland, A; Lewis, S E; Mungall, C J; Gaudet, P; Chrisholm, R L; Fey, P; Kibbe, W A; Basu, S; Siegele, D A; McIntosh, B K; Renfro, D P; Zweifel, A E; Hu, J C; Brown, N H; Tweedie, S; Alam-Faruque, Y; Apweiler, R; Auchinchloss, A; Axelsen, K; Bely, B; Blatter, M -C; Bonilla, C; Bouguerleret, L; Boutet, E; Breuza, L; Bridge, A; Chan, W M; Chavali, G; Coudert, E; Dimmer, E; Estreicher, A; Famiglietti, L; Feuermann, M; Gos, A; Gruaz-Gumowski, N; Hieta, R; Hinz, C; Hulo, C; Huntley, R; James, J; Jungo, F; Keller, G; Laiho, K; Legge, D; Lemercier, P; Lieberherr, D; Magrane, M; Martin, M J; Masson, P; Mutowo-Muellenet, P; O'Donovan, C; Pedruzzi, I; Pichler, K; Poggioli, D; Porras Millán, P; Poux, S; Rivoire, C; Roechert, B; Sawford, T; Schneider, M; Stutz, A; Sundaram, S; Tognolli, M; Xenarios, I; Foulgar, R; Lomax, J; Roncaglia, P; Khodiyar, V K; Lovering, R C; Talmud, P J; Chibucos, M; Giglio, M Gwinn; Chang, H -Y; Hunter, S; McAnulla, C; Mitchell, A; Sangrador, A; Stephan, R; Harris, M A; Oliver, S G; Rutherford, K; Wood, V; Bahler, J; Lock, A; Kersey, P J; McDowall, D M; Staines, D M; Dwinell, M; Shimoyama, M; Laulederkind, S; Hayman, T; Wang, S -J; Petri, V; Lowry, T; D'Eustachio, P; Matthews, L; Balakrishnan, R; Binkley, G; Cherry, J M; Costanzo, M C; Dwight, S S; Engel, S R; Fisk, D G; Hitz, B C; Hong, E L; Karra, K; Miyasato, S R; Nash, R S; Park, J; Skrzypek, M S; Weng, S; Wong, E D; Berardini, T Z; Huala, E; Mi, H; Thomas, P D; Chan, J; Kishore, R; Sternberg, P; Van Auken, K; Howe, D; Westerfield, M

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself have increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.

  4. Sentiment Analysis of Document Based on Annotation

    CERN Document Server

    Shukla, Archana

    2011-01-01

    I present a tool which assesses the quality or usefulness of a document based on its annotations. Annotations may include comments, notes, observations, highlights, underlines, explanations, questions, help requests, etc. Comments are used for evaluative purposes, while the others are used for summarization or expansion. Further, these comments may themselves be on another annotation; such annotations are referred to as meta-annotations. Not all annotations get equal weight. My tool considers highlights and underlines as well as comments to infer the collective sentiment of annotators, classified as positive, negative or objective. The tool computes the collective sentiment of annotations in two ways: it counts all the annotations present on the document, and it also computes the sentiment scores of all annotations, including comments, to obtain the collective sentiment about the document and judge its quality. I demonstrate the use of the tool on a research paper.
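    The two aggregation modes described, counting annotations and summing per-annotation sentiment scores, can be sketched as follows; the scoring scale and labels are illustrative assumptions, not the tool's implementation:

    ```python
    # Sketch of collective-sentiment aggregation over annotations:
    # count the annotations, sum their (hypothetical) sentiment scores,
    # and map the total to a positive/negative/objective label.

    def collective_sentiment(annotations):
        """annotations: list of (kind, score) pairs, score in [-1, 1]."""
        count = len(annotations)
        total = sum(score for _, score in annotations)
        if total > 0:
            label = "positive"
        elif total < 0:
            label = "negative"
        else:
            label = "objective"
        return count, label

    count, label = collective_sentiment([
        ("comment", 0.8), ("highlight", 0.2), ("underline", 0.0),
        ("comment", -0.3),
    ])
    ```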

  5. Publication Production: An Annotated Bibliography.

    Science.gov (United States)

    Firman, Anthony H.

    1994-01-01

    Offers brief annotations of 52 articles and papers on document production (from the Society for Technical Communication's journal and proceedings) on 9 topics: information processing, document design, using color, typography, tables, illustrations, photography, printing and binding, and production management. (SR)

  6. NCBI prokaryotic genome annotation pipeline.

    Science.gov (United States)

    Tatusova, Tatiana; DiCuccio, Michael; Badretdin, Azat; Chetvernin, Vyacheslav; Nawrocki, Eric P; Zaslavsky, Leonid; Lomsadze, Alexandre; Pruitt, Kim D; Borodovsky, Mark; Ostell, James

    2016-08-19

    Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of the structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment-based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, NCBI's new Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/.

  7. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so they are inevitably trapped in suboptimal performance on these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and the Hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems, based on which a variety of loss functions with respect to objective-guided measures are defined. Then, we formulate these loss functions as relaxed surrogate functions and optimize them with structural SVMs. Given the analysis of the various measures and the high time complexity of optimizing micro-averaging measures, in this paper we focus on example-based measures, which are tailor-made for image annotation tasks but seldom explored in the literature.
Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
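The example-based measures the paper focuses on are computed per example and then averaged. A minimal sketch of example-based F1 (a standard formulation, not the paper's optimized surrogate losses):

```python
# Example-based F1 for multi-label annotation: compute precision,
# recall and F1 per example over label sets, then average across
# examples. Label names are toy values.

def example_based_f1(truths, predictions):
    scores = []
    for t, p in zip(truths, predictions):
        t, p = set(t), set(p)
        if not t and not p:          # both empty: perfect agreement
            scores.append(1.0)
            continue
        overlap = len(t & p)
        prec = overlap / len(p) if p else 0.0
        rec = overlap / len(t) if t else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / len(scores)

f1 = example_based_f1(
    truths=[["sky", "beach"], ["dog"]],
    predictions=[["sky", "sea"], ["dog", "cat"]],
)
```

Unlike micro-averaging, every image contributes equally regardless of how many labels it carries, which is why these measures suit per-image annotation quality.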

  8. Research on Construction Method of Bilingual Domain Ontology Based on OWL

    Institute of Scientific and Technical Information of China (English)

    刘言; 林民

    2014-01-01

    As demands for multilingual semantic queries on the World Wide Web increase, research on multilingual ontologies has gradually become a hot spot, but studies of multilingual ontologies for professional domains remain relatively rare. Based on an analysis and comparison of existing construction methods for multilingual ontologies, this paper presents two OWL-based methods for constructing a Chinese-English bilingual domain ontology oriented toward knowledge of the computer software engineering domain, and further studies how to calculate the similarity between concepts under these two methods. Using the two methods, an experimental bilingual domain ontology was built from UML knowledge in software engineering. The approach remedies the inability of earlier methods to implement concept mapping effectively, and performs well in multilingual concept mapping.

  9. Collective dynamics of social annotation.

    Science.gov (United States)

    Cattuto, Ciro; Barrat, Alain; Baldassarri, Andrea; Schehr, Gregory; Loreto, Vittorio

    2009-06-30

    The enormous increase of popularity and use of the worldwide web has led in recent years to important changes in the ways people communicate. An interesting example of this fact is provided by the now very popular social annotation systems, through which users annotate resources (such as web pages or digital photographs) with keywords known as "tags." Understanding the rich emergent structures resulting from the uncoordinated actions of users calls for an interdisciplinary effort. In particular, concepts borrowed from statistical physics, such as random walks (RWs), and complex networks theory can effectively contribute to the mathematical modeling of social annotation systems. Here, we show that the process of social annotation can be seen as a collective but uncoordinated exploration of an underlying semantic space, pictured as a graph, through a series of RWs. This modeling framework reproduces several aspects, thus far unexplained, of social annotation, among which are the peculiar growth of the size of the vocabulary used by the community and its complex network structure that represents an externalization of semantic structures grounded in cognition and that are typically hard to access.
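    The random-walk picture can be illustrated with a toy simulation: taggers wander a small semantic graph and the set of distinct tags (the vocabulary) grows monotonically but ever more slowly. The ring graph and walk parameters are arbitrary choices for illustration:

    ```python
    # Toy simulation of social annotation as random walks on a semantic
    # graph: each annotation act is a short walk, and the vocabulary is
    # the set of nodes visited so far. Graph and parameters are invented.

    import random

    def simulate_vocabulary(graph, n_walks, walk_len, seed=0):
        """Return vocabulary size after each of n_walks walks."""
        rng = random.Random(seed)
        nodes = sorted(graph)
        vocab, growth = set(), []
        for _ in range(n_walks):
            node = rng.choice(nodes)          # new tagger starts anywhere
            for _ in range(walk_len):
                vocab.add(node)               # visited concept becomes a tag
                node = rng.choice(graph[node])
            growth.append(len(vocab))
        return growth

    # Ring of 50 "concepts", each linked to its two neighbours.
    ring = {i: [(i - 1) % 50, (i + 1) % 50] for i in range(50)}
    growth = simulate_vocabulary(ring, n_walks=30, walk_len=5)
    ```

    The growth curve is non-decreasing and saturates as the walkers revisit known concepts, the qualitative sublinear behavior the paper models analytically.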

  10. Ontology-based Related Knowledge Visualization Retrieval Model

    Institute of Scientific and Technical Information of China (English)

    扛潇俊; 李善平; 刘思屹

    2011-01-01

    As a formal description of a shared conceptual system, an ontology can solve the problem of exploiting massive amounts of knowledge in knowledge retrieval. Building on a full study of current research, this paper proposes an ontology-based visual retrieval model for related knowledge. From a practical point of view, the model focuses on the relationships between knowledge sources and on the user experience of knowledge retrieval, improves traditional ontology construction and maintenance methods, and proposes a new knowledge retrieval method. Application results show that the model can substantially improve the efficiency and quality of knowledge retrieval for users.

  11. Handling Real-World Context Awareness, Uncertainty and Vagueness in Real-Time Human Activity Tracking and Recognition with a Fuzzy Ontology-Based Hybrid Method

    Directory of Open Access Journals (Sweden)

    Natalia Díaz-Rodríguez

    2014-09-01

    Full Text Available Human activity recognition is a key task in ambient intelligence applications to achieve proper ambient assisted living. There has been remarkable progress in this domain, but some challenges still remain to obtain robust methods. Our goal in this work is to provide a system that allows the modeling and recognition of a set of complex activities in real life scenarios involving interaction with the environment. The proposed framework is a hybrid model that comprises two main modules: a low-level sub-activity recognizer, based on data-driven methods, and a high-level activity recognizer, implemented with a fuzzy ontology to include the semantic interpretation of actions performed by users. The fuzzy ontology is fed by the sub-activities recognized by the low-level data-driven component and provides fuzzy ontological reasoning to recognize both the activities and their influence in the environment with semantics. An additional benefit of the approach is the ability to handle vagueness and uncertainty in the knowledge-based module, which substantially outperforms the treatment of incomplete and/or imprecise data with respect to classic crisp ontologies. We validate these advantages with the public CAD-120 dataset (Cornell Activity Dataset), achieving an accuracy of 90.1% and 91.07% for low-level and high-level activities, respectively. This entails an improvement over fully data-driven or ontology-based approaches.

  12. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML, the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required, and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Bioinformatics for plant genome annotation

    NARCIS (Netherlands)

    Fiers, M.W.E.J.

    2006-01-01

    Large amounts of genome sequence data are available and much more will become available in the near future. A DNA sequence alone has, however, limited use. Genome annotation is required to assign biological interpretation to the DNA sequence. This thesis describes

  14. Child Development: An Annotated Bibliography.

    Science.gov (United States)

    Dickerson, LaVerne Thornton, Comp.

    This annotated bibliography focuses on recent publications dealing with factors that influence child growth and development, rather than the developmental processes themselves. Topics include: general sources on child development; physical and perceptual-motor development; cognitive development; social and personality development; and play.…

  15. Teacher Evaluation: An Annotated Bibliography.

    Science.gov (United States)

    McKenna, Bernard H.; And Others

    In his introduction to the 86-item annotated bibliography by Mueller and Poliakoff, McKenna discusses his views on teacher evaluation and his impressions of the documents cited. He observes, in part, that the current concern is with the process of evaluation and that most researchers continue to believe that student achievement is the most…

  16. Meaningful Assessment: An Annotated Bibliography.

    Science.gov (United States)

    Thrond, Mary A.

    The annotated bibliography contains citations of nine references on alternative student assessment methods in second language programs, particularly at the secondary school level. The references include a critique of conventional reading comprehension assessment, a discussion of performance assessment, a proposal for a multi-trait, multi-method…

  17. Annotated Bibliography on Humanistic Education

    Science.gov (United States)

    Ganung, Cynthia

    1975-01-01

    Part I of this annotated bibliography deals with books and articles on such topics as achievement motivation, process education, transactional analysis, discipline without punishment, role-playing, interpersonal skills, self-acceptance, moral education, self-awareness, values clarification, and non-verbal communication. Part II focuses on…

  18. Nikos Kazantzakis: An Annotated Bibliography.

    Science.gov (United States)

    Qiu, Kui

    This research paper consists of an annotated bibliography about Nikos Kazantzakis, one of the major modern Greek writers and author of "The Last Temptation of Christ,""Zorba the Greek," and many other works. Because of Kazantzakis' position in world literature there are many critical works about him; however, bibliographical control of these works…

  19. The surplus value of semantic annotations

    NARCIS (Netherlands)

    Marx, M.

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,

  20. Annotating BI Visualization Dashboards: Needs and Challenges

    OpenAIRE

    Elias, Micheline; Bezerianos, Anastasia

    2012-01-01

    International audience; Annotations have been identified as an important aid in analysis record-keeping and, recently, in data discovery. In this paper we discuss the use of annotations on visualization dashboards, with a special focus on business intelligence (BI) analysis. In-depth interviews with experts led to new annotation needs for multi-chart visualization systems, on which we based the design of a dashboard prototype that supports data- and context-aware annotations. We focus particularly ...

  1. Are clickthrough data reliable as image annotations?

    NARCIS (Netherlands)

    Tsikrika, T.; Diou, C.; Vries, A.P. de; Delopoulos, A.

    2009-01-01

    We examine the reliability of clickthrough data as concept-based image annotations, by comparing them against manual annotations, for different concept categories. Our analysis shows that, for many concepts, the image annotations generated by using clickthrough data are reliable, with up to 90% of t

  2. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results

  3. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene References into Functions (GeneRIFs), in which each functional description in GeneRIFs is annotated by the text mining tool Open Biomedical Annotator (OBA), and each Entrez gene is mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the human gene records of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of the human genome. The high consistency of the term frequency of individual genes (Pearson correlation = 0.6401, p = 2.2e-16) and the gene frequency of individual terms (Pearson correlation = 0.1298, p = 3.686e-14) between GeneRIFs and GOA shows that our annotation resource is reliable.
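    The consistency check reported above reduces to a Pearson correlation between per-gene term counts in two annotation resources. A self-contained sketch with toy counts (not the GeneRIF/GOA data):

    ```python
    # Pearson correlation between term counts for the same genes in two
    # hypothetical annotation resources; the counts below are invented.

    import math

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    # Term counts for the same five genes in two toy resources.
    generif_counts = [12, 3, 45, 7, 20]
    goa_counts = [10, 5, 40, 9, 18]
    r = pearson(generif_counts, goa_counts)
    ```

    When the two resources annotate genes at similar depth, r approaches 1; the paper reports such correlations together with p-values to argue the resources agree.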

  4. Multi-annotation discursive de corpus écrit

    OpenAIRE

    Péry-Woodley, Marie-Paule

    2011-01-01

    National audience; On the basis of the experience acquired in the course of the ANNODIS project, the following questions are discussed: - what is the annotation campaign for? building an annotated " reference corpus" vs. annotation as an experiment; - defining annotation tasks. Naïve vs. expert annotation; - the annotation manual : from linguistic model to annotation protocol; - automatic pre-processing vs. manual annotation. Segmentation, tagging and mark-ups: steps in corpus preparation; - ...

  5. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were
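    The core scanning step of dictionary-driven annotation can be sketched as follows; the patterns are toy examples, and the Bio-Dictionary's weighted, position-specific scoring is not reproduced:

    ```python
    # Sketch of dictionary-driven annotation: scan a query sequence
    # against a pattern collection and record, per residue position,
    # which patterns cover it. Patterns and the query are toy values.

    def annotate(query, patterns):
        """Return, for each position of query, the set of matching patterns."""
        coverage = [set() for _ in query]
        for pat in patterns:
            start = query.find(pat)
            while start != -1:
                for i in range(start, start + len(pat)):
                    coverage[i].add(pat)
                start = query.find(pat, start + 1)   # allow overlaps
        return coverage

    patterns = ["GKS", "KSG", "AAA"]          # toy amino acid patterns
    coverage = annotate("MGKSGT", patterns)
    ```

    A real system would score each covered position with pattern-specific weights rather than a simple membership set, but the single-pass, per-position structure is the same.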

  6. The Human Phenotype Ontology in 2017

    Science.gov (United States)

    Köhler, Sebastian; Vasilevsky, Nicole A.; Engelstad, Mark; Foster, Erin; McMurry, Julie; Aymé, Ségolène; Baynam, Gareth; Bello, Susan M.; Boerkoel, Cornelius F.; Boycott, Kym M.; Brudno, Michael; Buske, Orion J.; Chinnery, Patrick F.; Cipriani, Valentina; Connell, Laureen E.; Dawkins, Hugh J.S.; DeMare, Laura E.; Devereau, Andrew D.; de Vries, Bert B.A.; Firth, Helen V.; Freson, Kathleen; Greene, Daniel; Hamosh, Ada; Helbig, Ingo; Hum, Courtney; Jähn, Johanna A.; James, Roger; Krause, Roland; F. Laulederkind, Stanley J.; Lochmüller, Hanns; Lyon, Gholson J.; Ogishima, Soichi; Olry, Annie; Ouwehand, Willem H.; Pontikos, Nikolas; Rath, Ana; Schaefer, Franz; Scott, Richard H.; Segal, Michael; Sergouniotis, Panagiotis I.; Sever, Richard; Smith, Cynthia L.; Straub, Volker; Thompson, Rachel; Turner, Catherine; Turro, Ernest; Veltman, Marijcke W.M.; Vulliamy, Tom; Yu, Jing; von Ziegenweidt, Julie; Zankl, Andreas; Züchner, Stephan; Zemojtel, Tomasz; Jacobsen, Julius O.B.; Groza, Tudor; Smedley, Damian; Mungall, Christopher J.; Haendel, Melissa; Robinson, Peter N.

    2017-01-01

    Deep phenotyping has been defined as the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described. The three components of the Human Phenotype Ontology (HPO; www.human-phenotype-ontology.org) project are the phenotype vocabulary, disease-phenotype annotations and the algorithms that operate on these. These components are being used for computational deep phenotyping and precision medicine as well as integration of clinical data into translational research. The HPO is being increasingly adopted as a standard for phenotypic abnormalities by diverse groups such as international rare disease organizations, registries, clinical labs, biomedical resources, and clinical software tools and will thereby contribute toward nascent efforts at global data exchange for identifying disease etiologies. This update article reviews the progress of the HPO project since the debut Nucleic Acids Research database article in 2014, including specific areas of expansion such as common (complex) disease, new algorithms for phenotype driven genomic discovery and diagnostics, integration of cross-species mapping efforts with the Mammalian Phenotype Ontology, an improved quality control pipeline, and the addition of patient-friendly terminology. PMID:27899602
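    Phenotype comparisons of the kind the HPO algorithms support typically reduce to set operations over a term's ancestors in the ontology graph. The following is a minimal sketch of that idea, not the HPO codebase; the term IDs and the tiny hierarchy are invented for illustration. Two phenotype profiles are compared by the Jaccard overlap of their ancestor closures:

```python
# Toy illustration: ontology-based phenotype profile similarity.
# Term IDs and parent links below are invented, not real HPO terms.
PARENTS = {
    "HP:ABN_ORGAN": [],
    "HP:ABN_HEART": ["HP:ABN_ORGAN"],
    "HP:ABN_SEPTUM": ["HP:ABN_HEART"],
    "HP:ABN_VALVE": ["HP:ABN_HEART"],
}

def ancestors(term):
    """All ancestors of a term in the ontology DAG, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def profile_similarity(terms_a, terms_b):
    """Jaccard overlap of the ancestor closures of two phenotype profiles."""
    a = set().union(*(ancestors(t) for t in terms_a))
    b = set().union(*(ancestors(t) for t in terms_b))
    return len(a & b) / len(a | b)

sim = profile_similarity({"HP:ABN_SEPTUM"}, {"HP:ABN_VALVE"})
print(round(sim, 2))  # 0.5: the two sibling terms share most of their ancestry
```

Because similarity is computed over shared ancestors rather than exact term matches, two clinicians who annotate at different levels of granularity still produce comparable profiles.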

  7. Lynx web services for annotations and systems analysis of multi-gene disorders.

    Science.gov (United States)

    Sulakhe, Dinanath; Taylor, Andrew; Balasubramanian, Sandhya; Feng, Bo; Xie, Bingqing; Börnigen, Daniela; Dave, Utpal J; Foster, Ian T; Gilliam, T Conrad; Maltsev, Natalia

    2014-07-01

    Lynx is a web-based integrated systems biology platform that supports annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Lynx has integrated multiple classes of biomedical data (genomic, proteomic, pathways, phenotypic, toxicogenomic, contextual and others) from various public databases as well as manually curated data from our group and collaborators (LynxKB). Lynx provides tools for gene list enrichment analysis using multiple functional annotations and network-based gene prioritization. Lynx provides access to the integrated database and the analytical tools via REST based Web Services (http://lynx.ci.uchicago.edu/webservices.html). This comprises data retrieval services for specific functional annotations, services to search across the complete LynxKB (powered by Lucene), and services to access the analytical tools built within the Lynx platform.

  8. Automatic annotation of organellar genomes with DOGMA

    Energy Technology Data Exchange (ETDEWEB)

    Wyman, Stacia; Jansen, Robert K.; Boore, Jeffrey L.

    2004-06-01

    Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of extra-nuclear organellar (chloroplast and animal mitochondrial) genomes. It is a web-based package that allows the use of comparative BLAST searches to identify and annotate genes in a genome. DOGMA presents a list of putative genes to the user in a graphical format for viewing and editing. Annotations are stored on our password-protected server. Complete annotations can be extracted for direct submission to GenBank. Furthermore, intergenic regions of specified length can be extracted, as well as the nucleotide sequences and amino acid sequences of the genes.

  9. Ontology-based Information Retrieval

    DEFF Research Database (Denmark)

    Styltsvig, Henrik Bulskov

    In this thesis, we will present methods for introducing ontologies in information retrieval. The main hypothesis is that the inclusion of conceptual knowledge such as ontologies in the information retrieval process can contribute to the solution of major problems currently found in information retrieval. This utilization of ontologies has a number of challenges. Our focus is on the use of similarity measures derived from the knowledge about relations between concepts in ontologies, the recognition of semantic information in texts and the mapping of this knowledge into the ontologies in use... of concept similarity in query evaluation is discussed. A semantic expansion approach that incorporates concept similarity is introduced and a generalized fuzzy set retrieval model that applies expansion during query evaluation is presented. While not commonly used in present information retrieval systems...

  10. Ontology Based Model Transformation Infrastructure

    NARCIS (Netherlands)

    Göknil, A.; Topaloglu, N.Y.

    2005-01-01

    Using MDA in ontology development has been investigated in several works recently. The mappings and transformations between the UML constructs and the OWL elements to develop ontologies are the main concern of these research projects. We propose another approach in order to achieve the collaboration

  11. Certifying cost annotations in compilers

    CERN Document Server

    Amadio, Roberto M; Régis-Gianas, Yann; Saillard, Ronan

    2010-01-01

    We discuss the problem of building a compiler which can lift in a provably correct way pieces of information on the execution cost of the object code to cost annotations on the source code. To this end, we need a clear and flexible picture of: (i) the meaning of cost annotations, (ii) the method to prove them sound and precise, and (iii) the way such proofs can be composed. We propose a so-called labelling approach to these three questions. As a first step, we examine its application to a toy compiler. This formal study suggests that the labelling approach has good compositionality and scalability properties. In order to provide further evidence for this claim, we report our successful experience in implementing and testing the labelling approach on top of a prototype compiler written in OCAML for (a large fragment of) the C language.

  12. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.
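    The idea of comparing hierarchies holistically, rather than level by level, can be sketched with frame triples: for a reference frame q and two other frames i and j, ask whether both hierarchies agree on which of i or j is more deeply grouped with q. This is a simplified illustration of that triple-wise comparison, not the exact metric derived in the paper, and the segment labels are invented:

```python
from itertools import combinations

def meet_depth(hierarchy, i, j):
    """Deepest level at which frames i and j share a segment label (-1 if none).
    A hierarchy is a list of per-frame label sequences, coarse to fine."""
    depth = -1
    for level, labels in enumerate(hierarchy):
        if labels[i] == labels[j]:
            depth = level
    return depth

def triplet_agreement(h1, h2):
    """Fraction of frame triples (q, i, j) on which the two hierarchies agree
    about whether q is more deeply grouped with i than with j."""
    n = len(h1[0])
    agree = total = 0
    for q in range(n):
        for i, j in combinations(range(n), 2):
            if q in (i, j):
                continue
            d1 = meet_depth(h1, q, i) - meet_depth(h1, q, j)
            d2 = meet_depth(h2, q, i) - meet_depth(h2, q, j)
            if d1 == 0 and d2 == 0:
                continue  # neither hierarchy distinguishes the two pairs
            total += 1
            if (d1 > 0) == (d2 > 0) and (d1 < 0) == (d2 < 0):
                agree += 1
    return agree / total if total else 1.0

# Two annotators: same coarse segmentation, slightly different fine level.
annotator1 = [["A", "A", "B", "B"], ["a1", "a1", "b1", "b2"]]
annotator2 = [["A", "A", "B", "B"], ["a1", "a2", "b1", "b2"]]
print(triplet_agreement(annotator1, annotator1))  # identical hierarchies -> 1.0
```

Because the comparison only uses relative grouping depth, the two annotators need not use the same number of levels or the same label vocabulary.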

  13. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  14. Corpus Annotation for Parser Evaluation

    OpenAIRE

    CARROLL, JOHN; Minnen, Guido; Briscoe, Ted

    1999-01-01

    We describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.

  15. Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects.

    Science.gov (United States)

    Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-02-01

    Document annotation is a key task in the development of Text Mining methods and applications. High quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although the existing annotation tools offer good user interaction interfaces to domain experts, project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects, and to evaluate annotation quality throughout the project life cycle. At the core, Marky is a Web application based on the open source CakePHP framework. User interface relies on HTML5 and CSS3 technologies. Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and JQuery technologies are used to enhance user-system interaction. Marky grants solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all the annotations made are saved by the tracking system and may be further compared. So, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similar visually intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
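    Marky's agreement analysis is not specified in detail in the abstract, but inter-annotator consistency of the kind it tracks is commonly summarized with chance-corrected statistics such as Cohen's kappa. A minimal sketch, with invented label sequences standing in for two annotators' per-item decisions:

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Cohen's kappa: chance-corrected agreement between two annotators
    who each assigned one label per item."""
    assert len(ann1) == len(ann2) and ann1
    n = len(ann1)
    # Observed agreement: fraction of items labelled identically.
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    # Expected agreement under independence, from each annotator's label marginals.
    c1, c2 = Counter(ann1), Counter(ann2)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Two annotators labelling six text spans as Gene mention (G) or Phenotype (P).
k = cohens_kappa(list("GGPGPP"), list("GGPGGP"))
print(round(k, 3))  # 0.667
```

Running the statistic per annotation round, as Marky's tracking system makes possible, would show whether consistency improves as annotators amend earlier rounds.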

  16. A genetic similarity algorithm for searching the Gene Ontology terms and annotating anonymous protein sequences.

    Science.gov (United States)

    Othman, Razib M; Deris, Safaai; Illias, Rosli M

    2008-02-01

    A genetic similarity algorithm is introduced in this study to find a group of semantically similar Gene Ontology terms. The genetic similarity algorithm combines a semantic similarity measure algorithm with a parallel genetic algorithm. The semantic similarity measure algorithm is used to compute the similitude strength between the Gene Ontology terms. Then, the parallel genetic algorithm is employed to perform batch retrieval and to accelerate the search in the large search space of the Gene Ontology graph. The genetic similarity algorithm is implemented in the Gene Ontology browser named basic UTMGO to overcome the weaknesses of the existing Gene Ontology browsers which use a conventional approach based on keyword matching. To show the applicability of the basic UTMGO, we extend its structure to develop a Gene Ontology-based protein sequence annotation tool named extended UTMGO. The objective of developing the extended UTMGO is to provide a simple and practical tool that is capable of producing better results and requires a reasonable amount of running time with low computing cost, specifically for offline usage. The computational results and comparison with other related tools are presented to show the effectiveness of the proposed algorithm and tools.
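    The division of labour the abstract describes, a semantic similarity measure scored inside a genetic search, can be sketched in miniature. Everything below is a toy (invented GO-style term IDs, Jaccard overlap of shared ancestors as the similarity measure, a small elitist genetic algorithm), not the UTMGO implementation:

```python
import random

random.seed(0)  # deterministic toy run

# Invented term-to-ancestors table; real GO terms have much deeper ancestry.
ANCESTORS = {
    "GO:0001": {"root", "meta", "GO:0001"},
    "GO:0002": {"root", "meta", "GO:0002"},
    "GO:0003": {"root", "other", "GO:0003"},
    "GO:0004": {"root", "other", "GO:0004"},
    "GO:0005": {"root", "GO:0005"},
}
TERMS = sorted(ANCESTORS)

def sim(a, b):
    """Semantic similarity: Jaccard overlap of the two terms' ancestor sets."""
    sa, sb = ANCESTORS[a], ANCESTORS[b]
    return len(sa & sb) / len(sa | sb)

def fitness(group):
    """Mean pairwise similarity of a candidate group of terms."""
    pairs = [(a, b) for i, a in enumerate(group) for b in group[i + 1:]]
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

def evolve(group_size=2, pop_size=20, generations=30):
    """Elitist GA: keep the fitter half, breed mutated copies of the survivors."""
    pop = [random.sample(TERMS, group_size) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            if random.random() < 0.5:  # mutation: replace one term at random
                child[random.randrange(group_size)] = random.choice(TERMS)
            # discard mutants with duplicate terms
            children.append(child if len(set(child)) == group_size else parent[:])
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()  # converges on one of the sibling pairs sharing an ancestor
```

In UTMGO the genetic algorithm is additionally parallelised; the point of the sketch is only the fitness/search split, where the similarity measure scores candidates and the GA explores the term space.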

  17. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  18. Chromosomal phenotypes and submicroscopic abnormalities

    Directory of Open Access Journals (Sweden)

    Devriendt Koen

    2004-01-01

    Full Text Available The finding, during the last decade, that several common, clinically delineated syndromes are caused by submicroscopic deletions or, more rarely, by duplications, has provided a powerful tool in the annotation of the human genome. Since most microdeletion/microduplication syndromes are defined by a common deleted/duplicated region, abnormal dosage of genes located within these regions can explain the phenotypic similarities among individuals with a specific syndrome. As such, they provide a unique resource towards the genetic dissection of complex phenotypes such as congenital heart defects, mental and growth retardation and abnormal behaviour. In addition, the study of phenotypic differences in individuals with the same microdeletion syndrome may also become a treasury for the identification of modifying factors for complex phenotypes. The molecular analysis of these chromosomal anomalies has led to a growing understanding of their mechanisms of origin. Novel tools to uncover additional submicroscopic chromosomal anomalies at a higher resolution and higher speed, as well as the novel tools at hand for deciphering the modifying factors and epistatic interactors, are 'on the doorstep' and will, besides their obvious diagnostic role, play a pivotal role in the genetic dissection of complex phenotypes.

  19. Representation and application of ontology based emergency plan for logistics

    Institute of Scientific and Technical Information of China (English)

    冯志勇; 杨倩; 饶国政; 张晨; 李广鹏

    2011-01-01

    To enable intelligent processing of logistics emergency plans, this paper represents each logistics emergency plan as an emergency plan ontology. The relevance between concepts and an emergency plan ontology can then be computed, allowing the required emergency plan to be found. Linking emergency plan ontologies through owl:imports and Jena rules can produce a more comprehensive plan, but may violate the semantics of the original ontologies and introduce inconsistencies; constraints are defined to avoid this problem. The emergency plan ontology can be queried with SPARQL and reasoned over effectively using custom builtins and rules, which compensate for the limited expressive power of the ontology and yield the required conclusions. In this way, the representation and application of ontology-based emergency plans for logistics are implemented, and the problem of intelligently processing logistics emergency plans is solved on the basis of the expression, inference and query capabilities of ontologies.

  20. Ontology-Based Information Model for Integration of Medical Imaging Data

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    A computer-readable, unified information model is the data foundation for semantic retrieval of medical images. This paper discusses several challenges, including the lack of a unified information model for medical imaging information and the inconsistent terminology and syntax used to describe the semantic content of medical images, and develops an ontology-based scheme for integrating medical imaging information. Based on an analysis of medical imaging data sources and the relationships among them, and drawing on domain expert knowledge, a medical imaging information ontology model was developed using the "seven-step" method proposed by Stanford University; persistence of the ontology model, extraction of the original data and data integration were then realized. The information model has been used in a medical imaging semantic retrieval system.

  1. An Ontology Based Parallel Network Traffic Classification Method

    Institute of Scientific and Technical Information of China (English)

    陶晓玲; 韦毅; 王勇

    2016-01-01

    The contradiction between the processing of massive network traffic data and the computing bottleneck of a single node leads to low data classification efficiency that cannot meet practical needs. To address this challenge, we propose an ontology-based parallel network traffic classification method that combines the respective advantages of ontology and MapReduce in describing and processing massive heterogeneous data. The approach builds on MapReduce, a framework for parallel computing. It first uses an ontology to describe and manage network traffic data and constructs the network traffic ontology in a layered, parallel fashion; it then builds the classification model with a decision tree algorithm, from which an inference rule set is generated. Network traffic classification based on traffic statistical features is completed through parallel knowledge reasoning. Experimental results show that constructing the network traffic ontology with MapReduce in a cluster environment is markedly more efficient than on a single machine, and that the speedup ratio increases linearly as compute nodes are added; the parallel-reasoning classification method can effectively improve the classification efficiency of large-scale network traffic.
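    The map/reduce decomposition described above can be illustrated in miniature: map tasks classify disjoint partitions of flows against rules (hand-written here, whereas the paper derives them from a decision tree) and emit local label counts, which a reduce step merges into a global tally. The feature names and thresholds below are invented:

```python
from collections import Counter
from functools import reduce

def classify(flow):
    """Toy rule set over flow statistics; a real system would generate
    these rules from a trained decision tree."""
    if flow["avg_pkt_size"] > 1000:
        return "bulk-transfer"
    if flow["duration"] < 1.0:
        return "interactive"
    return "other"

def map_partition(flows):
    """Map phase: classify one partition of flows, emit local label counts."""
    return Counter(classify(f) for f in flows)

def reduce_counts(left, right):
    """Reduce phase: merge per-partition counts into a global tally."""
    return left + right

# Two partitions, as they might be assigned to two worker nodes.
partitions = [
    [{"avg_pkt_size": 1400, "duration": 12.0},
     {"avg_pkt_size": 90, "duration": 0.2}],
    [{"avg_pkt_size": 200, "duration": 35.0}],
]
totals = reduce(reduce_counts, map(map_partition, partitions))
print(totals)
```

In a real cluster the built-in `map` would be replaced by MapReduce map tasks running on separate nodes; the point of the sketch is that classification is embarrassingly parallel per partition and only the small count dictionaries need to be shuffled to the reducer.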

  2. Omics data management and annotation.

    Science.gov (United States)

    Harel, Arye; Dalah, Irina; Pietrokovski, Shmuel; Safran, Marilyn; Lancet, Doron

    2011-01-01

    Technological Omics breakthroughs, including next generation sequencing, bring avalanches of data which need to undergo effective data management to ensure integrity, security, and maximal knowledge-gleaning. Data management system requirements include flexible input formats, diverse data entry mechanisms and views, user friendliness, attention to standards, hardware and software platform definition, as well as robustness. Relevant solutions elaborated by the scientific community include Laboratory Information Management Systems (LIMS) and standardization protocols facilitating data sharing and managing. In project planning, special consideration has to be made when choosing relevant Omics annotation sources, since many of them overlap and require sophisticated integration heuristics. The data modeling step defines and categorizes the data into objects (e.g., genes, articles, disorders) and creates an application flow. A data storage/warehouse mechanism must be selected, such as file-based systems and relational databases, the latter typically used for larger projects. Omics project life cycle considerations must include the definition and deployment of new versions, incorporating either full or partial updates. Finally, quality assurance (QA) procedures must validate data and feature integrity, as well as system performance expectations. We illustrate these data management principles with examples from the life cycle of the GeneCards Omics project (http://www.genecards.org), a comprehensive, widely used compendium of annotative information about human genes. For example, the GeneCards infrastructure has recently been changed from text files to a relational database, enabling better organization and views of the growing data. Omics data handling benefits from the wealth of Web-based information, the vast amount of public domain software, increasingly affordable hardware, and effective use of data management and annotation principles as outlined in this chapter.

  3. Mouse phenotyping.

    Science.gov (United States)

    Fuchs, Helmut; Gailus-Durner, Valérie; Adler, Thure; Aguilar-Pimentel, Juan Antonio; Becker, Lore; Calzada-Wack, Julia; Da Silva-Buttkus, Patricia; Neff, Frauke; Götz, Alexander; Hans, Wolfgang; Hölter, Sabine M; Horsch, Marion; Kastenmüller, Gabi; Kemter, Elisabeth; Lengger, Christoph; Maier, Holger; Matloka, Mikolaj; Möller, Gabriele; Naton, Beatrix; Prehn, Cornelia; Puk, Oliver; Rácz, Ildikó; Rathkolb, Birgit; Römisch-Margl, Werner; Rozman, Jan; Wang-Sattler, Rui; Schrewe, Anja; Stöger, Claudia; Tost, Monica; Adamski, Jerzy; Aigner, Bernhard; Beckers, Johannes; Behrendt, Heidrun; Busch, Dirk H; Esposito, Irene; Graw, Jochen; Illig, Thomas; Ivandic, Boris; Klingenspor, Martin; Klopstock, Thomas; Kremmer, Elisabeth; Mempel, Martin; Neschen, Susanne; Ollert, Markus; Schulz, Holger; Suhre, Karsten; Wolf, Eckhard; Wurst, Wolfgang; Zimmer, Andreas; Hrabě de Angelis, Martin

    2011-02-01

    Model organisms like the mouse are important tools to learn more about gene function in man. Within the last 20 years many mutant mouse lines have been generated by different methods such as ENU mutagenesis, constitutive and conditional knock-out approaches, knock-down, introduction of human genes, and knock-in techniques, thus creating models which mimic human conditions. Due to pleiotropic effects, one gene may have different functions in different organ systems or time points during development. Therefore mutant mouse lines have to be phenotyped comprehensively in a highly standardized manner to enable the detection of phenotypes which might otherwise remain hidden. The German Mouse Clinic (GMC) has been established at the Helmholtz Zentrum München as a phenotyping platform with open access to the scientific community (www.mousclinic.de; [1]). The GMC is a member of the EUMODIC consortium which created the European standard workflow EMPReSSslim for the systemic phenotyping of mouse models (http://www.eumodic.org/ [2]). Copyright © 2010 Elsevier Inc. All rights reserved.

  4. Knowledge Annotation: Making Implicit Knowledge Explicit

    CERN Document Server

    Dingli, Alexiei

    2011-01-01

    Did you ever read something in a book, felt the need to comment, took up a pencil and scribbled something on the book's text? If you did, you just annotated a book. But that process has now become something fundamental and revolutionary in these days of computing. Annotation is all about adding further information to text, pictures, movies and even to physical objects. In practice, anything which can be identified either virtually or physically can be annotated. In this book, we will delve into what makes annotations, and analyse their significance for the future evolutions of the web. We wil

  5. ANNOTATION SUPPORTED OCCLUDED OBJECT TRACKING

    Directory of Open Access Journals (Sweden)

    Devinder Kumar

    2012-08-01

    Full Text Available Tracking occluded objects at different depths has become an extremely important component of study for any video sequence, with wide applications in object tracking, scene recognition, coding, video editing and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth, further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied to 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60cm, 80cm and 100cm respectively. Another experiment is performed on tracking humans at similar depths to authenticate the results. The paper also computes the frame-by-frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and further correct the error, leading to flawless tracking. This can be of great interest to computer scientists designing surveillance systems etc.

  6. Ontology-based image navigation: exploring 3.0-T MR neurography of the brachial plexus using AIM and RadLex.

    Science.gov (United States)

    Wang, Kenneth C; Salunkhe, Aditya R; Morrison, James J; Lee, Pearlene P; Mejino, José L V; Detwiler, Landon T; Brinkley, James F; Siegel, Eliot L; Rubin, Daniel L; Carrino, John A

    2015-01-01

    Disorders of the peripheral nervous system have traditionally been evaluated using clinical history, physical examination, and electrodiagnostic testing. In selected cases, imaging modalities such as magnetic resonance (MR) neurography may help further localize or characterize abnormalities associated with peripheral neuropathies, and the clinical importance of such techniques is increasing. However, MR image interpretation with respect to peripheral nerve anatomy and disease often presents a diagnostic challenge because the relevant knowledge base remains relatively specialized. Using the radiology knowledge resource RadLex®, a series of RadLex queries, the Annotation and Image Markup standard for image annotation, and a Web services-based software architecture, the authors developed an application that allows ontology-assisted image navigation. The application provides an image browsing interface, allowing users to visually inspect the imaging appearance of anatomic structures. By interacting directly with the images, users can access additional structure-related information that is derived from RadLex (eg, muscle innervation, muscle attachment sites). These data also serve as conceptual links to navigate from one portion of the imaging atlas to another. With 3.0-T MR neurography of the brachial plexus as the initial area of interest, the resulting application provides support to radiologists in the image interpretation process by allowing efficient exploration of the MR imaging appearance of relevant nerve segments, muscles, bone structures, vascular landmarks, anatomic spaces, and entrapment sites, and the investigation of neuromuscular relationships.

  7. DFLAT: functional annotation for human development.

    Science.gov (United States)

    Wick, Heather C; Drabkin, Harold; Ngu, Huy; Sackman, Michael; Fournier, Craig; Haggett, Jessica; Blake, Judith A; Bianchi, Diana W; Slonim, Donna K

    2014-02-07

    Recent increases in genomic studies of the developing human fetus and neonate have led to a need for widespread characterization of the functional roles of genes at different developmental stages. The Gene Ontology (GO), a valuable and widely-used resource for characterizing gene function, offers perhaps the most suitable functional annotation system for this purpose. However, due in part to the difficulty of studying molecular genetic effects in humans, even the current collection of comprehensive GO annotations for human genes and gene products often lacks adequate developmental context for scientists wishing to study gene function in the human fetus. The Developmental FunctionaL Annotation at Tufts (DFLAT) project aims to improve the quality of analyses of fetal gene expression and regulation by curating human fetal gene functions using both manual and semi-automated GO procedures. Eligible annotations are then contributed to the GO database and included in GO releases of human data. DFLAT has produced a considerable body of functional annotation that we demonstrate provides valuable information about developmental genomics. A collection of gene sets (genes implicated in the same function or biological process), made by combining existing GO annotations with the 13,344 new DFLAT annotations, is available for use in novel analyses. Gene set analyses of expression in several data sets, including amniotic fluid RNA from fetuses with trisomies 21 and 18, umbilical cord blood, and blood from newborns with bronchopulmonary dysplasia, were conducted both with and without the DFLAT annotation. Functional analysis of expression data using the DFLAT annotation increases the number of implicated gene sets, reflecting the DFLAT's improved representation of current knowledge. Blinded literature review supports the validity of newly significant findings obtained with the DFLAT annotations. Newly implicated significant gene sets also suggest specific hypotheses for future

  8. Mining skeletal phenotype descriptions from scientific literature.

    Directory of Open Access Journals (Sweden)

    Tudor Groza

    Full Text Available Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a wide range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid machine learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best-scoring technique achieves an F1 score of 71.52%, which is close to the state of the art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method.
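
    The ensemble-aggregation step described above can be illustrated with simple per-token majority voting plus an F1 computation; the classifiers, tokens, and the PHEN tag below are invented for illustration:

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate per-token label predictions from several classifiers."""
    return [Counter(tok_preds).most_common(1)[0][0] for tok_preds in zip(*predictions)]

def f1(gold, pred, positive="PHEN"):
    """Token-level F1 for the positive (phenotype) class."""
    tp = sum(g == p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Four hypothetical classifiers labelling five tokens as phenotype (PHEN) or outside (O)
preds = [["PHEN", "O",    "PHEN", "O", "O"],
         ["PHEN", "O",    "O",    "O", "PHEN"],
         ["PHEN", "PHEN", "PHEN", "O", "O"],
         ["O",    "O",    "PHEN", "O", "O"]]
gold  =  ["PHEN", "O",    "PHEN", "O", "PHEN"]
agg = majority_vote(preds)
```

Real aggregation schemes also weight classifiers by confidence, but the voting skeleton is the same.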

  9. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  10. The surplus value of semantic annotations

    NARCIS (Netherlands)

    M. Marx

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries, an

  11. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...

  13. Towards an event annotated corpus of Polish

    Directory of Open Access Journals (Sweden)

    Michał Marcińczuk

    2015-12-01

    Full Text Available The paper presents a typology of events built on the basis of the TimeML specification adapted to Polish. Some changes were introduced to the definitions of the event categories, and a motivation for the event categorization was formulated. The event annotation task is presented on two levels – the ontology level (language-independent) and text mentions (language-dependent). The various types of event mentions in Polish text are discussed. A procedure for the annotation of event mentions in Polish texts is presented and evaluated. In the evaluation, a randomly selected set of documents from the Corpus of Wrocław University of Technology (called KPWr) was annotated by two linguists and the inter-annotator agreement was calculated. The evaluation was done in two iterations. After the first evaluation we revised and improved the annotation procedure. The second evaluation showed a significant improvement in agreement between the annotators. The current work focused on the annotation and categorisation of event mentions in text. Future work will focus on the description of events with a set of attributes, arguments and relations.
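
    Annotator agreement of the kind reported above is commonly quantified with Cohen's kappa; a small sketch with invented event-category labels (not the KPWr data):

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Chance-corrected agreement between two annotators over the same items."""
    assert len(ann1) == len(ann2)
    n = len(ann1)
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    c1, c2 = Counter(ann1), Counter(ann2)
    # Expected agreement if both annotators labelled independently at their marginal rates
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical event-category decisions by two annotators
a = ["ACTION", "STATE", "ACTION", "REPORT", "ACTION", "STATE"]
b = ["ACTION", "STATE", "STATE",  "REPORT", "ACTION", "STATE"]
kappa = cohens_kappa(a, b)
```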

  15. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of seman- tic classes or dot types in English, Dan- ish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods...

  16. Assisted annotation of medical free text using RapTAT.

    Science.gov (United States)

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training. The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.
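
    The interactive-training loop can be caricatured with a toy "model" that is just a growing phrase dictionary; RapTAT itself uses machine learning, so this is only a sketch of the workflow, with invented notes and gold phrases:

```python
def pre_annotate(model, note):
    """Propose spans the current model already knows about (stand-in for the ML step)."""
    return [phrase for phrase in model if phrase in note]

def assisted_annotation(batches, gold):
    """Iterate: pre-annotate a batch, let the reviewer correct it, retrain on corrections."""
    model, hits_per_batch = set(), []
    for batch in batches:
        hits = sum(len(pre_annotate(model, note)) for note in batch)
        hits_per_batch.append(hits)
        for note in batch:  # the reviewer 'corrects' the batch; corrections feed the model
            model.update(g for g in gold if g in note)
    return hits_per_batch

# Invented clinical-note snippets and concept phrases
batches = [["ace inhibitor prescribed", "ejection fraction 35%"],
           ["ace inhibitor continued", "ejection fraction stable"],
           ["beta blocker and ace inhibitor"]]
gold = {"ace inhibitor", "ejection fraction", "beta blocker"}
hits = assisted_annotation(batches, gold)
```

The point of the sketch is the shape of the loop: later batches arrive increasingly pre-annotated, so the reviewer adds fewer spans by hand.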

  17. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  18. Manual Annotation of Translational Equivalence The Blinker Project

    CERN Document Server

    Melamed, I D

    1998-01-01

    Bilingual annotators were paid to link roughly sixteen thousand corresponding words between on-line versions of the Bible in modern French and modern English. These annotations are freely available to the research community from http://www.cis.upenn.edu/~melamed . The annotations can be used for several purposes. First, they can be used as a standard data set for developing and testing translation lexicons and statistical translation models. Second, researchers in lexical semantics will be able to mine the annotations for insights about cross-linguistic lexicalization patterns. Third, the annotations can be used in research into certain recently proposed methods for monolingual word-sense disambiguation. This paper describes the annotated texts, the specially-designed annotation tool, and the strategies employed to increase the consistency of the annotations. The annotation process was repeated five times by different annotators. Inter-annotator agreement rates indicate that the annotations are reasonably rel...

  19. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data.

    Science.gov (United States)

    Köhler, Sebastian; Doelken, Sandra C; Mungall, Christopher J; Bauer, Sebastian; Firth, Helen V; Bailleul-Forestier, Isabelle; Black, Graeme C M; Brown, Danielle L; Brudno, Michael; Campbell, Jennifer; FitzPatrick, David R; Eppig, Janan T; Jackson, Andrew P; Freson, Kathleen; Girdea, Marta; Helbig, Ingo; Hurst, Jane A; Jähn, Johanna; Jackson, Laird G; Kelly, Anne M; Ledbetter, David H; Mansour, Sahar; Martin, Christa L; Moss, Celia; Mumford, Andrew; Ouwehand, Willem H; Park, Soo-Mi; Riggs, Erin Rooney; Scott, Richard H; Sisodiya, Sanjay; Van Vooren, Steven; Wapner, Ronald J; Wilkie, Andrew O M; Wright, Caroline F; Vulto-van Silfhout, Anneke T; de Leeuw, Nicole; de Vries, Bert B A; Washington, Nicole L; Smith, Cynthia L; Westerfield, Monte; Schofield, Paul; Ruef, Barbara J; Gkoutos, Georgios V; Haendel, Melissa; Smedley, Damian; Lewis, Suzanna E; Robinson, Peter N

    2014-01-01

    The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.
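
    Reasoning over HPO's subclass relations typically uses the ancestor closure of a term, so that annotating a disease to a class implies all of its superclasses. A sketch over an invented is-a fragment (term names are illustrative, not actual HPO records):

```python
def ancestors(term, parents):
    """Transitive closure over subclass (is-a) links, the term itself included."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen

# Tiny invented is-a fragment in the spirit of the HPO
parents = {
    "Ventricular septal defect": ["Abnormal cardiac septum morphology"],
    "Abnormal cardiac septum morphology": ["Abnormal heart morphology"],
    "Abnormal heart morphology": ["Phenotypic abnormality"],
}
annotated = ancestors("Ventricular septal defect", parents)
```

Semantic-similarity tools built on the HPO compare diseases by the overlap (or information content) of exactly such closures.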

  20. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to the limited functional annotation associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the largest arrays serving as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to the Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a considerable amount of time searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their datasets. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray datasets into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and the development of therapies.
The GO annotation data generated will be available for public use via AgBase website and

  1. Concept annotation in the CRAFT corpus

    Directory of Open Access Journals (Sweden)

    Bada Michael

    2012-07-01

    Full Text Available Abstract Background Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems.
The corpus, annotation guidelines, and other associated resources are

  2. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide new opportunities through managed, online group annotation that leverages the ubiquity of internet access while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open-platform markup environment for the annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotations supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts to identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. 
Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  3. A Common XML-based Framework for Syntactic Annotations

    CERN Document Server

    Ide, Nancy; Erjavec, Tomaz

    2009-01-01

    It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have developed a framework comprised of an abstract model for a variety of different annotation types (e.g., morpho-syntactic tagging, syntactic annotation, co-reference annotation, etc.), which can be instantiated in different ways depending on the annotator's approach and goals. In this paper we provide an overview of the framework, demonstrate its applicability to syntactic annotation, and show how it can contribute to comparative evaluation of parser output and diverse syntactic annotation schemes.

  4. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
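
    The core selection step — navigating from an annotated resource to the archived version appropriate for the annotation's creation date — can be sketched as choosing the latest snapshot at or before that date (URIs and dates below are invented; the real Memento framework performs this negotiation over HTTP via TimeGates):

```python
from datetime import datetime

def best_memento(mementos, annotated_at):
    """Pick the archived snapshot closest in time at or before the annotation date;
    fall back to the earliest snapshot if none predates the annotation."""
    earlier = [(ts, uri) for ts, uri in mementos if ts <= annotated_at]
    if earlier:
        return max(earlier)[1]
    return min(mementos)[1]

# Hypothetical TimeMap for one web resource
mementos = [
    (datetime(2009, 3, 1), "http://archive.example/2009/page"),
    (datetime(2010, 6, 15), "http://archive.example/2010/page"),
    (datetime(2012, 1, 9), "http://archive.example/2012/page"),
]
uri = best_memento(mementos, datetime(2010, 12, 31))
```

Reconstructing a whole annotation then amounts to applying this lookup to every resource the annotation involves.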

  5. COGNATE: comparative gene annotation characterizer.

    Science.gov (United States)

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequence is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequences and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level that uses a minimum of parameters and an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy-to-use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow downstream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage (https://www.zfmk.de/en/COGNATE) and on github ( https

  6. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation, the user simply captures an image using the HMD's camera, looks at an object of interest in the image, and speaks out the information to be associated with the object. The gaze location is recorded and visualized with a marker. The voice is transcribed using speech recognition. Gaze annotations can...

  7. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

    We present two new NER datasets for Twitter: a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER-annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets; (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible...

  8. Annotation Style Guide for the Blinker Project

    CERN Document Server

    Melamed, I D

    1998-01-01

    This annotation style guide was created by and for the Blinker project at the University of Pennsylvania. The Blinker project was so named after the "bilingual linker" GUI, which was created to enable bilingual annotators to "link" word tokens that are mutual translations in parallel texts. The parallel text chosen for this project was the Bible, because it is probably the easiest text to obtain in electronic form in multiple languages. The languages involved were English and French, because, of the languages with which the project co-ordinator was familiar, these were the two for which a sufficient number of annotators was likely to be found.

  9. DIMA – Annotation guidelines for German intonation

    DEFF Research Database (Denmark)

    Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis

    2015-01-01

    This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups easier, since German intonation is currently annotated according to different models. To this end, we aim to provide guidelines that are easy to learn. The guidelines were evaluated by running an inter-annotator reliability study on three different speech styles (read speech, monologue and dialogue)...

  10. GPA: a statistical approach to prioritizing GWAS results by integrating pleiotropy and annotation.

    Science.gov (United States)

    Chung, Dongjun; Yang, Can; Li, Cong; Gelernter, Joel; Zhao, Hongyu

    2014-11-01

    Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Identification of these risk variants remains a very challenging problem. There is a need to develop more powerful statistical methods to leverage available information to improve upon traditional approaches that focus on a single GWAS dataset without incorporating additional data. In this paper, we propose a novel statistical approach, GPA (Genetic analysis incorporating Pleiotropy and Annotation), to increase statistical power to identify risk variants through joint analysis of multiple GWAS datasets and annotation information because: (1) accumulating evidence suggests that different complex diseases share common risk bases, i.e., pleiotropy; and (2) functionally annotated variants have been consistently demonstrated to be enriched among GWAS hits. GPA can integrate multiple GWAS datasets and functional annotations to seek association signals, and it can also perform hypothesis testing for the presence of pleiotropy and the enrichment of functional annotation. Statistical inference of the model parameters and SNP ranking is achieved through an EM algorithm that can handle genome-wide markers efficiently. When we applied GPA to jointly analyze five psychiatric disorders with annotation information, not only did GPA identify many weak signals missed by traditional single-phenotype analysis, but it also revealed relationships in the genetic architecture of these disorders. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. We also applied GPA to a bladder cancer GWAS data set with the ENCODE DNase-seq data from 125 cell lines.
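
    The two-group idea behind this kind of model can be sketched, for a single study and no annotation, as an EM fit of a Uniform(0,1)/Beta(a,1) mixture to p-values. This toy version omits pleiotropy and annotation terms entirely and is not the published GPA algorithm:

```python
import math
import random

def em_uniform_beta(pvals, iters=200):
    """EM for a two-group model of p-values: null ~ Uniform(0,1), signal ~ Beta(a, 1)
    with density a * p**(a-1). Returns (signal proportion, shape estimate)."""
    pi1, a = 0.1, 0.5  # arbitrary starting values
    for _ in range(iters):
        # E-step: posterior probability that each p-value comes from the signal group
        w = [pi1 * a * p ** (a - 1) / ((1 - pi1) + pi1 * a * p ** (a - 1)) for p in pvals]
        # M-step: closed-form updates (weighted MLE of Beta(a, 1) is -sum(w)/sum(w*log p))
        pi1 = sum(w) / len(w)
        a = -sum(w) / sum(wi * math.log(p) for wi, p in zip(w, pvals))
    return pi1, a

random.seed(7)
# Simulate 2000 null p-values and 500 enriched ones: Beta(0.3, 1) via inverse CDF u**(1/a)
pvals = [random.random() for _ in range(2000)] + \
        [random.random() ** (1 / 0.3) for _ in range(500)]
pi1_hat, a_hat = em_uniform_beta(pvals)  # true values are 0.2 and 0.3
```

The fitted posteriors `w` would then rank SNPs; the real model additionally shares `w`-like quantities across studies and annotation tracks.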
GPA was able to detect cell lines that are

  11. Meteor showers an annotated catalog

    CERN Document Server

    Kronk, Gary W

    2014-01-01

    Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...

  12. Sequencing and annotated analysis of an Estonian human genome.

    Science.gov (United States)

    Lilleoja, Rutt; Sarapik, Aili; Reimann, Ene; Reemann, Paula; Jaakma, Ülle; Vasar, Eero; Kõks, Sulev

    2012-02-01

    In the present study we describe the sequencing and annotated analysis of the individual genome of an Estonian. Using SOLiD technology we generated 2,449,441,916 50-bp reads. Bioscope version 1.3 was used for mapping and pairing of reads against the NCBI human genome reference (build 36, hg18). Bioscope also enables annotation of the results of variant (tertiary) analysis. The average mapping rate of reads was 75.5%, with a total coverage of 107.72 Gb, resulting in a mean fold coverage of 34.6. We found 3,482,975 SNPs, of which 352,492 were novel. 21,222 SNPs were in coding regions: 10,649 were synonymous SNPs, 10,360 were nonsynonymous missense SNPs, 155 were nonsynonymous nonsense SNPs and 58 were nonsynonymous frameshifts. We identified 219 CNVs with a total base-pair coverage of 37,326,300 bp and 87,451 large insertion/deletion polymorphisms covering 10,152,256 bp of the genome. In addition, we found 285,864 small insertion/deletion polymorphisms, of which 133,969 were novel. Finally, we identified 53 inversions, 19 overlapped genes and 2 overlapped exons. Interestingly, we found a region on chromosome 6 to be enriched with coding SNPs and CNVs. This study confirms previous findings that our genomes are more complex and variable than previously thought. Therefore, sequencing of personal genomes followed by annotation would improve the analysis of the heritability of phenotypes and our understanding of the functions of the genome.

  13. Annotation of Scientific Summaries for Information Retrieval

    CERN Document Server

    Ibekwe-Sanjuan, Fidelia; Eric, Sanjuan; Eric, Charton

    2011-01-01

    We present a methodology combining surface NLP and machine learning techniques for ranking abstracts and generating summaries based on annotated corpora. The corpora were annotated with meta-semantic tags indicating the category of information a sentence bears (objective, findings, newthing, hypothesis, conclusion, future work, related work). The annotated corpus is fed into an automatic summarizer for query-oriented abstract ranking and multi-abstract summarization. To adapt the summarizer to these two tasks, two novel weighting functions were devised to take into account the distribution of the tags in the corpus. Results, although still preliminary, encourage us to pursue this line of work and to find better ways of building IR systems that can take semantic annotations in a corpus into account.

  14. Annotation and retrieval in protein interaction databases

    Science.gov (United States)

    Cannataro, Mario; Hiram Guzzi, Pietro; Veltri, Pierangelo

    2014-06-01

    Biological databases have been developed with a special focus on the efficient retrieval of single records or the efficient computation of specialized bioinformatics algorithms against the overall database, such as in sequence alignment. The continuous production of biological knowledge, spread across several biological databases and ontologies such as the Gene Ontology, and the availability of efficient techniques to handle such knowledge, such as annotation and semantic similarity measures, enable the development of novel bioinformatics applications that explicitly use and integrate such knowledge. After introducing the annotation process and the main semantic similarity measures, this paper shows how annotations and semantic similarity can be exploited to improve the extraction and analysis of biologically relevant data from protein interaction databases. As case studies, the paper presents two novel software tools, OntoPIN and CytoSeVis, both based on the use of Gene Ontology annotations, for the advanced querying of protein interaction databases and for the enhanced visualization of protein interaction networks.
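One common way to exploit annotations, as described above, is to score the semantic similarity of two proteins by comparing their sets of GO terms. A minimal sketch using Jaccard set similarity, one of the simplest such measures (the GO identifiers below are illustrative placeholders, not taken from the paper):

```python
def jaccard_similarity(terms_a, terms_b):
    """Similarity of two annotation sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(terms_a), set(terms_b)
    if not a | b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical GO annotations for two proteins.
p1 = {"GO:0003677", "GO:0006355", "GO:0005634"}
p2 = {"GO:0003677", "GO:0005634", "GO:0008270"}
print(jaccard_similarity(p1, p2))  # 2 shared / 4 total → 0.5
```

Real tools typically use ontology-aware measures (e.g. information-content based ones) that also credit shared ancestor terms, but the set-overlap idea is the common starting point.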

  15. SASL: A Semantic Annotation System for Literature

    Science.gov (United States)

    Yuan, Pingpeng; Wang, Guoyin; Zhang, Qin; Jin, Hai

    Due to ambiguity, search engines for scientific literature may not return the right search results. One efficient solution to this problem is to automatically annotate the literature and attach semantic information to it. Generally, semantic annotation requires identifying entities before attaching semantic information to them. However, due to abbreviations and other reasons, it is very difficult to identify entities correctly. This paper presents a Semantic Annotation System for Literature (SASL), which utilizes Wikipedia as a knowledge base to annotate literature. SASL mainly attaches semantics to terminology, academic institutions, conferences, journals, etc. Many of these are usually abbreviations, which induces ambiguity. SASL uses regular expressions to extract the mapping between the full names of entities and their abbreviations. Since the full names of several entities may map to a single abbreviation, SASL introduces a Hidden Markov Model to implement name disambiguation. Finally, the paper presents experimental results, which confirm the good performance of SASL.
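The regular-expression step described above pairs a full name with its parenthesised abbreviation. A naive sketch of the idea (this is a simplified heuristic for illustration, not SASL's actual patterns): for each ALL-CAPS token in parentheses, take the preceding words whose initials spell the abbreviation.

```python
import re

def extract_abbreviations(text):
    """Naive full-name/abbreviation pairing: for each parenthesised
    ALL-CAPS token, check whether the initials of the immediately
    preceding words spell the abbreviation."""
    pairs = {}
    for m in re.finditer(r"\(([A-Z]{2,6})\)", text):
        abbr = m.group(1)
        words = text[:m.start()].split()
        cand = words[-len(abbr):]
        if len(cand) == len(abbr) and all(
            w[0].upper() == c for w, c in zip(cand, abbr)
        ):
            pairs[abbr] = " ".join(cand)
    return pairs

print(extract_abbreviations(
    "We use a Hidden Markov Model (HMM) for name disambiguation."
))  # → {'HMM': 'Hidden Markov Model'}
```

Because several full names can share one abbreviation, a mapping like this only produces candidates; resolving which expansion is meant in context is the disambiguation step the abstract assigns to the HMM.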

  16. Modeling Social Annotation: a Bayesian Approach

    CERN Document Server

    Plangprasopchok, Anon

    2008-01-01

    Collaborative tagging systems, such as del.icio.us, CiteULike, and others, allow users to annotate objects, e.g., Web pages or scientific papers, with descriptive labels called tags. The social annotations, contributed by thousands of users, can potentially be used to infer categorical knowledge, classify documents or recommend new relevant information. Traditional text inference methods do not make the best use of socially-generated data, since they do not take into account variations in individual users' perspectives and vocabulary. In previous work, we introduced a simple probabilistic model that takes the interests of individual annotators into account in order to find hidden topics of annotated objects. Unfortunately, our proposed approach had a number of shortcomings, including overfitting, local maxima and the requirement to specify values for some parameters. In this paper we address these shortcomings in two ways. First, we extend the model to a fully Bayesian framework. Second, we describe an infinite ver...

  17. Semantic Annotation to Support Automatic Taxonomy Classification

    DEFF Research Database (Denmark)

    Kim, Sanghee; Ahmed, Saeema; Wallace, Ken

    2006-01-01

    , the annotations identify which parts of a text are more important for understanding its contents. The extraction of salient sentences is a major issue in text summarisation. Commonly used methods are based on statistical analysis, but for subject-matter type texts, linguistically motivated natural language processing techniques, like semantic annotations, are preferred. An experiment to test the method using 140 documents collected from industry demonstrated that classification accuracy can be improved by up to 16%.

  18. Genepi: a blackboard framework for genome annotation.

    Science.gov (United States)

    Descorps-Declère, Stéphane; Ziébelin, Danielle; Rechenmann, François; Viari, Alain

    2006-10-12

    Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although a lot of annotation platforms already exist, there is still a strong need for computer systems which take charge of not only the primary annotation, but also the update and advance of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system. We have implemented a blackboard framework (called Genepi) for developing automatic annotation systems. The system is not bound to any specific annotation strategy. Instead, the user will specify a blackboard structure in a configuration file and the system will instantiate and run this particular annotation strategy. The characteristics of this framework are presented and discussed. Specific adaptations to the classical blackboard architecture have been required, such as the description of the activation patterns of the knowledge sources by using an extended set of Allen's temporal relations. Although the system is robust enough to be used on real-size applications, it is of primary use to bioinformatics researchers who want to experiment with blackboard architectures. In the context of genome annotation, blackboards have several interesting features related to the way methodological and biological knowledge can be updated. They can readily handle the cooperative (several methods are involved) and opportunistic (the flow of execution depends on the state of our knowledge) aspects of the annotation process.
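The blackboard control loop described above can be sketched in a few lines: knowledge sources declare an activation condition against the shared blackboard state and post new facts when triggered, so the execution order emerges opportunistically from the data rather than from a fixed pipeline. The gene-finder and annotator sources below are hypothetical stand-ins, not Genepi's actual knowledge sources:

```python
class KnowledgeSource:
    def __init__(self, name, condition, action):
        self.name = name
        self.condition = condition   # blackboard -> bool
        self.action = action         # blackboard -> dict of new facts

def run_blackboard(blackboard, sources):
    """Fire any source whose activation condition holds, until quiescence."""
    changed = True
    while changed:
        changed = False
        for ks in sources:
            if ks.condition(blackboard):
                facts = ks.action(blackboard)
                new = {k: v for k, v in facts.items() if k not in blackboard}
                if new:
                    blackboard.update(new)
                    changed = True
    return blackboard

# Hypothetical sources: a gene finder runs on raw sequence; an annotator
# activates only once gene locations exist (opportunistic control flow).
sources = [
    KnowledgeSource("gene_finder",
                    lambda bb: "sequence" in bb,
                    lambda bb: {"gene_locations": [(120, 480)]}),
    KnowledgeSource("annotator",
                    lambda bb: "gene_locations" in bb,
                    lambda bb: {"annotations": ["hypothetical protein"]}),
]
result = run_blackboard({"sequence": "ATGGCC..."}, sources)
print(sorted(result))  # sequence, gene_locations and annotations all present
```

Genepi's real control mechanism is richer (activation patterns expressed with an extended set of Allen's temporal relations), but the fire-when-ready loop is the essence of the architecture.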

  19. Genepi: a blackboard framework for genome annotation

    Directory of Open Access Journals (Sweden)

    Ziébelin Danielle

    2006-10-01

    Full Text Available Abstract Background Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although a lot of annotation platforms already exist, there is still a strong need for computer systems which take charge of not only the primary annotation, but also the update and advance of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system. Results We have implemented a blackboard framework (called Genepi) for developing automatic annotation systems. The system is not bound to any specific annotation strategy. Instead, the user will specify a blackboard structure in a configuration file and the system will instantiate and run this particular annotation strategy. The characteristics of this framework are presented and discussed. Specific adaptations to the classical blackboard architecture have been required, such as the description of the activation patterns of the knowledge sources by using an extended set of Allen's temporal relations. Although the system is robust enough to be used on real-size applications, it is of primary use to bioinformatics researchers who want to experiment with blackboard architectures. Conclusion In the context of genome annotation, blackboards have several interesting features related to the way methodological and biological knowledge can be updated. They can readily handle the cooperative (several methods are involved) and opportunistic (the flow of execution depends on the state of our knowledge) aspects of the annotation process.

  20. Evaluating Hierarchical Structure in Music Annotations

    OpenAIRE

    Brian McFee; Oriol Nieto; Morwaread M. Farbood; Juan Pablo Bello

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded...

  1. Fluid Annotations in an Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required.

  2. Research on Key Technologies of Ontology-Based Semantic Retrieval for Educational Resources

    Institute of Scientific and Technical Information of China (English)

    刘琪; 王小正; 王磊

    2014-01-01

    This paper investigates several key technologies involved in ontology-based semantic retrieval of educational resources, including the construction of an educational-resource ontology and the storage of ontology data. On this basis, an ontology-based adaptive Web information extraction model and a storage model for ontology data and instance data are designed.

  3. Research on the Logical Basis of Ontology-Based Knowledge Organization

    Institute of Scientific and Technical Information of China (English)

    刘海涛; 张秀兰

    2012-01-01

    Based on an introduction to the basic principles of ontology and logic, this article discusses the application of logical principles to three aspects of ontology-based knowledge organization: ontology construction methodology, ontology error-checking reasoning, and ontology integration, examined respectively in terms of concept logic, thinking logic, predicate logic, and inductive reasoning logic.

  4. Collaborative annotation of 3D crystallographic models.

    Science.gov (United States)

    Hunter, J; Henderson, M; Khan, I

    2007-01-01

    This paper describes the AnnoCryst system, a tool that was designed to enable authenticated collaborators to share online discussions about 3D crystallographic structures through the asynchronous attachment, storage, and retrieval of annotations. Annotations are personal comments, interpretations, questions, assessments, or references that can be attached to files, data, digital objects, or Web pages. The AnnoCryst system enables annotations to be attached to 3D crystallographic models retrieved from either private local repositories (e.g., Fedora) or public online databases (e.g., Protein Data Bank or Inorganic Crystal Structure Database) via a Web browser. The system uses the Jmol plugin for viewing and manipulating the 3D crystal structures but extends Jmol by providing an additional interface through which annotations can be created, attached, stored, searched, browsed, and retrieved. The annotations are stored on a standardized Web annotation server (Annotea), which has been extended to support 3D macromolecular structures. Finally, the system is embedded within a security framework that is capable of authenticating users and restricting access only to trusted colleagues.

  5. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with abundant repeats, genome duplications and tandem duplications. Genes encode a wealth of information useful in studying organisms, and it is critical to have high-quality, stable gene annotation. Thanks to advances in sequencing technology, many plant species' genomes have been sequenced, and their transcriptomes have been sequenced as well. To use these vast amounts of sequence data for timely gene annotation or re-annotation, an automated pipeline is needed. The JGI plant genomics gene annotation pipeline, called Integrated Gene Call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for details. Here we present the genome annotation of JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for Chlamydomonas, which was annotated by a third party. The genome annotations of these and other species are used in our gene family build pipeline and are accessible via the JGI Phytozome portal.

  6. Expanding the mammalian phenotype ontology to support automated exchange of high throughput mouse phenotyping data generated by large-scale mouse knockout screens.

    Science.gov (United States)

    Smith, Cynthia L; Eppig, Janan T

    2015-01-01

    A vast array of data is about to emerge from the large-scale, high-throughput mouse knockout phenotyping projects worldwide. It is critical that this information is captured in a standardized manner, made accessible, and fully integrated with other phenotype data sets for comprehensive querying and analysis across all phenotype data types. The volume of data generated by the high-throughput phenotyping screens is expected to grow exponentially; thus, automated methods and standards to exchange phenotype data are required. The IMPC (International Mouse Phenotyping Consortium) is using the Mammalian Phenotype (MP) ontology in the automated annotation of phenodeviant data from high-throughput phenotyping screens. 287 new terms, along with hierarchy revisions, were added in multiple branches of the MP ontology to accurately describe the results generated by these high-throughput screens. Because these large-scale phenotyping data sets will be reported using the MP as the common data standard for annotation and data exchange, automated importation of these data to MGI (Mouse Genome Informatics) and other resources is possible without curatorial effort. Maximum biomedical value of these mutant mice will come from integrating primary high-throughput phenotyping data with secondary, comprehensive phenotypic analyses combined with published phenotype details on these and related mutants at MGI and other resources.

  7. An annotation based approach to support design communication

    CERN Document Server

    Hisarciklilar, Onur

    2007-01-01

    The aim of this paper is to propose an approach based on the concept of annotation for supporting design communication. In this paper, we describe a co-operative design case study where we analyse some annotation practices, mainly focused on design minutes recorded during project reviews. We point out specific requirements concerning annotation needs. Based on these requirements, we propose an annotation model, inspired from the Speech Act Theory (SAT) to support communication in a 3D digital environment. We define two types of annotations in the engineering design context, locutionary and illocutionary annotations. The annotations we describe in this paper are materialised by a set of digital artefacts, which have a semantic dimension allowing express/record elements of technical justifications, traces of contradictory debates, etc. In this paper, we first clarify the semantic annotation concept, and we define general properties of annotations in the engineering design context, and the role of annotations in...

  8. Mouse, man, and meaning: bridging the semantics of mouse phenotype and human disease.

    Science.gov (United States)

    Hancock, John M; Mallon, Ann-Marie; Beck, Tim; Gkoutos, Georgios V; Mungall, Chris; Schofield, Paul N

    2009-08-01

    Now that the laboratory mouse genome is sequenced and the annotation of its gene content is improving, the next major challenge is the annotation of the phenotypic associations of mouse genes. This requires the development of systematic phenotyping pipelines that use standardized phenotyping procedures which allow comparison across laboratories. It also requires the development of a sophisticated informatics infrastructure for the description and interchange of phenotype data. Here we focus on the current state of the art in the description of data produced by systematic phenotyping approaches using ontologies, in particular, the EQ (Entity-Quality) approach, and what developments are required to facilitate the linking of phenotypic descriptions of mutant mice to human diseases.
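The EQ (Entity-Quality) descriptions discussed above decompose a phenotype into an affected entity and a quality, each drawn from an ontology, which is what makes phenotypes from different species directly comparable. A minimal sketch of the idea; the term identifiers below are placeholder IDs chosen for illustration and may not correspond to real curated annotations:

```python
from collections import namedtuple

# An EQ annotation: an affected entity term plus a quality term.
EQ = namedtuple("EQ", ["entity", "quality"])

# Illustrative annotations for a mouse mutant and a human disease
# (placeholder term IDs, not taken from any curated record).
mouse_phenotype = {EQ("MA:0000168", "PATO:0000587"),
                   EQ("MA:0000072", "PATO:0000001")}
human_phenotype = {EQ("FMA:7088", "PATO:0000587")}

# A homology map from species-specific anatomy terms to a common
# (species-neutral) term makes the annotations comparable.
homology = {"MA:0000168": "UBERON:0000948", "FMA:7088": "UBERON:0000948"}

def normalise(annotations):
    return {EQ(homology.get(a.entity, a.entity), a.quality)
            for a in annotations}

shared = normalise(mouse_phenotype) & normalise(human_phenotype)
print(len(shared))  # one shared entity/quality pair after normalisation
```

Similarity metrics over such annotation sets (set overlap, information-content measures, etc.) are then what let a disease phenotype profile retrieve candidate mouse models.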

  9. Analysis of mammalian gene function through broad based phenotypic screens across a consortium of mouse clinics

    Science.gov (United States)

    Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl MJ; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, 
Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie

    2015-01-01

    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse ES cell knockout resource provides a basis for characterisation of relationships between gene and phenotype. The EUMODIC consortium developed and validated robust methodologies for broad-based phenotyping of knockouts through a pipeline comprising 20 disease-orientated platforms. We developed novel statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no prior functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. Novel phenotypes were uncovered for many genes with unknown function, providing a powerful basis for hypothesis generation and further investigation in diverse systems. PMID:26214591

  10. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns constructed from the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate

  11. GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations.

    Science.gov (United States)

    Sangrador-Vegas, Amaia; Mitchell, Alex L; Chang, Hsin-Yu; Yong, Siew-Yit; Finn, Robert D

    2016-01-01

    The removal of annotation from biological databases is often perceived as an indicator of erroneous annotation. As a corollary, annotation stability is considered to be a measure of reliability. However, diverse data-driven events can affect the stability of annotations in both primary protein sequence databases and the protein family databases that are built upon the sequence databases and used to help annotate them. Here, we describe some of these events and their consequences for the InterPro database, and demonstrate that annotation removal or reassignment is not always linked to incorrect annotation by the curator. Database URL: http://www.ebi.ac.uk/interpro.

  12. SNPPhenA: a corpus for extracting ranked associations of single-nucleotide polymorphisms and phenotypes from literature.

    Science.gov (United States)

    Bokharaeian, Behrouz; Diaz, Alberto; Taghizadeh, Nasrin; Chitsaz, Hamidreza; Chavoshinejad, Ramyar

    2017-04-07

    Single Nucleotide Polymorphisms (SNPs) are among the most important types of genetic variation influencing common diseases and phenotypes. Recently, some corpora and methods have been developed for extracting mutations and diseases from texts. However, no corpus for extracting associations from texts is available that is annotated with linguistically based negation, modality markers, neutral candidates, and the confidence level of associations. In this research, we present the steps taken to produce the SNPPhenA corpus. They include automatic Named Entity Recognition (NER) followed by manual annotation of SNP and phenotype names, annotation of SNP-phenotype associations and their level of confidence, as well as modality markers. Moreover, the corpus was annotated with negation scopes and cues as well as neutral candidates, which play a crucial role in negation and modality phenomena for extraction tasks. Agreement between annotators was measured with Cohen's kappa coefficient, and the resulting scores indicate the reliability of the corpus. The kappa score was 0.79 for annotating the associations and 0.80 for the confidence degree of associations. We also present basic statistics of the annotated features of the corpus, together with the results of our first experiments on the extraction of ranked SNP-phenotype associations. The prepared guideline documents make the corpus more convenient to use. The corpus, guidelines and inter-annotator agreement analysis are available on the website of the corpus: http://nil.fdi.ucm.es/?q=node/639 . Specifying the confidence degree of SNP-phenotype associations from articles helps identify the strength of associations, which could in turn assist genomics scientists in determining phenotypic plasticity and the importance of environmental factors. What is more, our first experiments with the corpus show that linguistic-based confidence
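Inter-annotator agreement of the kind reported above (kappa of 0.79 and 0.80) corrects raw agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two annotators; the label sequences are invented toy data, not the SNPPhenA annotations:

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance)."""
    assert len(ann1) == len(ann2)
    n = len(ann1)
    p_obs = sum(a == b for a, b in zip(ann1, ann2)) / n
    c1, c2 = Counter(ann1), Counter(ann2)
    # Chance agreement from each annotator's marginal label frequencies.
    p_chance = sum(c1[label] * c2[label] for label in c1) / n**2
    return (p_obs - p_chance) / (1 - p_chance)

# Invented toy labels from two annotators over eight associations.
a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
b = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]
print(round(cohens_kappa(a, b), 2))  # → 0.5
```

Here raw agreement is 0.75, but both annotators use each label half the time, so chance agreement is 0.5 and kappa drops to 0.5; this is why kappa, not raw agreement, is the standard reliability report for annotated corpora.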

  13. Improving pan-genome annotation using whole genome multiple alignment

    Directory of Open Access Journals (Sweden)

    Salzberg Steven L

    2011-06-01

    Full Text Available Abstract Background Rapid annotation and comparison of genomes from multiple isolates (pan-genomes) is becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve annotation quality across sets of closely related genomes. Results We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that improve consistency and are candidates for further review. Conclusions Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency.

  14. BIOFILTER AS A FUNCTIONAL ANNOTATION PIPELINE FOR COMMON AND RARE COPY NUMBER BURDEN.

    Science.gov (United States)

    Kim, Dokyoon; Lucas, Anastasia; Glessner, Joseph; Verma, Shefali S; Bradford, Yuki; Li, Ruowang; Frase, Alex T; Hakonarson, Hakon; Peissig, Peggy; Brilliant, Murray; Ritchie, Marylyn D

    2016-01-01

    Recent studies on copy number variation (CNV) have suggested that an increasing burden of CNVs is associated with susceptibility or resistance to disease. A large number of genes or genomic loci contribute to complex diseases such as autism. Thus, total genomic copy number burden, as an accumulation of copy number change, is a meaningful measure of genomic instability for identifying associations between global genetic effects and phenotypes of interest. However, no systematic annotation pipeline has been developed to interpret the biological meaning of the accumulation of copy number change across the genome associated with a phenotype of interest. In this study, we develop a comprehensive and systematic pipeline for annotating copy number variants into genes/genomic regions and subsequently pathways and other gene groups using Biofilter - a bioinformatics tool that aggregates over a dozen publicly available databases of prior biological knowledge. Next we conduct enrichment tests of biologically defined groupings of CNVs, including genes, pathways, Gene Ontology, or protein families. We applied the proposed pipeline to a CNV dataset from the Marshfield Clinic Personalized Medicine Research Project (PMRP) for a quantitative trait phenotype derived from the electronic health record - total cholesterol. We identified several significant pathways such as the toll-like receptor signaling pathway and the hepatitis C pathway, gene ontologies (GOs) of nucleoside triphosphatase activity (NTPase) and response to virus, and protein families such as cell morphogenesis that are associated with the total cholesterol phenotype based on CNV profiles (permutation p-value). Biofilter can be used for CNV data from any genotyping or sequencing platform and to explore CNV enrichment for any traits or phenotypes. Biofilter continues to be a powerful bioinformatics tool for annotating, filtering, and constructing biologically informed models for association analysis - now including copy number
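The permutation p-values mentioned above come from a label-shuffling test: shuffle the phenotype labels many times and ask how often the shuffled test statistic matches or exceeds the observed one. A hedged sketch of the idea; the burden scores and labels below are invented toy data, not the Marshfield PMRP values:

```python
import random

def permutation_p_value(scores, labels, n_perm=10_000, seed=0):
    """P-value for the difference in mean CNV burden between two groups,
    estimated by shuffling the phenotype labels."""
    def stat(lab):
        g1 = [s for s, l in zip(scores, lab) if l]
        g0 = [s for s, l in zip(scores, lab) if not l]
        return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

    observed = stat(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one smoothing

# Toy data: CNV burden per sample and a binary phenotype label.
burden = [5, 7, 6, 8, 1, 2, 1, 3]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
p = permutation_p_value(burden, phenotype)
print(p < 0.05)  # → True: the observed split is rarely matched by chance
```

The same scheme generalizes to any enrichment statistic (pathway counts, GO-term overlaps, etc.): only the `stat` function changes, while the shuffling supplies the null distribution.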

  15. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and to output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. To visualize the resulting output file, a modified version of FigTree is used. Certain popular tools that lack good built-in visualization support, for example MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may produce results with human errors, because colors must be added to each node manually, or have other limitations, for example coloring only by a numeric attribute, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy to use, while still allowing the user full control over the coloring and annotating process.

  16. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    Full Text Available This work elaborates the semi-semantic part-of-speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label parts of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part-of-speech labels. The corpus contains text on local and international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise contributed a part-of-speech annotation layer to the URDU.KON-TB treebank. Twenty-two main part-of-speech categories are divided into subcategories, which capture the morphological and semantic information encoded in them. This article mainly reports the annotation guidelines; it also briefly describes the development of the URDU.KON-TB treebank, including the raw corpus collection, the design and application of the annotation scheme, and its statistical evaluation and results. The guidelines presented here will be useful for the linguistics community in annotating sentences not only for the national language Urdu but also for other indigenous languages such as Punjabi, Sindhi and Pashto.

  17. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Science.gov (United States)

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and to output a Nexus file. By automatically coloring the tree by sequence name, the MixtureTree Annotator provides a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. To visualize the resulting output file, a modified version of FigTree is used. Certain popular tools that lack good built-in visualization support, for example MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, are prone to human error because colors must be added manually to each node, or are limited to coloring by a numeric attribute such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy to use, while still giving the user full control over the coloring and annotating process.

  18. The Ambulant Annotator: Medical Multimedia Annotations on TabletPC’s

    NARCIS (Netherlands)

    D.C.A. Bulterman (Dick)

    2003-01-01

    A new generation of tablet computers has stimulated end-user interest in annotating documents by adding pen-based commentary and spoken audio labels to otherwise static documents. The typical application scenario for most annotation systems is to convert existing content to a (virtual)

  19. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome

    Directory of Open Access Journals (Sweden)

    Upton Chris

    2006-06-01

    Full Text Available Abstract Background Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center http://www.biovirus.org and Viral Bioinformatics – Canada http://www.virology.ca, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task. Results GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly Graphical User Interface is specifically designed for users trained in the biological sciences. Conclusion GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference.
It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome.
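    The core idea of annotation transfer can be sketched in a few lines. GATU itself aligns reference genes against the target genome; the exact-match lookup and the feature names below are only illustrative.

```python
def transfer_annotations(reference_features, target_genome):
    """Map each (name, sequence) reference feature onto the target
    genome; exact string search stands in for real alignment."""
    transferred = []
    for name, seq in reference_features:
        pos = target_genome.find(seq)
        if pos != -1:  # feature found intact in the target
            transferred.append((name, pos, pos + len(seq)))
    return transferred

# Hypothetical reference features; this target carries only gene2.
reference = [("gene1", "ATGAAATAG"), ("gene2", "ATGCCCGGGTAA")]
target = "TTTATGCCCGGGTAACCC"
hits = transfer_annotations(reference, target)  # [("gene2", 3, 15)]
```

A real transfer utility would also handle point mutations, indels, and ORFs present only in the target, as the abstract describes.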

  20. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome.

    Science.gov (United States)

    Tcherepanov, Vasily; Ehlers, Angelika; Upton, Chris

    2006-06-13

    Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center http://www.biovirus.org and Viral Bioinformatics - Canada http://www.virology.ca, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task. GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly Graphical User Interface is specifically designed for users trained in the biological sciences. GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome. The program is freely

  1. Ontology Based Electricity Knowledge Management for Cross-Media Resources

    Institute of Scientific and Technical Information of China (English)

    金晶

    2015-01-01

    With the rapid growth of massive digital resources in electric power enterprises, managing digital resources in multiple media formats in a unified way and searching them quickly and accurately is a major difficulty for enterprise knowledge management. Most traditional search methods are based on keyword matching and return large amounts of information without considering semantic information or users' personalized characteristics, so they cannot provide accurate, personalized learning resources, leading to a serious waste of learning resources and manpower. To solve this problem, this paper uses machine-processable semantic metadata to describe various heterogeneous resources and proposes an ontology-based annotation method for cross-media electric power knowledge resources. The proposed method effectively supports linked retrieval across multiple knowledge points, enables the transformation and transfer of knowledge within electric power enterprises, and ultimately achieves knowledge sharing and reuse.

  2. Automated analysis and annotation of basketball video

    Science.gov (United States)

    Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.

    1997-01-01

    Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with a useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation remains hard with currently available techniques. However, a wide range of video has inherent structure, so some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from the MPEG stream with prior knowledge of basketball video structure to provide high-level content analysis, annotation and browsing for events such as wide-angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.

  3. Functional annotation and identification of candidate disease genes by computational analysis of normal tissue gene expression data.

    Directory of Open Access Journals (Sweden)

    Laura Miozzi

    Full Text Available BACKGROUND: High-throughput gene expression data can predict gene function through the "guilt by association" principle: coexpressed genes are likely to be functionally associated. METHODOLOGY/PRINCIPAL FINDINGS: We analyzed publicly available expression data on normal human tissues. The analysis is based on the integration of data obtained with two experimental platforms (microarrays and SAGE) and of various measures of dissimilarity between expression profiles. The building blocks of the procedure are the Ranked Coexpression Groups (RCG), small sets of tightly coexpressed genes which are analyzed in terms of functional annotation. Functionally characterized RCGs are selected by means of the majority rule and used to predict new functional annotations. Functionally characterized RCGs are enriched in groups of genes associated with similar phenotypes. We exploit this fact to find new candidate disease genes for many OMIM phenotypes of unknown molecular origin. CONCLUSIONS/SIGNIFICANCE: We predict new functional annotations for many human genes, showing that the integration of different data sets and coexpression measures significantly improves the scope of the results. Combining gene expression data, functional annotation and known phenotype-gene associations we provide candidate genes for several genetic diseases of unknown molecular basis.
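    The majority rule used to functionally characterize an RCG can be sketched as follows; the gene names, GO terms and 50% threshold are invented for illustration and are not the study's actual parameters.

```python
from collections import Counter

def majority_annotation(rcg_members, known_annotations, threshold=0.5):
    """Assign to an RCG any term carried by more than `threshold` of
    its functionally characterized members; uncharacterized members
    (e.g. geneX below) then inherit the group's predicted terms."""
    characterized = [g for g in rcg_members if g in known_annotations]
    counts = Counter(t for g in characterized for t in known_annotations[g])
    return {t for t, c in counts.items() if c / len(characterized) > threshold}

# Hypothetical RCG: four characterized genes and one unannotated gene.
known = {
    "geneA": {"GO:ribosome"},
    "geneB": {"GO:ribosome", "GO:translation"},
    "geneC": {"GO:ribosome"},
    "geneD": {"GO:mitochondrion"},
}
predicted = majority_annotation(["geneA", "geneB", "geneC", "geneD", "geneX"], known)
```

Here "GO:ribosome" is carried by 3 of 4 characterized members and becomes the RCG's predicted function.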

  4. Quantifying Variability of Manual Annotation in Cryo-Electron Tomograms.

    Science.gov (United States)

    Hecksel, Corey W; Darrow, Michele C; Dai, Wei; Galaz-Montoya, Jesús G; Chin, Jessica A; Mitchell, Patrick G; Chen, Shurui; Jakana, Jemba; Schmid, Michael F; Chiu, Wah

    2016-06-01

    Although acknowledged to be variable and subjective, manual annotation of cryo-electron tomography data is commonly used to answer structural questions and to create a "ground truth" for evaluation of automated segmentation algorithms. Validation of such annotation is lacking, but is critical for understanding the reproducibility of manual annotations. Here, we used voxel-based similarity scores for a variety of specimens, ranging in complexity and segmented by several annotators, to quantify the variation among their annotations. In addition, we have identified procedures for merging annotations to reduce variability, thereby increasing the reliability of manual annotation. Based on our analyses, we find that it is necessary to combine multiple manual annotations to increase the confidence level for answering structural questions. We also make recommendations to guide algorithm development for automated annotation of features of interest.
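    A common voxel-based similarity score for comparing two manual annotations is the Dice coefficient. The abstract does not name its exact metrics, so this is an assumed example with made-up voxel sets.

```python
def dice(a, b):
    """Dice coefficient between two binary annotations given as sets
    of annotated voxel coordinates (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

ann1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)}  # annotator 1
ann2 = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}             # annotator 2
score = dice(ann1, ann2)   # 2*2 / (4+3), about 0.57
merged = ann1 | ann2       # one simple way to merge annotations
```

Low pairwise scores across annotators are the kind of variability the study quantifies; merging (here a voxel-wise union) is one way to reduce it.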

  5. Critical Assessment of Function Annotation Meeting, 2011

    Energy Technology Data Exchange (ETDEWEB)

    Friedberg, Iddo

    2015-01-21

    The Critical Assessment of Function Annotation meeting was held July 14-15, 2011 at the Austria Conference Center in Vienna, Austria. There were 73 registered delegates at the meeting. We thank the DOE for this award. It helped us organize and support a scientific meeting AFP 2011 as a special interest group (SIG) meeting associated with the ISMB 2011 conference. The conference was held in Vienna, Austria, in July 2011. The AFP SIG was held on July 15-16, 2011 (immediately preceding the conference). The meeting consisted of two components, the first being a series of talks (invited and contributed) and discussion sections dedicated to protein function research, with an emphasis on the theory and practice of computational methods utilized in functional annotation. The second component provided a large-scale assessment of computational methods through participation in the Critical Assessment of Functional Annotation (CAFA).

  6. I2Cnet medical image annotation service.

    Science.gov (United States)

    Chronaki, C E; Zabulis, X; Orphanoudakis, S C

    1997-01-01

    I2Cnet (Image Indexing by Content network) aims to provide services related to the content-based management of images in healthcare over the World-Wide Web. Each I2Cnet server maintains an autonomous repository of medical images and related information. The annotation service of I2Cnet allows specialists to interact with the contents of the repository, adding comments or illustrations to medical images of interest. I2Cnet annotations may be communicated to other users via e-mail or posted to I2Cnet for inclusion in its local repositories. This paper discusses the annotation service of I2Cnet and argues that such services pave the way towards the evolution of active digital medical image libraries.

  7. Structural annotation of Mycobacterium tuberculosis proteome.

    Directory of Open Access Journals (Sweden)

    Praveen Anand

    Full Text Available Of the ∼4000 ORFs identified through the genome sequence of Mycobacterium tuberculosis (TB) H37Rv, experimentally determined structures are available for 312. Since knowledge of protein structures is essential to obtain a high-resolution understanding of the underlying biology, we seek to obtain a structural annotation for the genome, using computational methods. Structural models were obtained and validated for ∼2877 ORFs, covering ∼70% of the genome. Functional annotation of each protein was based on fold-based functional assignments and a novel binding-site-based ligand association. New algorithms for binding site detection and genome-scale binding site comparison at the structural level, recently reported from the laboratory, were utilized. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides an opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 folds. New insights about the folds that predominate in the genome, as well as the fold-combinations that make up multi-domain proteins, are also obtained. 1728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses. The resource (http://proline.physics.iisc.ernet.in/Tbstructuralannotation), being one of the first to be based on structure-derived functional annotations at a genome scale, is expected to be useful for better understanding of TB and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well.

  8. Annotating images by mining image search results.

    Science.gov (United States)

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes; the other is to implement the system as a distributed one, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with an unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
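    The mining step, identifying salient terms from the text around visually similar search results, can be approximated by scoring words by how many result descriptions mention them. The stopword list and sample descriptions are invented, and the paper's actual mining and rejection stages are more elaborate than this.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "at", "and", "on", "over"}

def salient_terms(descriptions, top_k=3):
    """Rank words by document frequency across the surrounding text of
    the search results; frequent words become candidate annotations."""
    doc_freq = Counter()
    for text in descriptions:
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS)
    return [w for w, _ in doc_freq.most_common(top_k)]

# Invented descriptions standing in for search-result surrounding text.
results = [
    "sunset at the beach in hawaii",
    "beautiful beach sunset",
    "sunset over the ocean beach",
]
terms = salient_terms(results)  # "sunset" and "beach" rank first
```

The rejection step would then filter candidates such as "hawaii" that are weakly supported across results.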

  9. Computable visually observed phenotype ontological framework for plants

    Directory of Open Access Journals (Sweden)

    Schaeffer Mary

    2011-06-01

    Full Text Available Abstract Background The ability to search for and precisely compare similar phenotypic appearances within and across species has vast potential in plant science and genetic research. The difficulty in doing so lies in the fact that many visual phenotypic data, especially visually observed phenotypes that often cannot be directly measured quantitatively, are in the form of text annotations, and these descriptions are plagued by semantic ambiguity, heterogeneity, and low granularity. Though several bio-ontologies have been developed to standardize phenotypic (and genotypic) information and permit comparisons across species, these semantic issues persist and prevent precise analysis and retrieval of information. A framework suitable for the modeling and analysis of precise computable representations of such phenotypic appearances is needed. Results We have developed a new framework called the Computable Visually Observed Phenotype Ontological Framework for plants. This work provides a novel quantitative view of descriptions of plant phenotypes that leverages existing bio-ontologies and utilizes a computational approach to capture and represent domain knowledge in a machine-interpretable form. This is accomplished by means of a robust and accurate semantic mapping module that automatically maps high-level semantics to low-level measurements computed from phenotype imagery. The framework was applied to two different plant species, with semantic rules mined and an ontology constructed. Rule quality was evaluated and showed high-quality rules for most semantics. This framework also facilitates automatic annotation of phenotype images and can be adopted by different plant communities to aid in their research. Conclusions The Computable Visually Observed Phenotype Ontological Framework for plants has been developed for more efficient and accurate management of visually observed phenotypes, which play a significant role in plant genomics research.

  10. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
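    The overlap queries these packages provide can be sketched with a binary search over ranges sorted by start position. GenomicRanges itself uses more sophisticated interval data structures (and is R, not Python), and the coordinates below are arbitrary.

```python
from bisect import bisect_right

def find_overlaps(query, subjects):
    """Return indices of subject ranges overlapping `query`.
    Ranges are closed (start, end) pairs, sorted by start."""
    qstart, qend = query
    starts = [s for s, _ in subjects]
    hi = bisect_right(starts, qend)  # ranges starting after qend cannot overlap
    return [i for i in range(hi) if subjects[i][1] >= qstart]

subjects = [(1, 5), (4, 10), (12, 20), (25, 30)]
hits = find_overlaps((8, 14), subjects)  # indices 1 and 2 overlap
```

The binary search prunes the right side cheaply; production implementations use interval trees or nested containment lists so that the remaining scan is also sublinear.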

  11. EHR-based phenotyping: Bulk learning and evaluation.

    Science.gov (United States)

    Chiu, Po-Hsiang; Hripcsak, George

    2017-06-01

    In data-driven phenotyping, a core computational task is to identify medical concepts and their variations from sources of electronic health records (EHR) to stratify phenotypic cohorts. A conventional analytic framework for phenotyping largely uses a manual knowledge engineering approach or a supervised learning approach where clinical cases are represented by variables encompassing diagnoses, medicinal treatments and laboratory tests, among others. In such a framework, tasks associated with feature engineering and data annotation remain a tedious and expensive exercise, resulting in poor scalability. In addition, certain clinical conditions, such as those that are rare and acute in nature, may never accumulate sufficient data over time, which poses a challenge to establishing accurate and informative statistical models. In this paper, we use infectious diseases as the domain of study to demonstrate a hierarchical learning method based on ensemble learning that attempts to address these issues through feature abstraction. We use a sparse annotation set to train and evaluate many phenotypes at once, which we call bulk learning. In this batch-phenotyping framework, disease cohort definitions can be learned from within the abstract feature space established by using multiple diseases as a substrate and diagnostic codes as surrogates. In particular, using surrogate labels for model training renders possible its subsequent evaluation using only a sparse annotated sample. Moreover, statistical models can be trained and evaluated, using the same sparse annotation, from within the abstract feature space of low dimensionality that encapsulates the shared clinical traits of these target diseases, collectively referred to as the bulk learning set. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Solar Tutorial and Annotation Resource (STAR)

    Science.gov (United States)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR" suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven

  13. Ranking Biomedical Annotations with Annotator’s Semantic Relevancy

    Directory of Open Access Journals (Sweden)

    Aihua Wu

    2014-01-01

    Full Text Available Biomedical annotation is a common and effective artifact for researchers to discuss, express opinions, and share discoveries. It is becoming increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users trying to get information efficiently. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate that knowledge, that is, the semantic relationship between annotator and entity, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and we merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.

  14. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Abstract Background A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which can be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain-text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between the quality of sequence annotation and the efficiency of communication and knowledge dissemination among researchers.
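    The template-driven UID-to-name conversion can be sketched as below. The UID pattern, annotation fields and template syntax are invented for illustration; SNAD's real templates draw such fields from cognate entries in external databases.

```python
import re

def rename_uids(text, annotations, template="{acronym}_{year}"):
    """Substitute every recognized UID in a Newick string (or plain
    list) with a name built from its annotation fields; UIDs with no
    annotation entry are left untouched."""
    def sub(match):
        fields = annotations.get(match.group(0))
        return template.format(**fields) if fields else match.group(0)
    return re.sub(r"[A-Z]{2}\d{6}", sub, text)

# Hypothetical annotation records keyed by accession-style UIDs.
annotations = {
    "AY123456": {"acronym": "SARS-CoV", "year": "2003"},
    "AB654321": {"acronym": "HCoV-229E", "year": "1966"},
}
tree = "(AY123456:0.1,AB654321:0.2);"
renamed = rename_uids(tree, annotations)
```

Swapping the template string is enough to change the naming scheme for a whole tree, which is the point of template-directed conversion.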

  15. Mulligan Concept manual therapy: standardizing annotation.

    Science.gov (United States)

    McDowell, Jillian Marie; Johnson, Gillian Margaret; Hetherington, Barbara Helen

    2014-10-01

    Quality technique documentation is integral to the practice of manual therapy, ensuring uniform application and reproducibility of treatment. Manual therapy techniques are described by annotations utilizing a range of acronyms, abbreviations and universal terminology based on biomechanical and anatomical concepts. The various combinations of therapist- and patient-generated forces utilized in a variety of weight-bearing positions, which are synonymous with the Mulligan Concept, challenge practitioners' existing annotation skills. An annotation framework with recording rules adapted to the Mulligan Concept is proposed, in which the abbreviations incorporate established manual therapy tenets and are detailed in the following sequence: starting position, side, joint/s, method of application, glide/s, Mulligan technique, movement (or function), whether an assistant is used, overpressure (and by whom), and number of repetitions or time and sets. Therapist or patient application of overpressure and utilization of treatment belts or manual techniques must be recorded to capture the complete description. The adoption of the Mulligan Concept annotation framework for documentation purposes will provide uniformity and clarity of information transfer for the future purposes of teaching, clinical practice and audit for its practitioners.

  16. Nutrition & Adolescent Pregnancy: A Selected Annotated Bibliography.

    Science.gov (United States)

    National Agricultural Library (USDA), Washington, DC.

    This annotated bibliography on nutrition and adolescent pregnancy is intended to be a source of technical assistance for nurses, nutritionists, physicians, educators, social workers, and other personnel concerned with improving the health of teenage mothers and their babies. It is divided into two major sections. The first section lists selected…

  17. Suggested Books for Children: An Annotated Bibliography

    Science.gov (United States)

    NHSA Dialog, 2008

    2008-01-01

    This article provides an annotated bibliography of various children's books. It includes listings of books that illustrate the dynamic relationships within the natural environment, economic context, racial and cultural identities, cross-group similarities and differences, gender, different abilities and stories of injustice and resistance.

  18. Genotyping and annotation of Affymetrix SNP arrays

    DEFF Research Database (Denmark)

    Lamy, Philippe; Andersen, Claus Lindbjerg; Wikman, Friedrik;

    2006-01-01

    allows us to annotate SNPs that have poor performance, either because of poor experimental conditions or because for one of the alleles the probes do not behave in a dose-response manner. Generally, our method agrees well with a method developed by Affymetrix. When both methods make a call they agree...

  19. Annotated Bibliography of EDGE2D Use

    Energy Technology Data Exchange (ETDEWEB)

    J.D. Strachan and G. Corrigan

    2005-06-24

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables.

  20. Studies of Scientific Disciplines. An Annotated Bibliography.

    Science.gov (United States)

    Weisz, Diane; Kruytbosch, Carlos

    Provided in this bibliography are annotated lists of social studies of science literature, arranged alphabetically by author in 13 disciplinary areas. These areas include astronomy; general biology; biochemistry and molecular biology; biomedicine; chemistry; earth and space sciences; economics; engineering; mathematics; physics; political science;…

  1. Communication and Sexuality: An Annotated Bibliography.

    Science.gov (United States)

    Buley, Jerry, Comp.; And Others

    The entries in this annotated bibliography represent books, educational journals, dissertations, popular magazines, and research studies that deal with the topic of communication and sexuality. Arranged alphabetically by author and also indexed according to subject matter, the titles span a variety of topics, including the following: sex and…

  2. La Mujer Chicana: An Annotated Bibliography, 1976.

    Science.gov (United States)

    Chapa, Evey, Ed.; And Others

    Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and…

  3. College Students in Transition: An Annotated Bibliography

    Science.gov (United States)

    Foote, Stephanie M., Ed.; Hinkle, Sara M., Ed.; Kranzow, Jeannine, Ed.; Pistilli, Matthew D., Ed.; Miles, LaTonya Rease, Ed.; Simmons, Jannell G., Ed.

    2013-01-01

    The transition from high school to college is an important milestone, but it is only one of many steps in the journey through higher education. This volume is an annotated bibliography of the emerging literature examining the many other transitions students make beyond the first year, including the sophomore year, the transfer experience, and the…

  4. Adolescent Reproductive Behaviour: An Annotated Bibliography.

    Science.gov (United States)

    United Nations, New York, NY. Population Div.

    A general overview of the literature on adolescent fertility and closely related issues is provided in this annotated bibliography. Material on the following topics is included: (1) programs related to adolescent pregnancy, contraception, abortion, and births; (2) studies relating socioeconomic characteristics of pregnant adolescents to their…

  5. Reflective Annotations: On Becoming a Scholar

    Science.gov (United States)

    Alexander, Mark; Taylor, Caroline; Greenberger, Scott; Watts, Margie; Balch, Riann

    2012-01-01

    This article presents the authors' reflective annotations on becoming a scholar. This paper begins with a discussion on socialization for teaching, followed by a discussion on socialization for service and sense of belonging. Then, it describes how the doctoral process evolves. Finally, it talks about adult learners who pursue doctoral education.

  6. Communication and Politics: A Selected, Annotated Bibliography.

    Science.gov (United States)

    Kaid, Lynda Lee; And Others

    Noting that the study of communication in political settings is an increasingly popular and important area of teaching and research in many disciplines, this 51-item annotated bibliography reflects the interdisciplinary nature of the field and is designed to incorporate varying approaches to the subject matter. With few exceptions, the books and…

  7. The Basic Course: A Selected, Annotated Bibliography.

    Science.gov (United States)

    Demo, Penny

    Defining basic speech communication courses as those public speaking, interpersonal, or communication courses that treat fundamental communication concepts, this annotated bibliography reflects the current thought of speech educators on the basic course. The bibliography consists of 27 citations, all of which are drawn from the ERIC database. (SKC)

  8. Greeks in Canada (an Annotated Bibliography).

    Science.gov (United States)

    Bombas, Leonidas C.

    This bibliography on Greeks in Canada includes annotated references to both published and (mostly) unpublished works. Among the 70 entries (arranged in alphabetical order by author) are articles, reports, papers, and theses that deal either exclusively with or include a separate section on Greeks in the various Canadian provinces. (GC)

  9. Annotated Bibliography of English for Special Purposes.

    Science.gov (United States)

    Allix, Beverley, Comp.

    This annotated bibliography covers the following types of materials of use to teachers of English for Special Purposes: (1) books, monographs, reports, and conference papers; (2) periodical articles and essays in collections; (3) theses and dissertations; (4) bibliographies; (5) dictionaries; and (6) textbooks in series by publisher. Section (1)…

  10. La Mujer Chicana: An Annotated Bibliography, 1976.

    Science.gov (United States)

    Chapa, Evey, Ed.; And Others

    Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and "el movimiento";…

  11. Learning to search for images without annotations

    NARCIS (Netherlands)

    Kordumova, S.

    2016-01-01

    Humans are attuned to their environment and can easily recognize what they see around them or in images. Machines, however, cannot recognize images unless trained to do so. The usual approach is to annotate images with what they capture and train a machine learning algorithm. This thesis focuses on a

  12. Political Campaign Debating: A Selected, Annotated Bibliography.

    Science.gov (United States)

    Ritter, Kurt; Hellweg, Susan A.

    Noting that television debates have become a regular feature of the media politics by which candidates seek office, this annotated bibliography is particularly intended to assist teachers and researchers of debate, argumentation, and political communication. The 40 citations are limited to the television era of American politics and categorized as…

  13. A Partially Annotated Political Communication Bibliography.

    Science.gov (United States)

    Thornton, Barbara C.

    This 63-page annotated bibliography contains available materials in the area of political communication, a relatively new field of political science. Political communication includes facets of the election process and interaction between political parties and the voter. A variety of materials dating from 1960 to 1972 include books, pamphlets,…

  14. Bibliografia de Aztlan: An Annotated Chicano Bibliography.

    Science.gov (United States)

    Barrios, Ernie, Ed.

    More than 300 books and articles published from 1920 to 1971 are reviewed in this annotated bibliography of literature on the Chicano. The citations and reviews are categorized by subject area and deal with contemporary Chicano history, education, health, history of Mexico, literature, native Americans, philosophy, political science, pre-Columbian…

  15. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.
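    The core idea in this record — collapsing WordNet's tens of thousands of synsets into a manageable set of coarse concept classes and tagging words with them — can be sketched in a few lines. The mini-lexicon below is illustrative toy data, not real WordNet content or the authors' actual platform; each word maps to synsets ordered by frequency, and each synset carries a coarse class label.

    ```python
    # Toy sketch of coarse ontological annotation: tag each word with the
    # coarse concept class of its most frequent synset. MINI_WORDNET is
    # invented illustrative data, not the real WordNet database.
    MINI_WORDNET = {
        "analyst":  [("analyst.n.01", "person"), ("analyst.n.02", "person")],
        "evidence": [("evidence.n.01", "cognition"), ("evidence.v.01", "communication")],
        "bank":     [("bank.n.01", "group"), ("bank.n.09", "artifact")],
    }

    def annotate(words):
        """Return (word, coarse_class) pairs; None when the lexicon has no entry."""
        return [(w, MINI_WORDNET[w][0][1] if w in MINI_WORDNET else None)
                for w in words]

    print(annotate(["analyst", "evidence", "unknownword"]))
    # → [('analyst', 'person'), ('evidence', 'cognition'), ('unknownword', None)]
    ```

    A real system would replace the dictionary with WordNet itself (e.g. NLTK's `wordnet` corpus reader) and a word sense disambiguation step instead of the naive most-frequent-sense choice.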

  16. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  17. Ludwig von Mises: An Annotated Bibliography.

    Science.gov (United States)

    Gordon, David

    A 117-item annotated bibliography of books, articles, essays, lectures, and reviews by economist Ludwig von Mises is presented. The bibliography is arranged chronologically, and is followed by an alphabetical listing of the citations, excluding books. An index and information on the Ludwig von Mises Institute at Auburn University (Alabama) are…

  19. Small Group Communication: An Annotated Bibliography.

    Science.gov (United States)

    Gouran, Dennis S.; Guadagnino, Christopher S.

    This annotated bibliography includes sources of information that are primarily concerned with problem solving, decision making, and processes of social influence in small groups, and secondarily deal with other aspects of communication and interaction in groups, such as conflict management and negotiation. The 57 entries, all dating from 1980…

  20. Structuring and presenting annotated media repositories

    NARCIS (Netherlands)

    Rutledge, L.; Ossenbruggen, J.R. van; Hardman, L.

    2004-01-01

    The Semantic Web envisions a Web that is both human readable and machine processible. In practice, however, there is still a large conceptual gap between annotated content repositories on the one hand, and coherent, human readable Web pages on the other. To bridge this conceptual gap, one needs to s

  1. Skin Cancer Education Materials: Selected Annotations.

    Science.gov (United States)

    National Cancer Inst. (NIH), Bethesda, MD.

    This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…

  2. Book Reviews, Annotation, and Web Technology.

    Science.gov (United States)

    Schulze, Patricia

    From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…

  4. Annotated Bibliography of Literature on Narcotic Addiction.

    Science.gov (United States)

    Bowden, R. Renee

    Nearly 150 abstracts have been included in this annotated bibliography; its purpose has been to scan the voluminous number of documents on the problem of drug addiction in order to summarize the present state of knowledge on narcotic addiction and on methods for its treatment and control. The literature reviewed has been divided into the following…

  5. DNAVis: interactive visualization of comparative genome annotations

    NARCIS (Netherlands)

    Fiers, M.W.E.J.; Wetering, van de H.; Peeters, T.H.J.M.; Wijk, van J.J.; Nap, J.P.H.

    2006-01-01

    The software package DNAVis offers a fast, interactive and real-time visualization of DNA sequences and their comparative genome annotations. DNAVis implements advanced methods of information visualization such as linked views, perspective walls and semantic zooming, in addition to the display of he

  6. An Annotated Bibliography in Financial Therapy

    Directory of Open Access Journals (Sweden)

    Dorothy B. Durband

    2010-10-01

    Full Text Available The following annotated bibliography contains a summary of articles and websites, as well as a list of books related to financial therapy. The resources were compiled through e-mail solicitation from members of the Financial Therapy Forum in November 2008. Members of the forum are marked with an asterisk.

  7. An Annotated Bibliography of Nonsexist Resources.

    Science.gov (United States)

    Miles Coll., Eutaw, AL. West Alabama Curriculum and Materials Resource Center.

    The result of a thorough search, review, and compilation of resources on women's equity, the annotated bibliography represents a sample of print materials, games and kits, photos and posters, and audiovisual aids now available on sexism that should prove useful to counselors, instructors, school administrators, parents, and elementary and…

  9. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  10. Annotation Method (AM): SE22_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available …together with predicted molecular formulae and putative structures, were provided as metabolite annotations. Comparison with public databases was performed. A grading system was introduced to describe the evidence supporting the annotations. …

  11. Computer systems for annotation of single molecule fragments

    Science.gov (United States)

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  12. Evaluation of probabilistic and logical inference for a SNP annotation system.

    Science.gov (United States)

    Shen, Terry H; Tarczy-Hornoch, Peter; Detwiler, Landon T; Cadag, Eithon; Carlson, Christopher S

    2010-06-01

    Genome wide association studies (GWAS) are an important approach to understanding the genetic mechanisms behind human diseases. Single nucleotide polymorphisms (SNPs) are the predominant markers used in genome wide association studies, and the ability to predict which SNPs are likely to be functional is important for both a priori and a posteriori analyses of GWA studies. This article describes the design, implementation and evaluation of a family of systems for the purpose of identifying SNPs that may cause a change in phenotypic outcomes. The methods described in this article characterize the feasibility of combinations of logical and probabilistic inference with federated data integration for both point and regional SNP annotation and analysis. Evaluations of the methods demonstrate the overall strong predictive value of logical, and logical with probabilistic, inference applied to the domain of SNP annotation.
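    The combination this record evaluates — deterministic logical rules layered over probabilistic scoring for SNP functional annotation — can be illustrated with a toy classifier. All field names, rules, and thresholds below are invented for illustration; they are not the authors' actual system or any real annotation pipeline.

    ```python
    # Toy sketch: logical rules fire first and give hard calls; SNPs that
    # no rule covers fall through to a naive probabilistic score built
    # from soft evidence. All fields and weights are illustrative.
    def classify_snp(snp):
        # Logical layer: deterministic rules.
        if snp["region"] == "coding" and snp["effect"] == "nonsynonymous":
            return ("likely-functional", 1.0)
        if snp["region"] == "intergenic":
            return ("likely-nonfunctional", 0.0)
        # Probabilistic layer: weighted combination of soft evidence.
        score = 0.5 * snp.get("conservation", 0.0) \
              + 0.5 * snp.get("regulatory_signal", 0.0)
        label = "likely-functional" if score >= 0.5 else "likely-nonfunctional"
        return (label, score)

    print(classify_snp({"region": "coding", "effect": "nonsynonymous"}))
    # → ('likely-functional', 1.0)
    print(classify_snp({"region": "intronic", "effect": "none",
                        "conservation": 0.9, "regulatory_signal": 0.4}))
    # → ('likely-functional', 0.65)
    ```

    The point of the layering is the one the evaluation makes: rules alone leave many SNPs uncalled, while the probabilistic layer extends coverage at the cost of calibrated uncertainty.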

  13. On Semantic Annotation in Clarin-PL Parallel Corpora

    OpenAIRE

    Violetta Koseska-Toszewa; Roman Roszko

    2015-01-01

    On Semantic Annotation in Clarin-PL Parallel Corpora In the article, the authors present a proposal for semantic annotation in Clarin-PL parallel corpora: Polish-Bulgarian-Russian and Polish-Lithuanian ones. Semantic annotation of quantification is a novum in developing sentence level semantics in multilingual parallel corpora. This is why our semantic annotation is manual. The authors hope it will be interesting to IT specialists working on automatic processing of the given natural langu...

  14. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
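    The two operations BEACON performs — comparing gene-function annotations from different annotation methods, and combining them into an extended annotation that fills genes one method left unannotated — can be sketched over simple gene-to-function maps. This is an illustrative sketch of the idea only, not BEACON's actual algorithm or data model; gene and function names are invented.

    ```python
    # Sketch of annotation comparison/combination: given gene->function
    # maps from two annotation methods (AMs), report agreements,
    # conflicts, and genes covered by only one AM, and build an
    # "extended" annotation from their union. Illustrative only.
    def compare_and_extend(am1, am2):
        report = {"agree": [], "conflict": [], "only_one": []}
        extended = {}
        for gene in sorted(set(am1) | set(am2)):
            f1, f2 = am1.get(gene), am2.get(gene)
            if f1 and f2:
                report["agree" if f1 == f2 else "conflict"].append(gene)
                extended[gene] = f1  # arbitrary tie-break on conflict
            else:
                report["only_one"].append(gene)
                extended[gene] = f1 or f2
        return report, extended

    am1 = {"geneA": "kinase", "geneB": "transporter"}
    am2 = {"geneA": "kinase", "geneC": "hydrolase"}
    report, extended = compare_and_extend(am1, am2)
    print(report)    # {'agree': ['geneA'], 'conflict': [], 'only_one': ['geneB', 'geneC']}
    print(extended)  # {'geneA': 'kinase', 'geneB': 'transporter', 'geneC': 'hydrolase'}
    ```

    The union step is what lets the extended annotation assign putative functions to genes a single method missed, the effect the record quantifies at up to 27 %.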

  15. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  16. AnnaBot: A Static Verifier for Java Annotation Usage

    OpenAIRE

    Ian Darwin

    2010-01-01

    This paper describes AnnaBot, one of the first tools to verify correct use of Annotation-based metadata in the Java programming language. These Annotations are a standard Java 5 mechanism used to attach metadata to types, methods, or fields without using an external configuration file. A binary representation of the Annotation becomes part of the compiled “.class” file, for inspection by another component or library at runtime. Java Annotations were introduced into the Java language in ...

  17. A SANE approach to annotation in the digital edition

    NARCIS (Netherlands)

    Boot, P.; Braungart, Georg; Jannidis, Fotis; Gendolla, Peter

    2007-01-01

    Robinson and others have recently called for dynamic and collaborative digital scholarly editions. Annotation is a key component for editions that are not merely passive, read-only repositories of knowledge. Annotation facilities (both annotation creation and display), however, require complex

  18. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.;

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced...

  19. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements...

  20. Interoperable Multimedia Annotation and Retrieval for the Tourism Sector

    NARCIS (Netherlands)

    Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.

    2015-01-01

    The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies. Interoperabilit

  1. Model and Interoperability using Meta Data Annotations

    Science.gov (United States)

    David, O.

    2011-12-01

    Software frameworks and architectures are in need of meta data to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format, consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing meta data, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings, however the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach for meta data representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programming language features known as Annotations or Attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interfaces (API) calls is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to its originating code. Since models and
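    OMS3 itself attaches this metadata with Java annotations; as a rough, hedged analogue, a Python decorator can attach the same kind of declarative component metadata without any framework API calls. The attribute name `__model_meta__` and the metadata keys below are invented for illustration, not part of OMS.

    ```python
    # Rough Python analogue of annotation-based model metadata (OMS3 uses
    # Java annotations): declarative metadata is attached to a component
    # class rather than supplied through framework API calls.
    def metadata(**meta):
        def wrap(cls):
            cls.__model_meta__ = meta  # illustrative attribute name
            return cls
        return wrap

    @metadata(description="Single-process runoff component",
              inputs={"precip": "mm/day"}, outputs={"runoff": "mm/day"})
    class Runoff:
        def execute(self, precip):
            return 0.3 * precip  # toy process, not a real runoff model

    # A framework can assemble, document, or test components from the
    # metadata alone, without the component depending on the framework:
    print(Runoff.__model_meta__["inputs"])   # {'precip': 'mm/day'}
    ```

    This mirrors the "non-invasive" property the abstract emphasizes: the component carries its own description, and only the framework needs to know how to read it.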

  2. Evolutionary interrogation of human biology in well-annotated genomic framework of rhesus macaque.

    Science.gov (United States)

    Zhang, Shi-Jian; Liu, Chu-Jun; Yu, Peng; Zhong, Xiaoming; Chen, Jia-Yu; Yang, Xinzhuang; Peng, Jiguang; Yan, Shouyu; Wang, Chenqu; Zhu, Xiaotong; Xiong, Jingwei; Zhang, Yong E; Tan, Bertrand Chin-Ming; Li, Chuan-Yun

    2014-05-01

    With a genome sequence and composition highly analogous to those of human, the rhesus macaque represents a unique reference for evolutionary studies of human biology. Here, we developed a comprehensive genomic framework of rhesus macaque, RhesusBase2, for evolutionary interrogation of human genes and their associated regulation. A total of 1,667 next-generation sequencing (NGS) data sets were processed, integrated, and evaluated, generating 51.2 million new functional annotation records. With these extensive NGS annotations, RhesusBase2 refined the fine-scale structures of 30% of the macaque Ensembl transcripts, yielding an accurate, up-to-date set of macaque gene models. On the basis of these annotations and accurate macaque gene models, we further developed an NGS-oriented Molecular Evolution Gateway to access and visualize macaque annotations in reference to human orthologous genes and associated regulation (www.rhesusbase.org/molEvo). We highlight the application of this well-annotated genomic framework in generating hypothetical links between human-biased regulations and human-specific traits, using the mechanistic characterization of the DIEXF gene as an example that provides novel clues to the understanding of digestive-system reduction in human evolution. On a global scale, we also identified a catalog of 9,295 human-biased regulatory events, which may represent novel elements that have had a substantial impact on shaping the human transcriptome and possibly underpin recent human phenotypic evolution. Taken together, we provide an NGS data-driven, information-rich framework that will broadly benefit genomics research in general and serve as an important resource for in-depth evolutionary studies of human biology.

  3. dcGOR: an R package for analysing ontologies and protein domain annotations.

    Directory of Open Access Journals (Sweden)

    Hai Fang

    2014-10-01

    Full Text Available I introduce an open-source R package 'dcGOR' to provide the bioinformatics community with the ease to analyse ontologies and protein domain annotations, particularly those in the dcGO database. The dcGO is a comprehensive resource for protein domain annotations using a panel of ontologies including Gene Ontology. Although increasing in popularity, this database needs statistical and graphical support to meet its full potential. Moreover, there are no bioinformatics tools specifically designed for domain ontology analysis. As an add-on package built in the R software environment, dcGOR offers a basic infrastructure with great flexibility and functionality. It implements new data structures to represent domains, ontologies, annotations, and all analytical outputs as well. For each ontology, it provides various mining facilities, including: (i) domain-based enrichment analysis and visualisation; (ii) construction of a domain (semantic similarity) network according to ontology annotations; and (iii) significance analysis for estimating a contact (statistical significance) network. To reduce runtime, most analyses support high-performance parallel computing. Taking as inputs a list of protein domains of interest, the package is able to easily carry out in-depth analyses in terms of functional, phenotypic and disease relevance, and network-level understanding. More importantly, dcGOR is designed to allow users to import and analyse their own ontologies and annotations on domains (taken from SCOP, Pfam and InterPro) and RNAs (from Rfam) as well. The package is freely available at CRAN for easy installation, and also at GitHub for version control. The dedicated website with reproducible demos can be found at http://supfam.org/dcGOR.

  4. dcGOR: an R package for analysing ontologies and protein domain annotations.

    Science.gov (United States)

    Fang, Hai

    2014-10-01

    I introduce an open-source R package 'dcGOR' to provide the bioinformatics community with the ease to analyse ontologies and protein domain annotations, particularly those in the dcGO database. The dcGO is a comprehensive resource for protein domain annotations using a panel of ontologies including Gene Ontology. Although increasing in popularity, this database needs statistical and graphical support to meet its full potential. Moreover, there are no bioinformatics tools specifically designed for domain ontology analysis. As an add-on package built in the R software environment, dcGOR offers a basic infrastructure with great flexibility and functionality. It implements new data structures to represent domains, ontologies, annotations, and all analytical outputs as well. For each ontology, it provides various mining facilities, including: (i) domain-based enrichment analysis and visualisation; (ii) construction of a domain (semantic similarity) network according to ontology annotations; and (iii) significance analysis for estimating a contact (statistical significance) network. To reduce runtime, most analyses support high-performance parallel computing. Taking as inputs a list of protein domains of interest, the package is able to easily carry out in-depth analyses in terms of functional, phenotypic and disease relevance, and network-level understanding. More importantly, dcGOR is designed to allow users to import and analyse their own ontologies and annotations on domains (taken from SCOP, Pfam and InterPro) and RNAs (from Rfam) as well. The package is freely available at CRAN for easy installation, and also at GitHub for version control. The dedicated website with reproducible demos can be found at http://supfam.org/dcGOR.
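    dcGOR is an R package, so the following is only a language-neutral illustration of the kind of domain-based enrichment test such tools perform: a hypergeometric over-representation p-value for an ontology term among a list of domains. The function name and all numbers are invented for this sketch.

```python
# Hypergeometric over-representation test, the standard calculation
# behind ontology enrichment analyses (sketch; not dcGOR's actual code).
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when drawing n domains from a background of N,
    of which K are annotated to the term of interest."""
    return sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    ) / comb(N, n)

# Toy example: 8 of 10 domains in our list carry a term that annotates
# only 20 of the 1000 domains in the background set.
p = hypergeom_enrichment_p(k=8, n=10, K=20, N=1000)
```

    A tiny p-value like this one is what flags the term as enriched; a real tool would also correct for multiple testing across all ontology terms.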

  5. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search

    Science.gov (United States)

    Rappaport, Noa; Twik, Michal; Plaschkes, Inbar; Nudel, Ron; Iny Stein, Tsippi; Levitt, Jacob; Gershoni, Moran; Morrey, C. Paul; Safran, Marilyn; Lancet, Doron

    2017-01-01

    The MalaCards human disease database (http://www.malacards.org/) is an integrated compendium of annotated diseases mined from 68 data sources. MalaCards has a web card for each of ∼20 000 disease entries, in six global categories. It portrays a broad array of annotation topics in 15 sections, including Summaries, Symptoms, Anatomical Context, Drugs, Genetic Tests, Variations and Publications. The Aliases and Classifications section reflects an algorithm for disease name integration across often-conflicting sources, providing effective annotation consolidation. A central feature is a balanced Genes section, with scores reflecting the strength of disease-gene associations. This is accompanied by other gene-related disease information such as pathways, mouse phenotypes and GO-terms, stemming from MalaCards’ affiliation with the GeneCards Suite of databases. MalaCards’ capacity to inter-link information from complementary sources, along with its elaborate search function, relational database infrastructure and convenient data dumps, allows it to tackle its rich disease annotation landscape, and facilitates systems analyses and genome sequence interpretation. MalaCards adopts a ‘flat’ disease-card approach, but each card is mapped to popular hierarchical ontologies (e.g. International Classification of Diseases, Human Phenotype Ontology and Unified Medical Language System) and also contains information about multi-level relations among diseases, thereby providing an optimal tool for disease representation and scrutiny. PMID:27899610

  6. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search.

    Science.gov (United States)

    Rappaport, Noa; Twik, Michal; Plaschkes, Inbar; Nudel, Ron; Iny Stein, Tsippi; Levitt, Jacob; Gershoni, Moran; Morrey, C Paul; Safran, Marilyn; Lancet, Doron

    2017-01-04

    The MalaCards human disease database (http://www.malacards.org/) is an integrated compendium of annotated diseases mined from 68 data sources. MalaCards has a web card for each of ∼20 000 disease entries, in six global categories. It portrays a broad array of annotation topics in 15 sections, including Summaries, Symptoms, Anatomical Context, Drugs, Genetic Tests, Variations and Publications. The Aliases and Classifications section reflects an algorithm for disease name integration across often-conflicting sources, providing effective annotation consolidation. A central feature is a balanced Genes section, with scores reflecting the strength of disease-gene associations. This is accompanied by other gene-related disease information such as pathways, mouse phenotypes and GO-terms, stemming from MalaCards' affiliation with the GeneCards Suite of databases. MalaCards' capacity to inter-link information from complementary sources, along with its elaborate search function, relational database infrastructure and convenient data dumps, allows it to tackle its rich disease annotation landscape, and facilitates systems analyses and genome sequence interpretation. MalaCards adopts a 'flat' disease-card approach, but each card is mapped to popular hierarchical ontologies (e.g. International Classification of Diseases, Human Phenotype Ontology and Unified Medical Language System) and also contains information about multi-level relations among diseases, thereby providing an optimal tool for disease representation and scrutiny. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics.

    Science.gov (United States)

    Hrabě de Angelis, Martin; Nicholson, George; Selloum, Mohammed; White, Jacqueline K; Morgan, Hugh; Ramirez-Solis, Ramiro; Sorg, Tania; Wells, Sara; Fuchs, Helmut; Fray, Martin; Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Michael R; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; Fertak, Lahcen El; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl M J; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; 
Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Edward; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Wattenhofer-Donze, Marie; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie; Holmes, Chris; Steel, Karen P; Herault, Yann; Gailus-Durner, Valérie; Mallon, Ann-Marie; Brown, Steve D M

    2015-09-01

    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse embryonic stem cell knockout resource provides a basis for the characterization of relationships between genes and phenotypes. The EUMODIC consortium developed and validated robust methodologies for the broad-based phenotyping of knockouts through a pipeline comprising 20 disease-oriented platforms. We developed new statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no previous functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. New phenotypes were uncovered for many genes with previously unknown function, providing a powerful basis for hypothesis generation and further investigation in diverse systems.

  8. Automatic Extraction of Tagset Mappings from Parallel-Annotated Corpora

    CERN Document Server

    Hughes, J; Atwell, E; Hughes, John; Souter, Clive; Atwell, Eric

    1995-01-01

    This paper describes some of the recent work of project AMALGAM (automatic mapping among lexico-grammatical annotation models). We are investigating ways to map between the leading corpus annotation schemes in order to improve their reusability. Collation of all the included corpora into a single large annotated corpus will allow a more detailed language model to be developed for tasks such as speech and handwriting recognition. In particular, we focus here on a method of extracting mappings from corpora that have been annotated according to more than one annotation scheme.
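    The core mapping-extraction idea, counting tag co-occurrences over the same tokens annotated under two schemes and keeping the most frequent correspondence, can be sketched as follows. The toy tagsets below are invented; AMALGAM's actual procedure is more elaborate than a simple argmax.

```python
# Extract a tagset mapping from a parallel-annotated corpus by
# co-occurrence counting (simplified sketch of the general idea).
from collections import Counter, defaultdict

def extract_mapping(tags_a, tags_b):
    """Map each tag of scheme A to its most frequent scheme-B counterpart."""
    cooc = defaultdict(Counter)
    for a, b in zip(tags_a, tags_b):
        cooc[a][b] += 1
    return {a: counts.most_common(1)[0][0] for a, counts in cooc.items()}

# The same five tokens tagged under two hypothetical schemes:
scheme1 = ["NN", "VB", "NN", "JJ", "VB"]
scheme2 = ["NOUN", "VERB", "NOUN", "ADJ", "VERB"]
mapping = extract_mapping(scheme1, scheme2)
```

    Real tagsets map many-to-many, so the co-occurrence table itself (not just the argmax) is what a full mapping scheme would preserve.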

  9. How well are protein structures annotated in secondary databases?

    Science.gov (United States)

    Rother, Kristian; Michalsky, Elke; Leser, Ulf

    2005-09-01

    We investigated to what extent Protein Data Bank (PDB) entries are annotated with second-party information based on existing cross-references between PDB and 15 other databases. We report 2 interesting findings. First, there is a clear "annotation gap" for structures less than 7 years old for secondary databases that are manually curated. Second, the examined databases overlap with each other quite well, dividing the PDB into 2 well-annotated thirds and one poorly annotated third. Both observations should be taken into account in any study depending on the selection of protein structures by their annotation.

  10. All SNPs are not created equal: genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs

    DEFF Research Database (Denmark)

    Schork, Andrew J; Thompson, Wesley K; Pham, Phillip;

    2013-01-01

    … (TDR = 1-FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment …

  11. Image Semantic Automatic Annotation by Relevance Feedback

    Institute of Scientific and Technical Information of China (English)

    ZHANG Tong-zhen; SHEN Rui-min

    2007-01-01

    A large semantic gap exists between content-based image retrieval (CBIR) and high-level semantics, so additional semantic information should be attached to the images. This involves three aspects: the semantic representation model, semantic information building, and semantic retrieval techniques. In this paper, we introduce an associated semantic network and an automatic semantic annotation system. In the system, a semantic network model is employed as the semantic representation model; it uses semantic keywords, a linguistic ontology, and low-level features in semantic similarity calculation. Through several rounds of users' relevance feedback, the semantic network is enriched automatically. To speed up the growth of the semantic network and obtain balanced annotation, semantic seeds and semantic loners are employed.

  12. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra- and intergenomic regions. The presence of multiple coding regions complicates the concept of the Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping reading frames … We may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We …

  13. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    … and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and of using venom to neutralize it. The exploration of the genetics behind these properties … japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts: genome annotation, small RNA analysis, and gene expression analysis. Lotus is a legume of significant … has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user-friendly, and broadly applicable …

  14. Exploiting Social Annotation for Automatic Resource Discovery

    CERN Document Server

    Plangprasopchok, Anon

    2007-01-01

    Information integration applications, such as mediators or mashups, that require access to information resources currently rely on users manually discovering and integrating them in the application. Manual resource discovery is a slow process, requiring the user to sift through results obtained via keyword-based search. Although search methods have advanced to include evidence from document contents, their metadata, and the contents and link structure of the referring pages, they still do not adequately cover information sources (often called "the hidden Web") that dynamically generate documents in response to a query. The recently popular social bookmarking sites, which allow users to annotate and share metadata about various information sources, provide rich evidence for resource discovery. In this paper, we describe a probabilistic model of the user annotation process in the social bookmarking system del.icio.us. We then use the model to automatically find resources relevant to a particular information dom...

  15. Cognition inspired framework for indoor scene annotation

    Science.gov (United States)

    Ye, Zhipeng; Liu, Peng; Zhao, Wei; Tang, Xianglong

    2015-09-01

    We present a simple yet effective scene annotation framework based on a combination of bag-of-visual-words (BoVW), three-dimensional scene structure estimation, scene context, and cognitive theory. From a macro perspective, the proposed cognition-based hybrid-motivation framework divides the annotation problem into empirical inference and real-time classification. Inspired by the inference ability of human beings, common objects of indoor scenes are defined for experience-based inference, while in the real-time classification stage, an improved BoVW-based multilayer abstract semantics labeling method is proposed that introduces abstract semantic hierarchies to narrow the semantic gap and improve the performance of object categorization. The proposed framework was evaluated on a variety of common data sets, and the experimental results demonstrate its effectiveness.

  16. A Novel Technique to Image Annotation using Neural Network

    Directory of Open Access Journals (Sweden)

    Pankaj Savita

    2013-03-01

    Full Text Available Automatic annotation of digital pictures is a key technology for managing and retrieving images from large image collections. Traditional image semantics extraction and representation schemes fall into two categories: visual features and text annotations. However, visual features are difficult to extract and are often semantically inconsistent. On the other hand, image semantics can be well represented by text annotations, and it is also easier to retrieve images according to their annotations. Traditional image annotation techniques are time-consuming and require a great deal of human effort. In this paper, we propose a novel neural-network-based approach to the problem of image annotation, applied to an image data set. Our work focuses on image annotation using a multilayer perceptron (MLP), which helps to discover the concealed relations between image data and annotation data and to annotate images according to such relations. This approach also saves memory space and, in web applications, makes image transfer and download fast. This paper reviews 50 image annotation systems that use supervised machine learning techniques to annotate images for image retrieval. The results obtained show that the multilayer perceptron neural network classifier outperforms a conventional DST technique.

  17. Web-based Video Annotation and its Applications

    Science.gov (United States)

    Yamamoto, Daisuke; Nagao, Katashi

    In this paper, we describe a Web-based video annotation system named iVAS (intelligent Video Annotation Server), with which audiences can associate any video content on the Internet with annotations. The system analyzes video content to acquire cut/shot information and color histograms, and it automatically generates a Web page for editing annotations. Audiences can then create annotation data by two methods. The first helps users create text data, such as person/object names, scene descriptions, and comments, interactively. The second lets users associate any video fragment with their subjective impression by just clicking a mouse button. The generated annotation data are accumulated and managed by an XML database connected to iVAS. We also developed several application systems based on annotations, such as video retrieval, video simplification, and video-content-based community support. One of the major advantages of our approach is the easy integration of hand-coded and automatically generated annotations (such as color histograms and cut/shot information). Additionally, since our annotation system is open to the public, we must consider the reliability and correctness of annotation data, so we also developed an automatic method for evaluating annotation reliability using users' feedback. In the future, these fundamental technologies will contribute to the formation of new communities centered around video content.

  18. Building a semantically annotated corpus of clinical texts.

    Science.gov (United States)

    Roberts, Angus; Gaizauskas, Robert; Hepple, Mark; Demetriou, George; Guo, Yikun; Roberts, Ian; Setzer, Andrea

    2009-10-01

    In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.

  19. Cadec: A corpus of adverse drug event annotations.

    Science.gov (United States)

    Karimi, Sarvnaz; Metke-Jimenez, Alejandro; Kemp, Madonna; Wang, Chen

    2015-06-01

    The CSIRO Adverse Drug Event Corpus (Cadec) is a new richly annotated corpus of medical forum posts on patient-reported Adverse Drug Events (ADEs). The corpus is sourced from posts on social media and contains text that is largely written in colloquial language and often deviates from formal English grammar and punctuation rules. Annotations contain mentions of concepts such as drugs, adverse effects, symptoms, and diseases linked to their corresponding concepts in controlled vocabularies, i.e., SNOMED Clinical Terms and MedDRA. The quality of the annotations is ensured by annotation guidelines, multi-stage annotation, measurement of inter-annotator agreement, and a final review of the annotations by a clinical terminologist. This corpus is useful for studies in the area of information extraction, or more generally text mining, from social media to detect possible adverse drug reactions from direct patient reports. The corpus is publicly available at https://data.csiro.au.

  20. Deburring: an annotated bibliography. Volume V

    Energy Technology Data Exchange (ETDEWEB)

    Gillespie, L.K.

    1978-01-01

    An annotated summary of 204 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified as are the materials deburred.

  1. Deburring: an annotated bibliography. Volume VI

    Energy Technology Data Exchange (ETDEWEB)

    Gillespie, L.K.

    1980-07-01

    An annotated summary of 138 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese, and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified, as are the materials deburred.

  2. Cultural nationalism: a review and annotated bibliography

    OpenAIRE

    Woods, Eric Taylor

    2014-01-01

    This review and annotated bibliography is part of The State of Nationalism (SoN), a comprehensive guide to the study of nationalism. The topic of this first contribution is cultural nationalism. This concept generally refers to ideas and practices that relate to the intended revival of a purported national community’s culture. If political nationalism is focused on the achievement of political autonomy, cultural nationalism is focused on the cultivation of a nation.

  3. Automatic Function Annotations for Hoare Logic

    Directory of Open Access Journals (Sweden)

    Daniel Matichuk

    2012-11-01

    Full Text Available In systems verification we are often concerned with multiple, inter-dependent properties that a program must satisfy. To prove that a program satisfies a given property, the correctness of intermediate states of the program must be characterized. However, this intermediate reasoning is not always phrased such that it can be easily re-used in the proofs of subsequent properties. We introduce a function annotation logic that extends Hoare logic in two important ways: (1) when proving that a function satisfies a Hoare triple, intermediate reasoning is automatically stored as function annotations, and (2) these function annotations can be exploited in future Hoare logic proofs. This reduces duplication of reasoning between the proofs of different properties, whilst serving as a drop-in replacement for traditional Hoare logic to avoid the costly process of proof refactoring. We explain how this was implemented in Isabelle/HOL and applied to an experimental branch of the seL4 microkernel to significantly reduce the size and complexity of existing proofs.

  4. Nonlinear Deep Kernel Learning for Image Annotation.

    Science.gov (United States)

    Jiu, Mingyuan; Sahbi, Hichem

    2017-02-08

    Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each of which involves a combination of several elementary or intermediate kernels, and results in a positive semi-definite deep kernel. We propose four different frameworks in order to learn the weights of these networks: supervised, unsupervised, kernel-based semi-supervised, and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show clear gains compared to several shallow kernels for the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database and the Banana dataset validate the effectiveness of the proposed method.
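    A toy numeric sketch of the recursive construction described above: each layer applies a nonlinear activation to a weighted combination of elementary or intermediate kernels. The weights and kernel choices here are fixed, arbitrary values rather than the learned ones from the paper, and positive semi-definiteness in general depends on the activation and on non-negative weights.

```python
# Two-layer "deep multiple kernel" sketch (fixed toy weights, not learned).
import math

def k_lin(x, y):
    """Elementary linear kernel."""
    return sum(a * b for a, b in zip(x, y))

def k_rbf(x, y, gamma=0.5):
    """Elementary Gaussian (RBF) kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def deep_kernel(x, y, w=(0.6, 0.4)):
    # Layer 1: nonlinear activation over a weighted kernel combination.
    h = math.tanh(w[0] * k_lin(x, y) + w[1] * k_rbf(x, y))
    # Layer 2: mix the intermediate kernel with an elementary one.
    return math.tanh(0.5 * h + 0.5 * k_rbf(x, y))

val = deep_kernel([1.0, 0.0], [1.0, 0.0])
```

    The symmetry of the result follows from the symmetry of the elementary kernels; the learning frameworks in the paper then fit the layer weights instead of fixing them as done here.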

  5. GLANET: genomic loci annotation and enrichment tool.

    Science.gov (United States)

    Otlu, Burçak; Firtina, Can; Keles, Sündüz; Tastan, Oznur

    2017-09-15

    Genomic studies identify genomic loci representing genetic variations, transcription factor (TF) occupancy, or histone modification through next generation sequencing (NGS) technologies. Interpreting these loci requires evaluating them with known genomic and epigenomic annotations. We present GLANET as a comprehensive annotation and enrichment analysis tool which implements a sampling-based enrichment test that accounts for GC content and/or mappability biases, jointly or separately. GLANET annotates and performs enrichment analysis on these loci with a rich library. We introduce and perform novel data-driven computational experiments for assessing the power and Type-I error of its enrichment procedure, which show that GLANET attains high statistical power and a well-controlled Type-I error rate. As a key feature, users can easily extend its library with new gene sets and genomic intervals. Other key features include assessment of the impact of single nucleotide variants (SNPs) on TF binding sites and regulation-based pathway enrichment analysis. GLANET can be run using its GUI or on the command line. GLANET's source code is available at https://github.com/burcakotlu/GLANET . Tutorials are provided at https://glanet.readthedocs.org . burcak@ceng.metu.edu.tr or oznur.tastan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online.

  6. Jannovar: a java library for exome annotation.

    Science.gov (United States)

    Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N

    2014-05-01

    Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations for variants affecting coding sequences and splice junctions, as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.
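
    The core interval query can be illustrated with a minimal sketch. The transcript coordinates and names below are hypothetical; Jannovar's actual interval tree answers this query in O(log n + k) time, whereas the linear scan here only demonstrates the query semantics.

```python
# Hypothetical transcript records as (start, end, name), 0-based half-open.
transcripts = [(100, 500, "TX1"), (300, 900, "TX2"), (1200, 1500, "TX3")]

def overlapping_transcripts(pos, txs):
    """Return every transcript whose genomic interval contains the variant
    position -- the lookup an interval tree accelerates."""
    return [name for start, end, name in txs if start <= pos < end]

print(overlapping_transcripts(350, transcripts))  # ['TX1', 'TX2']
```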

  7. Management Tool for Semantic Annotations in WSDL

    Science.gov (United States)

    Boissel-Dallier, Nicolas; Lorré, Jean-Pierre; Benaben, Frédérick

    Semantic Web Services add features to automate web service discovery and composition. A new standard called SAWSDL emerged recently as a W3C recommendation to add semantic annotations within web service descriptions (WSDL). In order to manipulate such information in Java programs we need an XML parser. Two open-source libraries already exist (SAWSDL4J and Woden4SAWSDL), but they do not meet all our specific needs, such as support for both WSDL 1.1 and 2.0. This paper presents a new tool, called EasyWSDL, which is able to handle semantic annotations as well as to manage the full WSDL description thanks to a plug-in mechanism. This tool allows us to read/edit/create a WSDL description and related annotations through a uniform API, in both 1.1 and 2.0 versions. This document compares these three libraries and presents the integration of EasyWSDL into Dragon, the OW2 open-source SOA governance tool.

  8. MUC16/CA125 in the Context of Modular Proteins with an Annotated Role in Adhesion-Related Processes: In Silico Analysis

    Directory of Open Access Journals (Sweden)

    Ninoslav Mitic

    2012-08-01

    Full Text Available Mucin 16 (MUC16 is a type I transmembrane protein, the extracellular portion of which is shed after proteolytic degradation and is denoted as CA125 antigen, a well known tumor marker for ovarian cancer. Regarding its polypeptide and glycan structures, as yet there is no detailed insight into their heterogeneity and ligand properties, which may greatly influence its function and biomarker potential. This study was aimed at obtaining further insight into the biological capacity of MUC16/CA125, using in silico analysis of corresponding mucin sequences, including similarity searches as well as GO (gene ontology-based function prediction. The results obtained pointed to the similarities within extracellular serine/threonine rich regions of MUC16 to sequences of proteins expressed in evolutionary distant taxa, all having in common an annotated role in adhesion-related processes. Specifically, a homology to conserved domains from the family of herpesvirus major outer envelope protein (BLLF1 was found. In addition, the possible involvement of MUC16/CA125 in carbohydrate-binding interactions or cellular transport of protein/ion was suggested.

  9. Microglia phenotype diversity

    NARCIS (Netherlands)

    Olah, M.; Biber, K.; Vinet, J.; Boddeke, H. W. G. M.

    2011-01-01

    Microglia, the tissue macrophages of the brain, have under healthy conditions a resting phenotype that is characterized by a ramified morphology. With their fine processes microglia are continuously scanning their environment. Upon any homeostatic disturbance microglia rapidly change their phenotype

  12. Prokaryotic Contig Annotation Pipeline Server: Web Application for a Prokaryotic Genome Annotation Pipeline Based on the Shiny App Package.

    Science.gov (United States)

    Park, Byeonghyeok; Baek, Min-Jeong; Min, Byoungnam; Choi, In-Geol

    2017-09-01

    Genome annotation is a primary step in genomic research. To establish a light and portable prokaryotic genome annotation pipeline for use in individual laboratories, we developed a Shiny app package designated as "P-CAPS" (Prokaryotic Contig Annotation Pipeline Server). The package is composed of R and Python scripts that integrate publicly available annotation programs into a server application. P-CAPS is not only a browser-based interactive application but also a distributable Shiny app package that can be installed on any personal computer. The final annotation is provided in various standard formats and is summarized in an R markdown document. Annotation can be visualized and examined with a public genome browser. A benchmark test showed that the annotation quality and completeness of P-CAPS were reliable and compatible with those of currently available public pipelines.

  13. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    OpenAIRE

    Kristopher J. L. Irizarry; Josep Rutllant

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 g...

  14. Ontology-based framework for personalized recommendation in digital libraries

    Institute of Scientific and Technical Information of China (English)

    颜端武; 岑咏华; 张炜; 毛平

    2006-01-01

    To improve the information service capability of digital libraries, this paper describes an ontology-based framework for personalized recommendation during user browsing and searching. The advantages of ontology are exploited in different parts of the retrieval cycle, including query relevance measures, semantic representation and automatic updating of user preferences, and personalized ranking of search results. As users interact with the digital library, the ontology is used to match user queries against document content, enabling semantic content retrieval; ontology-based concept vectors of user preferences can then be applied to deliver personalized recommendation feedback.

  15. Method of role- and ontology-based access control in a multi-level environment

    Institute of Scientific and Technical Information of China (English)

    王智辉; 艾中良; 王祥根; 唐稳

    2013-01-01

    To address the problem of protecting sensitive resources in a multi-level environment, a role- and ontology-based access control method is proposed. Using ontologies and other semantic technology, a model is constructed that combines Role-Based Access Control (RBAC) with the Bell-LaPadula (BLP) model. Security-level relationships between users and resources are inferred from the role inheritance hierarchy, and when a privilege is assigned to a role, the authorization is constrained by the security properties of the BLP model. Experiments show that the method automatically assigns privileges to roles while guaranteeing multi-level security, and diminishes the security risks caused by role privilege inheritance, offering a new approach to access control in multi-level environments.
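
    The BLP constraints underlying such a model ("no read up, no write down") can be sketched as follows. The security levels and operations are invented for the example; the paper's method additionally layers the RBAC role hierarchy and ontology-based inference on top of these checks.

```python
# Hypothetical totally ordered security levels.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}

def blp_allows(subject_level, object_level, op):
    """BLP simple-security and *-property: a subject may read an object only
    if its level dominates the object's (no read up), and write only if the
    object's level dominates its own (no write down)."""
    if op == "read":
        return LEVELS[subject_level] >= LEVELS[object_level]
    if op == "write":
        return LEVELS[subject_level] <= LEVELS[object_level]
    raise ValueError(f"unknown operation: {op}")

assert blp_allows("secret", "confidential", "read")
assert not blp_allows("confidential", "secret", "read")   # no read up
assert not blp_allows("secret", "confidential", "write")  # no write down
```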

  16. Worm Phenotype Ontology: Integrating phenotype data within and beyond the C. elegans community

    Directory of Open Access Journals (Sweden)

    Yook Karen

    2011-01-01

    Full Text Available Background Caenorhabditis elegans gene-based phenotype information dates back to the 1970s, beginning with Sydney Brenner and the characterization of behavioral and morphological mutant alleles via classical genetics in order to understand nervous system function. Since then C. elegans has become an important genetic model system for the study of basic biological and biomedical principles, largely through the use of phenotype analysis. Because of the growth of C. elegans as a genetically tractable model organism and the development of large-scale analyses, there has been a significant increase of phenotype data that needs to be managed and made accessible to the research community. To do so, a standardized vocabulary is necessary to integrate phenotype data from diverse sources, permit integration with other data types and render the data in a computable form. Results We describe a hierarchically structured, controlled vocabulary of terms that can be used to standardize phenotype descriptions in C. elegans, namely the Worm Phenotype Ontology (WPO). The WPO currently comprises 1,880 phenotype terms, 74% of which have been used in the annotation of phenotypes associated with more than 18,000 C. elegans genes. The scope of the WPO is not exclusively limited to C. elegans biology; rather, it is devised to also incorporate phenotypes observed in related nematode species. We have enriched the value of the WPO by integrating it with other ontologies, thereby increasing the accessibility of worm phenotypes to non-nematode biologists. We are actively developing the WPO to continue to fulfill the evolving needs of the scientific community and hope to engage researchers in this crucial endeavor. Conclusions We provide a phenotype ontology (WPO) that will help to facilitate data retrieval and cross-species comparisons within the nematode community. In the larger scientific community, the WPO will permit data integration, and

  17. The Human Phenotype Ontology: Semantic Unification of Common and Rare Disease

    Science.gov (United States)

    Groza, Tudor; Köhler, Sebastian; Moldenhauer, Dawid; Vasilevsky, Nicole; Baynam, Gareth; Zemojtel, Tomasz; Schriml, Lynn Marie; Kibbe, Warren Alden; Schofield, Paul N.; Beck, Tim; Vasant, Drashtti; Brookes, Anthony J.; Zankl, Andreas; Washington, Nicole L.; Mungall, Christopher J.; Lewis, Suzanna E.; Haendel, Melissa A.; Parkinson, Helen; Robinson, Peter N.

    2015-01-01

    The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available. PMID:26119816

  18. Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes

    DEFF Research Database (Denmark)

    Santos Delgado, Alberto; Wernersson, Rasmus; Jensen, Lars Juhl

    2015-01-01

    3.0, we have updated the content of the database to reflect changes to genome annotation, added new mRNAand protein expression data, and integrated cell-cycle phenotype information from high-content screens and model-organism databases. The new version of Cyclebase also features a new web interface...

  19. Ontology-Based Peer Exchange Network (OPEN)

    Science.gov (United States)

    Dong, Hui

    2010-01-01

    In current Peer-to-Peer networks, distributed and semantic free indexing is widely used by systems adopting "Distributed Hash Table" ("DHT") mechanisms. Although such systems typically solve a user query rather fast in a deterministic way, they only support a very narrow search scheme, namely the exact hash key match. Furthermore, DHT systems put…

  20. Ontology Based Vocabulary Matching for Oceanographic Instruments

    Science.gov (United States)

    Chen, Yu; Shepherd, Adam; Chandler, Cyndy; Arko, Robert; Leadbetter, Adam

    2014-05-01

    Data integration is the preliminary entry point as we enter the era of big data in many scientific domains. However, the reuse of datasets is hindered because different parties start from different interests and therefore describe similar or semantically related concepts with different vocabularies. In this scenario it is vital to devise an automatic or semi-supervised algorithm to facilitate the convergence of these vocabularies. The Ocean Data Interoperability Platform (ODIP) seeks to increase data sharing across scientific domains and international boundaries by providing a forum to harmonize diverse regional data systems. ODIP participants from the US include the Rolling Deck to Repository (R2R) program, whose mission is to capture, catalog, and describe the underway/environmental sensor data from US oceanographic research vessels and submit the data to public long-term archives. In an attempt to harmonize these regional data systems, especially their vocabularies, R2R recognizes the value of the SeaDataNet vocabularies served by the NERC Vocabulary Server (NVS), hosted at the British Oceanographic Data Centre, as a trusted, authoritative source for describing many oceanographic research concepts such as instrumentation. In this work, we make use of the semantic relations in the vocabularies served by NVS to build a Bayesian network, and take advantage of the idea of entropy to evaluate the correlation between different concepts and keywords. The performance of the model is evaluated by matching instruments from R2R against the SeaDataNet instrument vocabularies, based on calculated confidence scores for the instrument pairings. These pairings and their scores can then be analyzed for assertion, growing the interoperability of the R2R vocabulary through its links to the SeaDataNet entities.

  1. Ontology-Based Analysis of Microarray Data.

    Science.gov (United States)

    Giuseppe, Agapito; Milano, Marianna

    2016-01-01

    The importance of semantic-based methods and algorithms for the analysis and management of biological data is growing for two main reasons: from the biological side, the knowledge contained in ontologies is increasingly accurate and complete; from the computational side, recent algorithms make valuable use of such knowledge. Here we focus on semantic-based management and analysis of protein interaction networks, referring to all approaches that analyze protein-protein interaction data using knowledge encoded in biological ontologies. Semantic approaches for studying high-throughput data have long been used to mine genomic and expression data. Recently, the emergence of network approaches for investigating molecular machineries has stimulated, in parallel, the introduction of semantic-based techniques for the analysis and management of network data. Applying these computational approaches to the study of microarray data can broaden their application scenario and simultaneously help the understanding of disease development and progression.

  2. Model Validation in Ontology Based Transformations

    Directory of Open Access Journals (Sweden)

    Jesús M. Almendros-Jiménez

    2012-10-01

    Full Text Available Model Driven Engineering (MDE) is an emerging approach in software engineering. MDE emphasizes the construction of models from which the implementation is derived by applying model transformations. The Ontology Definition Meta-model (ODM) has been proposed as a profile for UML models of the Web Ontology Language (OWL). In this context, transformations of UML models can be mapped into ODM/OWL transformations. On the other hand, model validation is a crucial task in model transformation. Meta-modeling gives a syntactic structure to source and target models; however, semantic requirements also have to be imposed on them. A given transformation is sound when source and target models fulfill both the syntactic and semantic requirements. In this paper, we present an approach for model validation in ODM-based transformations. Adopting a logic-programming-based transformational approach, we show how it is possible to transform and validate models. Properties to be validated range from structural and semantic requirements of models (pre- and post-conditions) to properties of the transformation (invariants). The approach has been applied to a well-known example of model transformation: the Entity-Relationship (ER) to Relational Model (RM) transformation.

  3. Ontology-based Software Repository System

    Science.gov (United States)

    2010-04-30

    Improvements to the current state of the art for software reuse repositories are required (Shiva & Shala, 2007).

  5. Ontology-Based Search of Genomic Metadata.

    Science.gov (United States)

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery has limited support: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.

  6. ONTOLOGY BASED QUALITY EVALUATION FOR SPATIAL DATA

    Directory of Open Access Journals (Sweden)

    C. Yılmaz

    2015-08-01

    Full Text Available Many institutions will be providing data to the National Spatial Data Infrastructure (NSDI). The current technical background of the NSDI is based on syntactic web services, which are expected to be replaced by semantic web services. The quality of the data provided is important for the decision-making process and the accuracy of transactions; therefore, the data quality needs to be tested. This topic has been neglected in Turkey. Data quality control for the NSDI may be done by private or public “data accreditation” institutions, and a methodology is required for data quality evaluation. There are studies of data quality including ISO standards, academic studies and software to evaluate spatial data quality. The ISO 19157 standard defines the data quality elements. Proprietary software such as 1Spatial’s 1Validate and ESRI’s Data Reviewer offer quality evaluation based on their own classifications of rules. Commonly, rule-based approaches are used for geospatial data quality checks. In this study, we look for the technical components to devise and implement a rule-based approach with ontologies, using free and open source software in a semantic web context. The semantic web uses ontologies to deliver well-defined web resources and make them accessible to end-users and processes. We have created an ontology conforming to the geospatial data and defined some sample rules to show how to test data with respect to data quality elements including attribute, topo-semantic and geometrical consistency, using free and open source software. To test data against the rules, sample GeoSPARQL queries associated with the specifications are created.

  7. Ontology-based Cloud Services Representation

    Directory of Open Access Journals (Sweden)

    Abdullah Ali

    2014-07-01

    Full Text Available The advancement of cloud computing has enabled service providers to offer a diversity of cloud services to users with different attributes at a range of costs. Finding a suitable service among the increasing number of cloud services, one that satisfies user requirements such as performance, cost and security, has become a big challenge. The variety in service descriptions, non-uniform naming conventions, and the heterogeneous types and features of cloud services make cloud service discovery a hard problem. Therefore, an intelligent service discovery system is necessary for searching and retrieving appropriate services accurately and quickly. Many studies have been conducted to discover cloud services using different techniques, such as ontology models and agent technology. The existing ontologies for cloud services do not cover all cloud concepts and are intended for specific tasks only. This study represents the cloud concepts in a comprehensive way that can be used for cloud service discovery or cloud computing management.

  8. Ontology-based multi-agent systems

    Energy Technology Data Exchange (ETDEWEB)

    Hadzic, Maja; Wongthongtham, Pornpit; Dillon, Tharam; Chang, Elizabeth [Digital Ecosystems and Business Intelligence Institute, Perth, WA (Australia)

    2009-07-01

    The Semantic web has given a great deal of impetus to the development of ontologies and multi-agent systems. Several books have appeared which discuss the development of ontologies or of multi-agent systems separately on their own. The growing interaction between agents and ontologies has highlighted the need for integrated development of these. This book is unique in being the first to provide an integrated treatment of the modeling, design and implementation of such combined ontology/multi-agent systems. It provides clear exposition of this integrated modeling and design methodology. It further illustrates this with two detailed case studies in (a) the biomedical area and (b) the software engineering area. The book is, therefore, of interest to researchers, graduate students and practitioners in the semantic web and web science area. (orig.)

  9. Ontology-Based Geographic Data Set Integration

    NARCIS (Netherlands)

    Uitermark, Harry T.; Oosterom, Peter J.M.; Mars, Nicolaas J.I.; Molenaar, Martien

    1999-01-01

    In order to develop a system to propagate updates we investigate the semantic and spatial relationships between independently produced geographic data sets of the same region (data set integration). The goal of this system is to reduce operator intervention in update operations between corresponding

  10. Ontology-Based Geographic Data Set Integration

    NARCIS (Netherlands)

    Uitermark, Henricus Theodorus Johannes Antonius

    2001-01-01

    Geographic data set integration is particularly important for update propagation, i.e. the reuse of updates from one data set in another data set. In this thesis geographic data set integration (also known as map integration) between two topographic data sets, GBKN and TOP10vector, is described. GBK

  11. Mapping gene associations in human mitochondria using clinical disease phenotypes.

    Directory of Open Access Journals (Sweden)

    Curt Scharfe

    2009-04-01

    Full Text Available Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects
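
    A similarity value based on shared phenotypic features can be illustrated with a simple Jaccard index. The gene names and MeSH-style features below are invented, and Jaccard is only a stand-in for the study's own quantitative measure.

```python
# Hypothetical per-gene sets of annotated MeSH phenotype features.
features = {
    "GENE_A": {"ataxia", "optic atrophy", "lactic acidosis"},
    "GENE_B": {"ataxia", "optic atrophy", "seizures"},
    "GENE_C": {"diarrhea", "hepatomegaly"},
}

def phenotype_similarity(g1, g2, feats):
    """Jaccard index over the genes' shared phenotypic features:
    |intersection| / |union|."""
    a, b = feats[g1], feats[g2]
    return len(a & b) / len(a | b)

print(round(phenotype_similarity("GENE_A", "GENE_B", features), 2))  # 0.5
```

    Under the study's finding, pairs with high similarity values such as GENE_A/GENE_B would be stronger candidates for functional interaction than pairs sharing no features.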

  12. Reporting phenotypes in mouse models when considering body size as a potential confounder.

    Science.gov (United States)

    Oellrich, Anika; Meehan, Terrence F; Parkinson, Helen; Sarntivijai, Sirarat; White, Jacqueline K; Karp, Natasha A

    2016-01-01

    Genotype-phenotype studies aim to identify causative relationships between genes and phenotypes. The International Mouse Phenotyping Consortium is a high throughput phenotyping program whose goal is to collect phenotype data for a knockout mouse strain of every protein coding gene. The scale of the project requires an automatic analysis pipeline to detect abnormal phenotypes, and disseminate the resulting gene-phenotype annotation data into public resources. A body weight phenotype is a common result of knockout studies. As body weight correlates with many other biological traits, this challenges the interpretation of related gene-phenotype associations. Co-correlation can lead to gene-phenotype associations that are potentially misleading. Here we use statistical modelling to account for body weight as a potential confounder to assess the impact. We find that there is a considerable impact on previously established gene-phenotype associations due to an increase in sensitivity as well as the confounding effect. We investigated the existing ontologies to represent this phenotypic information and we explored ways to ontologically represent the results of the influence of confounders on gene-phenotype associations. With the scale of data being disseminated within the high throughput programs and the range of downstream studies that utilise these data, it is critical to consider how we improve the quality of the disseminated data and provide a robust ontological representation.

  13. Applying Reliability Metrics to Co-Reference Annotation

    CERN Document Server

    Passonneau, R J

    1997-01-01

    Studies of the contextual and linguistic factors that constrain discourse phenomena such as reference are coming to depend increasingly on annotated language corpora. In preparing the corpora, it is important to evaluate the reliability of the annotation, but methods for doing so have not been readily available. In this report, I present a method for computing reliability of coreference annotation. First I review a method for applying the information retrieval metrics of recall and precision to coreference annotation proposed by Marc Vilain and his collaborators. I show how this method makes it possible to construct contingency tables for computing Cohen's Kappa, a familiar reliability metric. By comparing recall and precision to reliability on the same data sets, I also show that recall and precision can be misleadingly high. Because Kappa factors out chance agreement among coders, it is a preferable measure for developing annotated corpora where no pre-existing target annotation exists.
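The chance-correction argument above can be illustrated with a small worked example. The counts in the contingency table below are invented, but the computation is the standard Cohen's Kappa: observed agreement minus the agreement expected from the coders' marginal label distributions, normalised by the maximum possible chance-corrected agreement.

```python
# A minimal sketch of Cohen's Kappa from a 2x2 coder-agreement table
# (counts are invented). Kappa discounts the agreement expected by chance,
# which raw percent agreement (and IR-style recall/precision) does not.

def cohen_kappa(table):
    total = sum(sum(row) for row in table)
    p_obs = sum(table[i][i] for i in range(len(table))) / total
    p_chance = sum(
        (sum(table[i]) / total) * (sum(row[i] for row in table) / total)
        for i in range(len(table))
    )
    return (p_obs - p_chance) / (1 - p_chance)

# Rows: coder A's labels; columns: coder B's labels (coref / not-coref).
table = [[45, 5],
         [10, 40]]
agreement = (45 + 40) / 100          # 0.85 raw agreement
kappa = cohen_kappa(table)
print(agreement, round(kappa, 3))    # raw agreement 0.85, kappa 0.7
```

Here 50% agreement would be expected by chance alone, so the 85% raw agreement corresponds to a Kappa of only 0.7.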

  14. An Unsupervised Model for Exploring Hierarchical Semantics from Social Annotations

    Science.gov (United States)

    Zhou, Mianwei; Bao, Shenghua; Wu, Xian; Yu, Yong

    This paper deals with the problem of deriving hierarchical semantics from social annotations. Social annotation services have recently become increasingly popular in the Semantic Web. They allow users to annotate web resources freely, which greatly lowers the barrier to cooperation, and by providing abundant metadata resources, social annotation may become a key to the development of the Semantic Web. On the other hand, social annotation has apparent limitations, notably 1) ambiguity and synonymy and 2) a lack of hierarchical information. In this paper, we propose an unsupervised model to automatically derive hierarchical semantics from social annotations. Using the social bookmarking service Del.icio.us as an example, we demonstrate that the derived hierarchical semantics can compensate for those shortcomings. We further apply our model to another data set, from Flickr, to test its applicability in different environments. The experimental results demonstrate our model's efficiency.
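One simple way to see how a hierarchy can emerge from flat tags is a subsumption heuristic. This is a hedged sketch, not the paper's unsupervised model: tag A is treated as a parent of tag B when most resources tagged B also carry A and A is the more frequent tag. The bookmark data are invented.

```python
# Hedged sketch (not the paper's model): derive parent-child tag relations
# from co-occurrence. Tag A subsumes tag B when most resources tagged B
# also carry A and A is applied to strictly more resources.

tagged = {  # invented bookmark data: tag -> set of resource ids
    "programming": {1, 2, 3, 4, 5, 6},
    "python":      {1, 2, 3},
    "django":      {1, 2},
    "cooking":     {7, 8},
}

def parents(tagged, threshold=0.8):
    rels = []
    for child, c_docs in tagged.items():
        for parent, p_docs in tagged.items():
            if parent == child or len(p_docs) <= len(c_docs):
                continue
            overlap = len(c_docs & p_docs) / len(c_docs)
            if overlap >= threshold:
                rels.append((parent, child))
    return sorted(rels)

print(parents(tagged))
# "programming" subsumes "python" and "django"; "python" subsumes "django";
# "cooking" stays disconnected.
```

A real system would additionally resolve synonyms and ambiguity, which this heuristic ignores.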

  15. Annotation of mammalian primary microRNAs

    Directory of Open Access Journals (Sweden)

    Enright Anton J

    2008-11-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are important regulators of gene expression and have been implicated in development, differentiation and pathogenesis. Hundreds of miRNAs have been discovered in mammalian genomes. Approximately 50% of mammalian miRNAs are expressed from introns of protein-coding genes; the primary transcript (pri-miRNA is therefore assumed to be the host transcript. However, very little is known about the structure of pri-miRNAs expressed from intergenic regions. Here we annotate transcript boundaries of miRNAs in human, mouse and rat genomes using various transcription features. The 5' end of the pri-miRNA is predicted from transcription start sites, CpG islands and 5' CAGE tags mapped in the upstream flanking region surrounding the precursor miRNA (pre-miRNA. The 3' end of the pri-miRNA is predicted based on the mapping of polyA signals, and supported by cDNA/EST and ditags data. The predicted pri-miRNAs are also analyzed for promoter and insulator-associated regulatory regions. Results We define sets of conserved and non-conserved human, mouse and rat pre-miRNAs using bidirectional BLAST and synteny analysis. Transcription features in their flanking regions are used to demarcate the 5' and 3' boundaries of the pri-miRNAs. The lengths and boundaries of primary transcripts are highly conserved between orthologous miRNAs. A significant fraction of pri-miRNAs have lengths between 1 and 10 kb, with very few introns. We annotate a total of 59 pri-miRNA structures, which include 82 pre-miRNAs. 36 pri-miRNAs are conserved in all 3 species. In total, 18 of the confidently annotated transcripts express more than one pre-miRNA. The upstream regions of 54% of the predicted pri-miRNAs are found to be associated with promoter and insulator regulatory sequences. Conclusion Little is known about the primary transcripts of intergenic miRNAs. Using comparative data, we are able to identify the boundaries of a significant proportion of

  16. Annotation Bibliography for Geographical Science Field

    Directory of Open Access Journals (Sweden)

    Sukendra Martha

    2014-12-01

    Full Text Available This annotated bibliography was compiled specifically for the field of geography, drawing on scientific articles about basic geographical concepts from a range of geographical journals. It aims to provide information particularly for geographers undertaking research, who need geographical references covering spatial concepts, at a time when such foundations risk being overshadowed by the rapid development of technical branches of geography such as geographic information systems (GIS) and remote sensing. It is hoped that this bibliography can help re-motivate geographers to study and review original geographical thought.

  17. Updating RNA-Seq analyses after re-annotation

    OpenAIRE

    Roberts, Adam; Schaeffer, Lorian; Pachter, Lior

    2013-01-01

    The estimation of isoform abundances from RNA-Seq data requires a time-intensive step of mapping reads to either an assembled or previously annotated transcriptome, followed by an optimization procedure for deconvolution of multi-mapping reads. These procedures are essential for downstream analysis such as differential expression. In cases where it is desirable to adjust the underlying annotation, for example, on the discovery of novel isoforms or errors in existing annotations, current pipel...

  18. A robust data-driven approach for gene ontology annotation

    OpenAIRE

    2014-01-01

    Gene ontology (GO) and GO annotation are important resources for biological information management and knowledge discovery, but the speed of manual annotation has become a major bottleneck of database curation. The BioCreative IV GO annotation task aims to evaluate the performance of systems that automatically assign GO terms to genes based on narrative sentences in the biomedical literature. This article presents our work in this task as well as the experimental results after the competition. For th...

  19. TTC’15 Live Contest Case Study: Transformation of Java Annotations

    OpenAIRE

    Křikava, Filip; Monperrus, Martin

    2015-01-01

    International audience; Java 5 introduced annotations as a systematic means of attaching syntactic metadata to various elements of Java source code. Since then, annotations have been extensively used by a number of libraries, frameworks and tools to conveniently extend the behaviour of Java programs in ways that would otherwise have to be implemented manually or synthesised from external resources. The annotations are usually processed through reflection and the extended behaviour is injected into Java classes usi...

  20. Statistical analysis of genomic protein family and domain controlled annotations for functional investigation of classified gene lists

    Directory of Open Access Journals (Sweden)

    Masseroli Marco

    2007-03-01

    Full Text Available Abstract Background The increasing protein family and domain based annotations constitute important information for understanding protein functions and gaining insight into relations among their codifying genes. To enable the analysis of gene proteomic annotations, we implemented novel modules within GFINDer, a Web system we previously developed that dynamically aggregates functional and phenotypic annotations of user-uploaded gene lists and allows performing their statistical analysis and mining. Results Exploiting protein information in the Pfam and InterPro databanks, we developed and added to GFINDer original modules specifically devoted to the exploration and analysis of functional signatures of gene protein products. They allow annotating numerous user-classified nucleotide sequence identifiers with controlled information on related protein families, domains and functional sites, classifying them according to such protein annotation categories, and statistically analyzing the obtained classifications. In particular, when uploaded nucleotide sequence identifiers are subdivided into classes, the Statistics Protein Families&Domains module allows estimating the relevance of Pfam or InterPro controlled annotations for the uploaded genes by highlighting protein signatures significantly more represented within user-defined classes of genes. In addition, the Logistic Regression module allows identifying protein functional signatures that better explain the considered gene classification. Conclusion The novel GFINDer modules provide genomic protein family and domain analyses supporting better functional interpretation of gene classes, for instance those defined through statistical and clustering analyses of gene expression results from microarray experiments. They can hence help in understanding fundamental biological processes and complex cellular mechanisms influenced by protein domain composition, and contribute to unveiling new biomedical knowledge about the codifying genes.
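The over-representation test behind modules of this kind can be sketched with a hypergeometric tail probability: given a universe of genes, how surprising is the count of domain-annotated genes inside a user-defined class? This is a generic sketch with invented counts, not GFINDer's actual implementation.

```python
# Sketch of a domain over-representation test (hypergeometric tail),
# the kind of statistic behind modules like Statistics Protein
# Families&Domains. The counts below are invented.
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K carry the domain."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)) / comb(N, n)

# 500 genes total, 40 carry a given Pfam domain; a class of 25 genes contains 10.
p = hypergeom_tail(N=500, K=40, n=25, k=10)
print(p < 0.001)  # far more domain hits than the ~2 expected by chance
```

In practice such p-values would be corrected for testing many domains at once.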

  1. Literacy and Basic Education: A Selected, Annotated Bibliography. Annotated Bibliography #3.

    Science.gov (United States)

    Michigan State Univ., East Lansing. Non-Formal Education Information Center.

    A selected annotated bibliography on literacy and basic education, including contributions from practitioners in the worldwide non-formal education network and compiled for them, has three interrelated themes: integration of literacy programs with broader development efforts; the learner-centered or "psycho-social" approach to literacy,…

  2. MalaCards: an integrated compendium for diseases and their annotation

    Science.gov (United States)

    Rappaport, Noa; Nativ, Noam; Stelzer, Gil; Twik, Michal; Guan-Golan, Yaron; Iny Stein, Tsippi; Bahir, Iris; Belinky, Frida; Morrey, C. Paul; Safran, Marilyn; Lancet, Doron

    2013-01-01

    Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research. Database URL: http

  3. Augmented annotation and orthologue analysis for Oryctolagus cuniculus: Better Bunny

    National Research Council Canada - National Science Library

    Craig, Douglas B; Kannan, Sujatha; Dombkowski, Alan A

    2012-01-01

    .... Using data extracted from several public bioinformatics repositories we created Better Bunny, a database and query tool that extensively augments the available functional annotation for rabbit genes...

  4. A Novel Approach to Semantic and Coreference Annotation at LLNL

    Energy Technology Data Exchange (ETDEWEB)

    Firpo, M

    2005-02-04

    A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (5) A pilot study is recommended in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work is evaluated relative to that of their neuro-typical peers.

  5. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Directory of Open Access Journals (Sweden)

    Bert Ely

    Full Text Available Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the locations of peaks in third codon position GC content to the locations of protein-coding regions can be used to verify the annotation of any genome that has a GC content greater than 60%.
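The third-codon-position check described above is easy to compute. The sketch below (with invented sequences, not C. crescentus data) measures GC content at every third base of a candidate ORF; in a high-GC genome, a candidate whose GC3 resembles the AT-rich background rather than the elevated coding signal is suspect.

```python
# Minimal sketch of a GC3 (third codon position GC content) check for
# candidate coding regions. The sequences below are invented.

def gc3(seq):
    third = seq[2::3]  # every third base, i.e. codon position 3
    return sum(b in "GC" for b in third.upper()) / len(third)

coding_like = "ATGGCCGACGTCGAGGCGTTCGCCTAG"      # GC-rich third positions
suspect     = "ATGAATAAATTAAATGAATTAAAATAA"      # AT-rich throughout

print(round(gc3(coding_like), 2), round(gc3(suspect), 2))
```

In a genome with >60% GC, true genes cluster at high GC3, so a frame-plot of GC3 along the chromosome peaks over real coding regions.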

  6. Supporting One-Time Point Annotations for Gesture Recognition.

    Science.gov (United States)

    Nguyen-Dinh, Long-Van; Calatroni, Alberto; Troester, Gerhard

    2016-12-08

    This paper investigates a new annotation technique that significantly reduces the time needed to annotate training data for gesture recognition. Conventionally, the annotations comprise the start and end times, and the corresponding labels, of gestures in sensor recordings. In this work, we propose a one-time point annotation in which labelers do not have to select the start and end times carefully, but simply mark a single time point within the time a gesture is happening. The technique gives labelers more freedom and significantly reduces their burden. To make one-time point annotations usable, we propose a novel BoundarySearch algorithm that automatically finds the correct temporal boundaries of gestures by discovering data patterns around the given one-time point annotations. The corrected annotations are then used to train gesture models. We evaluate the method on three wearable gesture recognition applications with various gesture classes (10-17 classes) recorded with different sensor modalities. The results show that training on the corrected annotations can achieve performance close to that of fully supervised training on clean annotations (lower by just up to 5% F1-score on average). Furthermore, the BoundarySearch algorithm is also evaluated on the ChaLearn 2014 multi-modal gesture recognition challenge, recorded with Kinect sensors, and achieves similar results.
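The core idea of recovering segment boundaries from a single labelled point can be illustrated very simply. This is a hedged toy sketch, not the paper's BoundarySearch algorithm: starting from the labelled time point, the segment grows outwards while the signal magnitude stays above a rest threshold. The signal and threshold are invented.

```python
# Hedged sketch of the one-time-point idea (not the actual BoundarySearch
# algorithm): grow a gesture segment outwards from a single labelled point
# while the signal stays above a rest-level threshold. Data are invented.

def expand_boundary(signal, point, threshold):
    start = point
    while start > 0 and signal[start - 1] > threshold:
        start -= 1
    end = point
    while end < len(signal) - 1 and signal[end + 1] > threshold:
        end += 1
    return start, end

signal = [0.1, 0.1, 0.9, 1.2, 1.1, 0.8, 0.1, 0.1]  # gesture spans indices 2..5
print(expand_boundary(signal, point=3, threshold=0.5))  # -> (2, 5)
```

The real algorithm discovers data patterns around the annotation rather than relying on a fixed threshold, but the payoff is the same: labelers only supply one point per gesture.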

  7. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Science.gov (United States)

    Ely, Bert; Scott, LaTia Etheredge

    2014-01-01

    Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the locations of peaks in third codon position GC content to the locations of protein-coding regions can be used to verify the annotation of any genome that has a GC content greater than 60%.

  8. Review of actinide-sediment reactions with an annotated bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Ames, L.L.; Rai, D.; Serne, R.J.

    1976-02-10

    The annotated bibliography is divided into sections on chemistry and geochemistry, migration and accumulation, cultural distributions, natural distributions, and bibliographies and annual reviews. (LK)

  9. Semantator: semantic annotator for converting biomedical text to linked data.

    Science.gov (United States)

    Tao, Cui; Song, Dezhao; Sharma, Deepak; Chute, Christopher G

    2013-10-01

    More than 80% of biomedical data is embedded in plain text. The unstructured nature of these text-based documents makes it challenging to easily browse and query the data of interest in them. One approach to facilitate browsing and querying biomedical text is to convert the plain text to a linked web of data, i.e., converting data originally in free text to structured formats with defined meta-level semantics. In this paper, we introduce Semantator (Semantic Annotator), a semantic-web-based environment for annotating data of interest in biomedical documents, browsing and querying the annotated data, and interactively refining annotation results if needed. Through Semantator, information of interest can be annotated either manually or semi-automatically using plug-in information extraction tools. The annotated results are stored in RDF and can be queried using the SPARQL query language. In addition, semantic reasoners can be applied directly to the annotated data for consistency checking and knowledge inference. Semantator has been released online and was used by the biomedical ontology community, who provided positive feedback. Our evaluation results indicated that (1) Semantator can perform the annotation functionalities as designed; (2) Semantator can be adopted in real applications in clinical and translational research; and (3) the annotated results produced with Semantator can easily be used in Semantic-web-based reasoning tools for further inference.

  10. Automatic medical X-ray image classification using annotation.

    Science.gov (United States)

    Zare, Mohammad Reza; Mueen, Ahmed; Seng, Woo Chaw

    2014-02-01

    The demand for automatic classification of medical X-ray images is rising faster than ever. In this paper, an approach is presented to achieve a high accuracy rate for classes of medical images with a high ratio of intraclass variability and interclass similarity. The classification framework was constructed via annotation using the following three techniques: annotation by binary classification, annotation by probabilistic latent semantic analysis, and annotation using top similar images. Next, the final annotation was constructed by applying ranking similarity to the keywords produced by each technique. The final annotation keywords were then divided into three levels according to the body region, the specific bone structure within the body region, and the imaging direction. Different weights were given to each level of keywords; these were then used to calculate the weightage for each category of medical images based on their ground truth annotation. The weightage computed from the generated annotation of a query image was compared with the weightage of each category of medical images, and the query image was assigned to the category with the weightage closest to its own. The average accuracy rate reported is 87.5%.
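The level-weighted matching step can be sketched as follows. The weights, category names and annotation terms below are invented for illustration, not the paper's values: keywords at three levels (body region, bone structure, imaging direction) contribute different weights, and the query is assigned to the category whose total weightage is closest to its own.

```python
# Hedged sketch of level-weighted annotation matching (weights and
# annotations are invented, not the paper's values).

WEIGHTS = {"region": 3.0, "structure": 2.0, "direction": 1.0}

def weightage(keywords):
    """keywords: dict mapping level -> set of annotation terms."""
    return sum(WEIGHTS[level] * len(terms) for level, terms in keywords.items())

categories = {
    "hand-ap":  {"region": {"upper limb"}, "structure": {"hand"}, "direction": {"AP"}},
    "chest-pa": {"region": {"chest"}, "structure": {"rib", "clavicle"}, "direction": {"PA"}},
}
query = {"region": {"upper limb"}, "structure": {"hand"}, "direction": {"AP"}}

qw = weightage(query)
best = min(categories, key=lambda c: abs(weightage(categories[c]) - qw))
print(best)  # -> hand-ap
```

In the paper the per-category weightage is derived from ground-truth annotations, so comparing weightages amounts to comparing annotation profiles at the three levels.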

  11. Introduction to annotated logics foundations for paracomplete and paraconsistent reasoning

    CERN Document Server

    Abe, Jair Minoro; Nakamatsu, Kazumi

    2015-01-01

    This book is written as an introduction to annotated logics. It provides logical foundations for annotated logics, discusses some interesting applications of these logics, and also includes the authors' contributions to annotated logics. The central idea of the book is to show how annotated logic can be applied as a tool to solve problems of technology and of applied science. The book will be of interest to pure and applied logicians, philosophers, and computer scientists as a monograph on a kind of paraconsistent logic, but lay readers will also profit from reading it.

  12. Applying active learning to high-throughput phenotyping algorithms for electronic health records data.

    Science.gov (United States)

    Chen, Yukun; Carroll, Robert J; Hinz, Eugenia R McPeek; Shah, Anushi; Eyler, Anne E; Denny, Joshua C; Xu, Hua

    2013-12-01

    Generalizable, high-throughput phenotyping methods based on supervised machine learning (ML) algorithms could significantly accelerate the use of electronic health records data for clinical and translational research. However, they often require large numbers of annotated samples, which are costly and time-consuming to review. We investigated the use of active learning (AL) in ML-based phenotyping algorithms. We integrated an uncertainty sampling AL approach with support vector machines-based phenotyping algorithms and evaluated its performance using three annotated disease cohorts including rheumatoid arthritis (RA), colorectal cancer (CRC), and venous thromboembolism (VTE). We investigated performance using two types of feature sets: unrefined features, which contained at least all clinical concepts extracted from notes and billing codes; and a smaller set of refined features selected by domain experts. The performance of the AL was compared with a passive learning (PL) approach based on random sampling. Our evaluation showed that AL outperformed PL on three phenotyping tasks. When unrefined features were used in the RA and CRC tasks, AL reduced the number of annotated samples required to achieve an area under the curve (AUC) score of 0.95 by 68% and 23%, respectively. AL also achieved a reduction of 68% for VTE with an optimal AUC of 0.70 using refined features. As expected, refined features improved the performance of phenotyping classifiers and required fewer annotated samples. This study demonstrated that AL can be useful in ML-based phenotyping methods. Moreover, AL and feature engineering based on domain knowledge could be combined to develop efficient and generalizable phenotyping methods.
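The uncertainty sampling strategy at the heart of the study can be sketched with a toy stand-in for the classifier. The study used support vector machines; here a one-dimensional linear score with an invented boundary plays that role, and at each annotation round the unlabelled sample closest to the decision boundary is queried.

```python
# Hedged sketch of uncertainty sampling for active learning (the study
# used SVMs; a toy 1-D linear score stands in here, and the data and
# boundary are invented).

def decision_score(x, w=1.0, b=-5.0):
    return w * x + b          # decision boundary at x = 5

unlabelled = [0.5, 2.0, 4.8, 5.3, 9.0]

def most_uncertain(pool):
    """The sample whose score is closest to zero, i.e. nearest the boundary."""
    return min(pool, key=lambda x: abs(decision_score(x)))

queried = []
pool = list(unlabelled)
for _ in range(2):            # two annotation rounds
    x = most_uncertain(pool)
    queried.append(x)
    pool.remove(x)

print(queried)  # the points nearest x = 5 are picked first
```

In a full active learning loop the classifier would be retrained after each queried label, shifting the boundary and hence which sample is most uncertain next.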

  13. Single cell dynamic phenotyping

    OpenAIRE

    Katherin Patsch; Chi-Li Chiu; Mark Engeln; Agus, David B.; Parag Mallick; Shannon M. Mumenthaler; Daniel Ruderman

    2016-01-01

    Live cell imaging has improved our ability to measure phenotypic heterogeneity. However, bottlenecks in imaging and image processing often make it difficult to differentiate interesting biological behavior from technical artifact. Thus there is a need for new methods that improve data quality without sacrificing throughput. Here we present a 3-step workflow to improve dynamic phenotype measurements of heterogeneous cell populations. We provide guidelines for image acquisition, phenotype track...

  14. EFFICIENT VIDEO ANNOTATIONS BY AN IMAGE GROUPS

    Directory of Open Access Journals (Sweden)

    K. Mahibalan

    2015-10-01

    Full Text Available Searching for desired events in uncontrolled videos is a challenging task, so research has mainly focused on learning concepts from large numbers of labelled videos. However, collecting the large amount of labelled video required for training event models under various conditions is time-consuming and labour-intensive. To avoid this problem, we propose to leverage abundant Web images for video annotation, since Web images are a rich source of information, with many events roughly annotated and captured under various conditions. Information from the Web is noisy, however, so brute-force knowledge transfer from images may hurt video annotation performance. We therefore propose a novel group-based domain adaptation learning framework that leverages different groups of knowledge (the source domain), queried from a Web image search engine, for consumer videos (the target domain). Unlike earlier methods that use multiple source domains of images, our method groups the Web images according to their intrinsic semantic relationships rather than their sources. Specifically, two types of groups, event-specific groups and concept-specific groups, are exploited to describe the event-level and concept-level semantic meanings of target-domain videos, respectively.

  15. Towards a Library of Standard Operating Procedures (SOPs) for (meta)genomic annotation

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Angiuoli, Samuel V.; Cochrane, Guy; Field, Dawn; Garrity, George; Gussman, Aaron; Kodira, Chinnappa D.; Klimke, William; Kyrpides, Nikos; Madupu, Ramana; Markowitz, Victor; Tatusova, Tatiana; Thomson, Nick; White, Owen

    2008-04-01

    Genome annotations describe the features of genomes and accompany sequences in genome databases. The methodologies used to generate genome annotation are diverse and typically vary amongst groups. Descriptions of the annotation procedure are helpful in interpreting genome annotation data. Standard Operating Procedures (SOPs) for genome annotation describe the processes that generate genome annotations. Some groups are currently documenting procedures but standards are lacking for structure and content of annotation SOPs. In addition, there is no central repository to store and disseminate procedures and protocols for genome annotation. We highlight the importance of SOPs for genome annotation and endorse a central online repository of SOPs.

  17. Ontology-based Knowledge Modeling of Collaborative Product Design for Multi-design Teams

    Institute of Scientific and Technical Information of China (English)

    王有远; 王发麟; 乐承毅; 张丹平; 宗琪

    2012-01-01

    To address the difficulty of sharing and reusing design knowledge in the process of collaborative product design across multiple teams, an ontology-based knowledge modeling method of collaborative product design for multi-design teams is proposed. A collaborative product design work model for multi-design teams is first established by analyzing the characteristics of collaborative product design, and an ontology construction framework for collaborative product design knowledge is put forward. Grounded in ontology modeling theory, and with mechanical product collaborative design as the research setting, the knowledge modeling process for multi-design teams is studied using classification and description methods, covering the classification of design knowledge and extraction of its concepts, the OWL description of design knowledge, and property definition. On this basis, a knowledge ontology model for mechanical product collaborative design is constructed. Finally, an application case illustrates the feasibility and validity of the proposed knowledge modeling method.

  18. iBeetle-Base: a database for RNAi phenotypes in the red flour beetle Tribolium castaneum.

    Science.gov (United States)

    Dönitz, Jürgen; Schmitt-Engel, Christian; Grossmann, Daniela; Gerischer, Lizzy; Tech, Maike; Schoppmeier, Michael; Klingler, Martin; Bucher, Gregor

    2015-01-01

    The iBeetle-Base (http://ibeetle-base.uni-goettingen.de) makes available annotations of RNAi phenotypes gathered in a large-scale RNAi screen in the red flour beetle Tribolium castaneum (the iBeetle screen). In addition, it provides access to sequence information and links for all Tribolium castaneum genes. The iBeetle-Base contains annotations of the phenotypes of several thousand genes knocked down during embryonic and metamorphic epidermis and muscle development, in addition to phenotypes linked to oogenesis and stink gland biology. The phenotypes are described according to the EQM (entity, quality, modifier) system using controlled vocabularies and the Tribolium morphological ontology (TrOn). Furthermore, images linked to the respective annotations are provided. The data are searchable either for specific phenotypes using a complex 'search for morphological defects' or via a 'quick search' for gene names and IDs. The red flour beetle Tribolium castaneum has become an important model system for insect functional genetics and is a representative of the most species-rich taxon, the Coleoptera, which comprises several devastating pests. It is used for studying insect-typical development, the evolution of development, and for research on metabolism and pest control. Besides Drosophila, Tribolium is the first insect model organism in which large-scale unbiased screens have been performed.
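A 'search for morphological defects' over EQM-style annotations can be sketched as filtering on controlled-vocabulary fields. The records below are invented, not taken from iBeetle-Base, and the flat dictionaries stand in for ontology-backed terms.

```python
# Sketch of querying EQM (entity, quality, modifier) phenotype annotations
# with controlled-vocabulary strings. Records are invented, not from
# iBeetle-Base, and gene names are hypothetical.

annotations = [
    {"gene": "Tc-geneA", "entity": "elytron", "quality": "size", "modifier": "decreased"},
    {"gene": "Tc-geneB", "entity": "leg",     "quality": "shape", "modifier": "abnormal"},
    {"gene": "Tc-geneC", "entity": "elytron", "quality": "size", "modifier": "increased"},
]

def search(records, **criteria):
    """Return genes whose annotation matches every given EQM criterion."""
    return [r["gene"] for r in records
            if all(r.get(k) == v for k, v in criteria.items())]

print(search(annotations, entity="elytron", quality="size"))
```

A real implementation would expand entity terms through the TrOn hierarchy (e.g. matching parts of the elytron), which simple string equality cannot do.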

  19. Phenotype definition in epilepsy.

    Science.gov (United States)

    Winawer, Melodie R

    2006-05-01

    Phenotype definition consists of the use of epidemiologic, biological, molecular, or computational methods to systematically select features of a disorder that might result from distinct genetic influences. By carefully defining the target phenotype, or dividing the sample by phenotypic characteristics, we can hope to narrow the range of genes that influence risk for the trait in the study population, thereby increasing the likelihood of finding them. In this article, fundamental issues that arise in phenotyping in epilepsy and other disorders are reviewed, and factors complicating genotype-phenotype correlation are discussed. Methods of data collection, analysis, and interpretation are addressed, focusing on epidemiologic studies. With this foundation in place, the epilepsy subtypes and clinical features that appear to have a genetic basis are described, and the epidemiologic studies that have provided evidence for the heritability of these phenotypic characteristics, supporting their use in future genetic investigations, are reviewed. Finally, several molecular approaches to phenotype definition are discussed, in which the molecular defect, rather than the clinical phenotype, is used as a starting point.

  20. Prediction of gene-phenotype associations in humans, mice, and plants using phenologs.

    Science.gov (United States)

    Woods, John O; Singh-Blom, Ulf Martin; Laurent, Jon M; McGary, Kriston L; Marcotte, Edward M

    2013-06-21

    Phenotypes and diseases may be related to seemingly dissimilar phenotypes in other species by means of the orthology of underlying genes. Such "orthologous phenotypes," or "phenologs," are examples of deep homology, and may be used to predict additional candidate disease genes. In this work, we develop an unsupervised algorithm for ranking phenolog-based candidate disease genes through the integration of predictions from the k nearest neighbor phenologs, comparing classifiers and weighting functions by cross-validation. We also improve upon the original method by extending the theory to paralogous phenotypes. Our algorithm makes use of additional phenotype data--from chicken, zebrafish, and E. coli, as well as new datasets for C. elegans--establishing that several types of annotations may be treated as phenotypes. We demonstrate the use of our algorithm to predict novel candidate genes for human atrial fibrillation (such as HRH2, ATP4A, ATP4B, and HOPX) and epilepsy (e.g., PAX6 and NKX2-1). We suggest gene candidates for pharmacologically-induced seizures in mouse, solely based on orthologous phenotypes from E. coli. We also explore the prediction of plant gene-phenotype associations, as for the Arabidopsis response to vernalization phenotype. We are able to rank gene predictions for a significant portion of the diseases in the Online Mendelian Inheritance in Man database. Additionally, our method suggests candidate genes for mammalian seizures based only on bacterial phenotypes and gene orthology. We demonstrate that phenotype information may come from diverse sources, including drug sensitivities, gene ontology biological processes, and in situ hybridization annotations. Finally, we offer testable candidates for a variety of human diseases, plant traits, and other classes of phenotypes across a wide array of species.
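
    The phenolog method above ranks candidate disease genes by integrating predictions from the k nearest neighbor phenotypes. A minimal sketch of one plausible similarity-weighted voting scheme — the data and the weighting function are hypothetical; the paper itself compares several classifiers and weighting functions by cross-validation:

    ```python
    from collections import defaultdict

    def rank_candidates(neighbor_phenotypes, k=3):
        """Rank candidate genes by summing similarity-weighted votes from the
        k most similar model-organism phenotypes (a simplified phenolog-style scheme).

        neighbor_phenotypes: list of (similarity, gene_set) pairs, where gene_set
        holds the orthologous candidate genes suggested by that phenotype.
        """
        top_k = sorted(neighbor_phenotypes, key=lambda p: p[0], reverse=True)[:k]
        scores = defaultdict(float)
        for similarity, genes in top_k:
            for gene in genes:
                scores[gene] += similarity
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    # Hypothetical neighbor phenotypes for a human disease (similarities invented)
    neighbors = [
        (0.9, {"HRH2", "ATP4A"}),
        (0.7, {"ATP4A", "HOPX"}),
        (0.2, {"PAX6"}),
        (0.6, {"ATP4B"}),
    ]
    ranked = rank_candidates(neighbors, k=3)
    print(ranked)
    ```

    Genes suggested by several similar phenotypes accumulate votes, so here the hypothetical ATP4A ranks first while PAX6, backed only by a low-similarity neighbor outside the top k, is dropped.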

  1. An Annotated Bibliography of Experimental Research concerning Competitive Swimming.

    Science.gov (United States)

    Bachman, John C.

    This annotated bibliography has been compiled as a guide for the researcher of swimming in referring to experimental studies in the physiological, mechanical, psychological, and medical aspects of swimming. The studies have been briefly annotated to enable the reader to quickly determine the salient points the authors made in their studies. The…

  2. Mastery Learning and Mastery Testing: An Annotated ERIC Bibliography.

    Science.gov (United States)

    Wildemuth, Barbara M., Comp.

    This 136-item annotated bibliography on mastery learning and mastery testing is the result of a computer search of the ERIC data base in February 1977. All entries are listed alphabetically by author. An abstract or annotation is provided for each entry. A subject index is included reflecting the major emphasis of each citation. (RC)

  3. From the Margins to the Center: The Future of Annotation.

    Science.gov (United States)

    Wolfe, Joanna L.; Neuwirth, Christine M.

    2001-01-01

    Describes the importance of annotation to reading and writing practices and reviews new technologies that complicate the ways annotation can be used to support and enhance traditional reading, writing, and collaboration processes. Emphasizes issues and methods that will be productive for enhancing theories of workplace and classroom communication…

  4. The GATO gene annotation tool for research laboratories

    Directory of Open Access Journals (Sweden)

    A. Fujita

    2005-11-01

    Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a Bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from anywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  5. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of human genome project is near the end, the work has begun for discovering novel genes from genome sequences and annotating their biological functions. Here are reviewed current major bioinformatics tools and technologies available for large scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  6. Use of Laplacian Projection Technique for Summarizing Likert Scale Annotations

    OpenAIRE

    Tanveer, M. Iftekhar

    2015-01-01

    Summarizing Likert scale ratings from human annotators is an important step for collecting human judgments. In this project we study a novel, graph theoretic method for this purpose. We also analyze a few interesting properties for this approach using real annotation datasets.
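
    A graph-theoretic summary of Likert ratings can be sketched with a standard spectral embedding. The following is a generic illustration under assumed details (an RBF similarity graph over items, the unnormalized Laplacian, projection onto the Fiedler vector), not necessarily the author's exact formulation:

    ```python
    import numpy as np

    def laplacian_projection(ratings):
        """Project items onto the Fiedler vector of a rating-similarity graph.

        ratings: (n_items, n_annotators) array of Likert scores.
        Returns one coordinate per item; items rated similarly land close together.
        """
        R = np.asarray(ratings, dtype=float)
        # Similarity graph: RBF kernel on pairwise distances between rating vectors
        d2 = ((R[:, None, :] - R[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (d2.mean() + 1e-12))
        np.fill_diagonal(W, 0.0)
        L = np.diag(W.sum(1)) - W           # unnormalized graph Laplacian
        vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
        return vecs[:, 1]                   # Fiedler vector (second-smallest eigenvalue)

    # Hypothetical data: three annotators rating four items on a 1-5 scale
    coords = laplacian_projection([[5, 4, 5], [4, 5, 5], [1, 2, 1], [2, 1, 1]])
    ```

    With two clearly separated groups of items, the Fiedler coordinates place the two highly rated items on one side of zero and the two low-rated items on the other, giving a one-dimensional summary of the annotation structure.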

  7. K-Nearest Neighbors Relevance Annotation Model for Distance Education

    Science.gov (United States)

    Ke, Xiao; Li, Shaozi; Cao, Donglin

    2011-01-01

    With the rapid development of Internet technologies, distance education has become a popular educational mode. In this paper, the authors propose an online image automatic annotation distance education system, which could effectively help children learn interrelations between image content and corresponding keywords. Image automatic annotation is…

  8. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc;

    in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... prove useful for less heritable traits such as diseases and fertility...

  9. JAFA: a protein function annotation meta-server

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Godzik, Adam

    2006-01-01

    With the high number of sequences and structures streaming in from genomic projects, there is a need for more powerful and sophisticated annotation tools. Most problematic of the annotation efforts is predicting gene and protein function. Over the past few years there has been considerable progre...

  10. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa, but refined annotation of these genomes is crucial for deriving scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key problem to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of non-protein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and imperfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences, the elements most directly related to biological function. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool is tolerant of sequencing errors, retaining its capabilities on highly fragmented genomes and on mixed sequences coming from several genomes (such as those obtained from metagenomic samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure based entirely on cloud computing (Amazon Web Services).
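
    The focus on coding regions mentioned above can be illustrated with a toy forward-strand ORF scanner. This is a naive sketch for illustration only: BG7 itself calls genes by protein similarity, not by ORF scanning, and real gene callers handle both strands and alternative start codons.

    ```python
    def find_orfs(seq, min_len=30):
        """Find open reading frames (ATG ... in-frame stop) on the forward strand.

        Returns (start, end, coding_codons) tuples, 0-based, end exclusive.
        min_len filters out ORFs shorter than min_len coding codons.
        """
        stops = {"TAA", "TAG", "TGA"}
        orfs = []
        for frame in range(3):                       # three forward reading frames
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if codon == "ATG" and start is None:
                    start = i                        # open an ORF at the first ATG
                elif codon in stops and start is not None:
                    orfs.append((start, i + 3, (i - start) // 3))
                    start = None                     # close the ORF at the stop
        return [o for o in orfs if o[2] >= min_len]

    genome = "CCATGAAATTTGGGTAACC"  # toy sequence with ATG ... TAA in one frame
    print(find_orfs(genome, min_len=3))  # [(2, 17, 4)]
    ```

    In a real bacterial genome the absence of introns means such contiguous coding regions map directly to proteins, which is what makes the protein-centered paradigm workable.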

  11. Orienteering: An Annotated Bibliography = Orientierungslauf: Eine kommentierte Bibliographie.

    Science.gov (United States)

    Seiler, Roland, Ed.; Hartmann, Wolfgang, Ed.

    1994-01-01

    Annotated bibliography of 220 books, monographs, and journal articles on orienteering published 1984-94, from SPOLIT database of the Federal Institute of Sport Science (Cologne, Germany). Annotations in English or German. Ten sections including psychological, physiological, health, sociological, and environmental aspects; training and coaching;…

  12. The RAST Server: Rapid Annotations using Subsystems Technology

    Directory of Open Access Journals (Sweden)

    Overbeek Ross A

    2008-02-01

    Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

  13. Solving the Problem: Genome Annotation Standards before the Data Deluge

    Science.gov (United States)

    Klimke, William; O'Donovan, Claire; White, Owen; Brister, J. Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D.; Tatusova, Tatiana

    2011-01-01

    The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries. PMID:22180819

  14. Product annotations - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    File name: kome_product_annotation.zip. File URL: ftp://ftp.biosciencedbc.jp/archiv...

  15. Behavioral Contributions to "Teaching of Psychology": An Annotated Bibliography

    Science.gov (United States)

    Karsten, A. M.; Carr, J. E.

    2008-01-01

    An annotated bibliography that summarizes behavioral contributions to the journal "Teaching of Psychology" from 1974 to 2006 is provided. A total of 116 articles of potential utility to college-level instructors of behavior analysis and related areas were identified, annotated, and organized into nine categories for ease of accessibility.…

  16. Collaborative Paper-Based Annotation of Lecture Slides

    Science.gov (United States)

    Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max

    2009-01-01

    In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…

  17. Online Metacognitive Strategies, Hypermedia Annotations, and Motivation on Hypertext Comprehension

    Science.gov (United States)

    Shang, Hui-Fang

    2016-01-01

    This study examined the effect of online metacognitive strategies, hypermedia annotations, and motivation on reading comprehension in a Taiwanese hypertext environment. A path analysis model was proposed based on the assumption that if English as a foreign language learners frequently use online metacognitive strategies and hypermedia annotations,…

  19. Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction.

    Science.gov (United States)

    Névéol, Aurélie; Islamaj Doğan, Rezarta; Lu, Zhiyong

    2011-04-01

    Information processing algorithms require significant amounts of annotated data for training and testing. The availability of such data is often hindered by the complexity and high cost of production. In this paper, we investigate the benefits of a state-of-the-art tool to help with the semantic annotation of a large set of biomedical queries. Seven annotators were recruited to annotate a set of 10,000 PubMed® queries with 16 biomedical and bibliographic categories. About half of the queries were annotated from scratch, while the other half were automatically pre-annotated and manually corrected. The impact of the automatic pre-annotations was assessed on several aspects of the task: time, number of actions, annotator satisfaction, inter-annotator agreement, quality and number of the resulting annotations. The analysis of annotation results showed that the number of required hand annotations is 28.9% less when using pre-annotated results from automatic tools. As a result, the overall annotation time was substantially lower when pre-annotations were used, while inter-annotator agreement was significantly higher. In addition, there was no statistically significant difference in the semantic distribution or number of annotations produced when pre-annotations were used. The annotated query corpus is freely available to the research community. This study shows that automatic pre-annotations are found helpful by most annotators. Our experience suggests using an automatic tool to assist large-scale manual annotation projects. This helps speed-up the annotation time and improve annotation consistency while maintaining high quality of the final annotations.

  20. Annotated bibliography of software engineering laboratory literature

    Science.gov (United States)

    Kistler, David; Bristow, John; Smith, Don

    1994-01-01

    This document is an annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory. Nearly 200 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. This document has been updated and reorganized substantially since the original version (SEL-82-006, November 1982). All materials have been grouped into eight general subject areas for easy reference: (1) The Software Engineering Laboratory; (2) The Software Engineering Laboratory: Software Development Documents; (3) Software Tools; (4) Software Models; (5) Software Measurement; (6) Technology Evaluations; (7) Ada Technology; and (8) Data Collection. This document contains an index of these publications classified by individual author.

  1. On Semantic Annotation in Clarin-PL Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska-Toszewa

    2015-12-01

    In the article, the authors present a proposal for semantic annotation in Clarin-PL parallel corpora: Polish-Bulgarian-Russian and Polish-Lithuanian ones. Semantic annotation of quantification is a novum in developing sentence-level semantics in multilingual parallel corpora. This is why our semantic annotation is manual. The authors hope it will be of interest to IT specialists working on automatic processing of the given natural languages. Semantic annotation as defined here will make contrastive studies of natural languages more efficient, which in turn will help verify the results of those studies, and will certainly improve human and machine translation.

  2. Semantic annotation of clinical events for generating a problem list.

    Science.gov (United States)

    Mowery, Danielle L; Jordan, Pamela; Wiebe, Janyce; Harkema, Henk; Dowling, John; Chapman, Wendy W

    2013-01-01

    We present a pilot study of an annotation schema representing problems and their attributes, along with their relationship to temporal modifiers. We evaluated the ability of humans to annotate clinical reports using the schema and assessed the contribution of semantic annotations in determining the status of a problem mention as active, inactive, proposed, resolved, negated, or other. Our hypothesis is that the schema captures semantic information useful for generating an accurate problem list. Clinical named entities such as reference events, time points, time durations, aspectual phase, and ordering words, together with their relationships, including modifications and ordering relations, can be annotated by humans with low to moderate recall. Once identified, most attributes can be annotated with low to moderate agreement. Some attributes (Experiencer, Existence, and Certainty) are more informative than others (Intermittency and Generalized/Conditional) for predicting a problem mention's status. A support vector machine outperformed Naïve Bayes and decision tree classifiers for predicting a problem's status.

  3. About Certain Semantic Annotation in Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska-Toszewa

    2015-06-01

    The semantic notation analyzed in this work belongs to the second stream of semantic theories presented here, namely direct-approach semantics. We used this stream in our work on the Bulgarian-Polish Contrastive Grammar. Our semantic notation distinguishes quantificational meanings of names and predicates, and indicates aspectual and temporal meanings of verbs. It relies on logical scope-based quantification and on the contemporary theory of processes known as “Petri nets”. Thanks to it, we can distinguish precisely between a language form and its contents; e.g., a perfective verb form has two meanings: an event, or a sequence of events and states finally ended with an event. An imperfective verb form also has two meanings: a state, or a sequence of states and events finally ended with a state. In turn, names are quantified universally or existentially when they are “undefined”, and uniquely (using the iota operator) when they are “defined”. A fact worth emphasizing is the possibility of quantifying not only names but also the predicate, in which case quantification concerns time and aspect. This is a novum in elaborating sentence-level semantics in parallel corpora. For this reason, our semantic notation is manual. We hope that it will raise the interest of computer scientists working on automatic methods for processing the given natural languages. Semantic annotation as defined in this work will facilitate contrastive studies of natural languages, which in turn will verify the results of those studies and will certainly facilitate human and machine translation.

  4. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    Science.gov (United States)

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our annotation is currently limited to a single…
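
    The "rich feature set" that made the conditional random fields effective typically means per-token lexical and context features. A minimal sketch of the kind of features fed to a CRF sequence labeller — the feature names and choices here are illustrative, not PhenoCHF's actual configuration:

    ```python
    def token_features(tokens, i):
        """Per-token features of the kind typically fed to a CRF sequence labeller
        for phenotype extraction (illustrative, not PhenoCHF's exact feature set)."""
        w = tokens[i]
        feats = {
            "word.lower": w.lower(),     # normalized surface form
            "word.isupper": w.isupper(), # orthographic cues (acronyms, emphasis)
            "word.istitle": w.istitle(),
            "word.isdigit": w.isdigit(),
            "prefix3": w[:3],            # crude morphology
            "suffix3": w[-3:],
        }
        if i > 0:
            feats["prev.lower"] = tokens[i - 1].lower()  # left context
        else:
            feats["BOS"] = True
        if i < len(tokens) - 1:
            feats["next.lower"] = tokens[i + 1].lower()  # right context
        else:
            feats["EOS"] = True
        return feats

    sent = ["Patient", "has", "congestive", "heart", "failure"]
    feats = [token_features(sent, i) for i in range(len(sent))]
    ```

    Feature dictionaries like these, paired with BIO-style labels, are the standard input to CRF toolkits; the labeller then learns which combinations signal the start or continuation of a phenotype mention.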

  5. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze, and index large quantities of marine imagery for exploration and monitoring activities, a range of specialized image annotation tools has been developed worldwide. Image annotation, the process of transposing objects or events represented in a video or still image to the semantic level, may involve human interaction and computer-assisted solutions. Marine image annotation software (MIAS) has enabled over 500 publications to date. We review functioning, application trends, and developments by comparing general and advanced features of 23 different tools used in underwater image analysis. MIAS requiring human input are essentially graphical user interfaces with a video player or image browser that recognizes a specific time code or image code, allowing events to be logged in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software in their capability to integrate data associated with video collection, the simplest being the position coordinates of the video recording platform. MIAS have three main modes of operation: annotating events in real time, annotating after acquisition, and interacting with a database. The tools range from simple annotation interfaces to full onboard data management systems with a variety of toolboxes. Advanced packages allow data from multiple sensors or multiple annotators to be input and displayed via intranet or internet. Tools for posterior human-mediated annotation often include data display and image analysis features, e.g., length, area, image segmentation, and point counts, and in a few cases the possibility of browsing and editing previous dive logs or analyzing the annotations. Interaction with a database allows automatic integration of annotations from different surveys, repeated and collaborative annotation of shared datasets, and browsing and querying of data. Progress in automated annotation is mostly in post-processing, for stable platforms or still images…
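
    The time-stamped, geo-referenced event logging common to these tools can be sketched as a minimal log structure. This is a generic illustration, not any specific MIAS tool's format:

    ```python
    import datetime

    class AnnotationLog:
        """Minimal time-stamped, geo-referenced event log of the kind MIAS tools
        maintain while an annotator watches a dive video (a generic sketch)."""

        def __init__(self):
            self.events = []

        def log(self, label, timecode, lat=None, lon=None):
            """Record one observation at a video time code, with optional platform position."""
            self.events.append({
                "label": label,
                "timecode": timecode,        # video time code, in seconds
                "position": (lat, lon),      # platform coordinates, if available
                "logged_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            })

        def query(self, label):
            """All events with a given label, in time-code order (a simple DB-style query)."""
            return sorted((e for e in self.events if e["label"] == label),
                          key=lambda e: e["timecode"])

    log = AnnotationLog()
    log.log("coral", 12.4, lat=38.5, lon=-28.6)
    log.log("fish", 15.0)
    log.log("coral", 3.2)
    ```

    The database interaction described in the abstract generalizes the `query` step: once events carry consistent labels, time codes, and positions, annotations from different surveys can be merged and queried automatically.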

  6. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  7. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    Science.gov (United States)

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  8. The caBIG annotation and image Markup project.

    Science.gov (United States)

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Sepukar, Kastubh; Rubin, Daniel L

    2010-04-01

    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of metadata about who, where, and how the image was acquired, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the set of graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotation to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with both of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.

  9. Open semantic annotation of scientific publications using DOMEO

    Directory of Open Access Journals (Sweden)

    Ciccarese Paolo

    2012-04-01

    Full Text Available Abstract Background Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. Methods The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. Results We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF’s next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework.

  10. Making adjustments to event annotations for improved biological event extraction.

    Science.gov (United States)

    Baek, Seung-Cheol; Park, Jong C

    2016-09-16

    Current state-of-the-art approaches to biological event extraction train statistical models in a supervised manner on corpora annotated with event triggers and event-argument relations. Inspecting such corpora, we observe that there is ambiguity in the span of event triggers (e.g., "transcriptional activity" vs. "transcriptional"), leading to inconsistencies across event trigger annotations. Such inconsistencies make it quite likely that similar phrases are annotated with different spans of event triggers, suggesting the possibility that a statistical learning algorithm misses an opportunity for generalizing from such event triggers. We anticipate that adjustments to the span of event triggers to reduce these inconsistencies would meaningfully improve the present performance of event extraction systems. In this study, we look into this possibility with the corpora provided by the 2009 BioNLP shared task as a proof of concept. We propose an Informed Expectation-Maximization (EM) algorithm, which trains models using the EM algorithm with a posterior regularization technique that consults the gold-standard event trigger annotations in the form of constraints. We further propose four constraints on the possible event trigger annotations to be explored by the EM algorithm. The algorithm is shown to outperform the state-of-the-art algorithm on the development corpus in a statistically significant manner and on the test corpus by a narrow margin. The analysis of the annotations generated by the algorithm shows that there are various types of ambiguity in event annotations, even though they could be small in number.
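    The span inconsistency the authors target can be made concrete with a small check: group annotated triggers by their head word and report heads that appear with more than one distinct span. This is only an illustration of the problem, not the paper's Informed EM algorithm; the trigger strings are toy examples.

```python
# Sketch: flag inconsistent event-trigger spans by grouping annotated
# triggers on their head (last) word. Toy data; illustrates the span
# ambiguity motivating the paper, not its EM-based solution.
from collections import defaultdict

def find_span_inconsistencies(triggers):
    """triggers: list of annotated trigger strings.
    Returns {head_word: set of distinct spans} for heads with >1 span."""
    by_head = defaultdict(set)
    for span in triggers:
        head = span.split()[-1].lower()
        by_head[head].add(span.lower())
    return {h: spans for h, spans in by_head.items() if len(spans) > 1}

corpus_triggers = [
    "transcriptional activity", "activity", "expression",
    "gene expression", "binding",
]
inconsistent = find_span_inconsistencies(corpus_triggers)
```

    An EM-style span adjuster would then pick, per head word, the span assignment that maximizes corpus-wide consistency subject to constraints from the gold annotations.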

  11. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    Directory of Open Access Journals (Sweden)

    Jianfang Cao

    2015-01-01

    Full Text Available With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.
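    The core idea of a fuzzy membership degree can be sketched with a triangular membership function mapping a classifier score to graded emotion labels. The labels, breakpoints, and input score below are illustrative assumptions, not the paper's trained Adaboost/BP pipeline.

```python
# Sketch: fuzzy membership degrees for the emotional semantics of an image.
# A triangular membership function maps a hypothetical classifier score in
# [0, 1] to degrees for three emotion labels (labels/breakpoints invented).

def triangular(x, a, b, c):
    """Triangular membership function peaking at b on support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def emotion_memberships(score):
    return {
        "calm":     triangular(score, -0.5, 0.0, 0.5),
        "pleasant": triangular(score,  0.0, 0.5, 1.0),
        "exciting": triangular(score,  0.5, 1.0, 1.5),
    }

m = emotion_memberships(0.25)  # a score between "calm" and "pleasant"
```

    Because memberships overlap, a single image can carry partial degrees of several emotions, which is exactly the graded, subjective quality the fuzzy approach is meant to capture.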

  12. Ontology modularization to improve semantic medical image annotation.

    Science.gov (United States)

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming and labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results.
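    One common way to extract a task-relevant fragment of a large ontology is to collect seed terms and all their ancestors along is-a edges. The sketch below uses a toy anatomy hierarchy with invented terms; it is not the modularization algorithm or ontology used in the article.

```python
# Sketch: extract a small ontology module for an annotation task by taking
# seed terms plus all transitive ancestors along is-a edges. Toy hierarchy.

def extract_module(is_a, seeds):
    """is_a: dict term -> list of parent terms; seeds: relevant terms.
    Returns the set of seeds plus all transitive ancestors."""
    module, stack = set(), list(seeds)
    while stack:
        term = stack.pop()
        if term in module:
            continue
        module.add(term)
        stack.extend(is_a.get(term, []))
    return module

is_a = {
    "lymph node": ["lymphoid organ"],
    "lymphoid organ": ["organ"],
    "organ": ["anatomical structure"],
    "femur": ["bone"],
    "bone": ["anatomical structure"],
}
module = extract_module(is_a, ["lymph node"])
```

    Annotators for a lymphoma use case would then see only the lymphatic fragment instead of the full anatomy ontology, which is the efficiency gain modularization aims for.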

  13. Fuzzy emotional semantic analysis and automated annotation of scene images.

    Science.gov (United States)

    Cao, Jianfang; Chen, Lichao

    2015-01-01

    With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.

  14. BIM and ontology-based modelling and inference for leakage of building pipeline

    Institute of Scientific and Technical Information of China (English)

    杜娟

    2015-01-01

    The AEC industry has evolved from paper drawings to 2D CAD drawing and on to component-based BIM. Information technology improves the efficiency and quality of construction projects and shortens the construction cycle. However, the traditional way of integrating and sharing construction information is based on a BIM model base and a relational database. This makes it difficult to realize flexible, user-centric organization and transfer of information, and it greatly increases the volume of information carried, resulting in slow system response and low efficiency. In this paper, through an analysis of information integration and exchange in the construction and engineering industry, an ontology-based building information integration mechanism is proposed. On the basis of a construction information classification, the paper puts forward a mapping mechanism between the building ontology and the building information model. Finally, taking pipeline leakage in the operation and maintenance stage as an example, through the establishment of and reasoning over the building ontology, the paper uses CPN Tools and Jena to locate the leaking section, determine the leakage causes and finalize the maintenance solutions.

  15. MimoSA: a system for minimotif annotation

    Directory of Open Access Journals (Sweden)

    Kundeti Vamsi

    2010-06-01

    Full Text Available Abstract Background Minimotifs are short peptide sequences within one protein, which are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. There are reports of many minimotifs in the primary literature, which have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA enables the visualization, organization, selection and editing functions of minimotifs and their attributes in the MnM database. For the literature components, MimoSA provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach, which is based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high performance relational model of minimotif syntax. MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to

  16. An annotation system for 3D fluid flow visualization

    Science.gov (United States)

    Loughlin, Maria M.; Hughes, John F.

    1995-01-01

    Annotation is a key activity of data analysis. However, current systems for data analysis focus almost exclusively on visualization. We propose a system which integrates annotations into a visualization system. Annotations are embedded in 3D data space, using the Post-it metaphor. This embedding allows contextual-based information storage and retrieval, and facilitates information sharing in collaborative environments. We provide a traditional database filter and a Magic Lens filter to create specialized views of the data. The system has been customized for fluid flow applications, with features which allow users to store parameters of visualization tools and sketch 3D volumes.
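    The Post-it metaphor described above suggests a simple data model: an annotation anchored at a point in 3D data space, carrying its text and the visualization-tool parameters in force when it was written, plus spatial filters for specialized views. The field names and box filter below are illustrative assumptions, not the cited system's actual design.

```python
# Sketch: annotations embedded in 3D data space, Post-it style, with a
# simple box filter standing in for a Magic Lens view. Invented field names.
from dataclasses import dataclass

@dataclass
class Annotation3D:
    position: tuple          # (x, y, z) anchor in data space
    text: str
    tool_params: dict        # stored visualization-tool parameters

def in_box(ann, lo, hi):
    """True if the annotation's anchor lies inside the axis-aligned box."""
    return all(l <= p <= h for p, l, h in zip(ann.position, lo, hi))

notes = [
    Annotation3D((0.1, 0.2, 0.3), "vortex core here", {"tool": "streamline"}),
    Annotation3D((0.9, 0.9, 0.9), "outflow boundary", {"tool": "probe"}),
]
# A lens focused on the lower corner of the volume shows only nearby notes:
visible = [a for a in notes if in_box(a, (0, 0, 0), (0.5, 0.5, 0.5))]
```

    Anchoring notes in data space rather than screen space is what makes contextual retrieval and collaborative sharing possible: every viewer sees each note attached to the same flow feature.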

  17. An analysis on the entity annotations in biological corpora

    OpenAIRE

    Mariana Neves

    2014-01-01

    Collection of documents annotated with semantic entities and relationships are crucial resources to support development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotations for entity types were classified into six semantic groups and an overview on the semantic entities which can be found in each corpus is shown. Results show that while some semantic entities, suc...

  18. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    Energy Technology Data Exchange (ETDEWEB)

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent

  19. A resource-based Korean morphological annotation system

    CERN Document Server

    Huh, Hyun-Gue

    2007-01-01

    We describe a resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. The output of our system is a graph of morphemes annotated with accurate linguistic information. The language resources used by the system can be easily updated, which allows users to control the evolution of the performances of the system. We show that morphological annotation of Korean text can be performed directly with a lexicon of words and without morphological rules.
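    A lexicon-driven, rule-free analysis of an agglutinative string can be sketched as greedy longest-match segmentation against a word list. The toy lexicon below is romanized and invented for readability; the actual system builds a full morpheme graph rather than a single greedy path.

```python
# Sketch: lexicon-only morphological segmentation in the spirit of a
# resource-based annotator: greedy longest match, no morphological rules.
# Romanized toy lexicon (hak-kyo "school", e "to", ga-nda "goes").

def segment(text, lexicon):
    """Greedily split text into the longest lexicon entries, left to right."""
    morphemes, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest match first
            if text[i:j] in lexicon:
                morphemes.append(text[i:j])
                i = j
                break
        else:
            return None   # unanalyzable with this lexicon alone
    return morphemes

lexicon = {"hak", "kyo", "hakkyo", "e", "ga", "nda"}
result = segment("hakkyoeganda", lexicon)
```

    Returning a graph of all alternative segmentations, as the paper's system does, defers ambiguity resolution to later processing instead of committing to one greedy path.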

  20. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...... and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross...

  1. Annotating abstract pronominal anaphora in the DAD project

    DEFF Research Database (Denmark)

    Navarretta, Costanza; Olsen, Sussi Anni

    2008-01-01

    In this paper we present an extension of the MATE/GNOME annotation scheme for anaphora (Poesio 2004) which accounts for abstract anaphora in Danish and Italian. By abstract anaphora is here meant pronouns whose linguistic antecedents are verbal phrases, clauses and discourse segments....... The extended scheme, which we call the DAD annotation scheme, makes it possible to annotate information about abstract anaphora which is important for investigating their use, see Webber (1988), Gundel et al. (2003), Navarretta (2004), and which can influence their automatic treatment. Intercoder agreement scores obtained

  2. AtPID: a genome-scale resource for genotype–phenotype associations in Arabidopsis

    Science.gov (United States)

    Lv, Qi; Lan, Yiheng; Shi, Yan; Wang, Huan; Pan, Xia; Li, Peng; Shi, Tieliu

    2017-01-01

    AtPID (Arabidopsis thaliana Protein Interactome Database, available at http://www.megabionet.org/atpid) is an integrated database resource for protein interaction networks and functional annotation. In the past few years, we collected 5564 mutants with significant morphological alterations and manually curated them to 167 plant ontology (PO) morphology categories. These single/multiple-gene mutants were indexed and linked to 3919 genes. After integrating these genotype–phenotype associations with the comprehensive protein interaction network in AtPID, we developed a Naïve Bayes method and predicted 4457 novel high confidence gene-PO pairs with 1369 genes as the complement. Along with the accumulated novel data for protein interaction and functional annotation, and the updated visualization toolkits, we present a genome-scale resource for genotype–phenotype associations for Arabidopsis in AtPID 5.0. In our updated website, all the new genotype–phenotype associations from mutants, the protein network, and the protein annotation information can be vividly displayed in a comprehensive network view, which will greatly enhance plant protein function and genotype–phenotype association studies in a systematical way. PMID:27899679
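    Predicting a gene-PO association from an interaction network can be illustrated with a guilt-by-association score: the fraction of a gene's interaction partners already annotated with the candidate term. This is a rough stand-in for the Naïve Bayes method used in AtPID, with invented gene IDs and PO labels.

```python
# Sketch: score candidate gene-phenotype (PO term) associations from a
# protein interaction network. A simplified guilt-by-association proxy for
# AtPID's Naive Bayes predictor; all genes and labels are hypothetical.

def score_term(gene, term, network, annotations):
    """Fraction of the gene's interaction partners annotated with `term`."""
    partners = network.get(gene, set())
    if not partners:
        return 0.0
    hits = sum(1 for p in partners if term in annotations.get(p, set()))
    return hits / len(partners)

network = {"AT1G01010": {"AT2G02020", "AT3G03030", "AT4G04040"}}
annotations = {
    "AT2G02020": {"dwarf"},
    "AT3G03030": {"dwarf", "late flowering"},
    "AT4G04040": {"late flowering"},
}
dwarf_score = score_term("AT1G01010", "dwarf", network, annotations)
```

    A Naïve Bayes formulation would additionally weight each partner's evidence by term frequency priors rather than treating all partners equally, but the neighborhood signal exploited is the same.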

  3. Tool for rapid annotation of microbial SNPs (TRAMS): a simple program for rapid annotation of genomic variation in prokaryotes.

    Science.gov (United States)

    Reumerman, Richard A; Tucker, Nicholas P; Herron, Paul R; Hoskisson, Paul A; Sangal, Vartul

    2013-09-01

    Next generation sequencing (NGS) has been widely used to study genomic variation in a variety of prokaryotes. Single nucleotide polymorphisms (SNPs) resulting from genomic comparisons need to be annotated for their functional impact on the coding sequences. We have developed a program, TRAMS, for functional annotation of genomic SNPs, which is available to download as a single-file executable for Windows users with limited computational experience and as a Python script for Mac OS and Linux users. TRAMS needs a tab-delimited text file containing SNP locations, the reference nucleotide and SNPs in variant strains, along with a reference genome sequence in GenBank or EMBL format. SNPs are annotated as synonymous, nonsynonymous or nonsense. Nonsynonymous SNPs in start and stop codons are separated as non-start and non-stop SNPs, respectively. SNPs in multiple overlapping features are annotated separately for each feature and multiple nucleotide polymorphisms within a codon are combined before annotation. We have also developed a workflow for Galaxy, a highly used tool for analysing NGS data, to map short reads to a reference genome and extract and annotate the SNPs. TRAMS is a simple program for rapid and accurate annotation of SNPs that will be very useful for microbiologists in analysing genomic diversity in microbial populations.
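    The synonymous/nonsynonymous/nonsense distinction TRAMS draws can be computed from the codon table alone. The sketch below is an illustrative reimplementation, not TRAMS code, and it omits TRAMS's non-start handling and the merging of multiple SNPs within one codon.

```python
# Sketch: classify a coding SNP from its codon context using the standard
# genetic code (NCBI table 1, bases ordered T, C, A, G).

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_snp(codon, pos_in_codon, alt_base):
    """Annotate a SNP given its reference codon, the 0-based position of the
    SNP within that codon, and the alternate base."""
    alt_codon = codon[:pos_in_codon] + alt_base + codon[pos_in_codon + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "non-stop"
    return "nonsynonymous"
```

    For example, a change in the third position of GAA (Glu) to GAG is synonymous, while changing the third base of TGC (Cys) to A creates the stop codon TGA and is annotated as nonsense.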

  4. Semantic Annotation for Biological Information Retrieval System

    Directory of Open Access Journals (Sweden)

    Mohamed Marouf Z. Oshaiba

    2015-01-01

    Full Text Available Online literature is increasing at a tremendous rate, and the biological domain is one of the fastest growing. Biological researchers face the problem of finding what they are searching for effectively and efficiently. The aim of this research is to find documents that contain any combination of biological process and/or molecular function and/or cellular component. This research proposes a framework that helps researchers to retrieve meaningful documents related to their asserted terms based on gene ontology (GO). The system utilizes GO by semantically decomposing it into three subontologies (cellular component, biological process, and molecular function). Researchers have the flexibility to choose search terms from any combination of the three subontologies. Document annotation is performed in this research to create an index of biological terms in documents to speed up the searching process. Query expansion is used to infer terms semantically related to the asserted terms; it increases the meaningfulness of search results by using term synonyms and term relationships. The system uses a ranking method to order the retrieved documents based on ranking weights. The proposed system meets researchers’ needs to find documents that fit the asserted terms semantically.
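    Query expansion and ranking of the kind described can be sketched in a few lines: expand each query term with its synonyms, then order documents by how many expanded terms they mention. The synonym table, documents, and equal term weights are toy assumptions, not actual GO data or the paper's ranking weights.

```python
# Sketch: GO-style query expansion plus simple weighted ranking. The
# synonym lists and documents are invented examples.

SYNONYMS = {
    "apoptosis": ["programmed cell death"],
    "nucleus": ["cell nucleus"],
}

def expand(terms):
    """Add known synonyms to the asserted query terms."""
    out = set(terms)
    for t in terms:
        out.update(SYNONYMS.get(t, []))
    return out

def rank(docs, query_terms):
    """Order documents by how many expanded query terms they mention."""
    expanded = expand(query_terms)
    scored = [(sum(t in text.lower() for t in expanded), name)
              for name, text in docs.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

docs = {
    "doc1": "Programmed cell death in the cell nucleus.",
    "doc2": "Apoptosis pathways reviewed.",
    "doc3": "Photosynthesis in chloroplasts.",
}
hits = rank(docs, ["apoptosis"])
```

    Expansion is what lets the query "apoptosis" retrieve doc1, which never uses that word; a production system would also follow is-a and part-of relationships in GO, not just synonym lists.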

  5. Annotation and Curation of Uncharacterized proteins- Challenges

    Directory of Open Access Journals (Sweden)

    Johny Ijaq

    2015-03-01

    Full Text Available Hypothetical proteins are proteins predicted to be expressed from an open reading frame (ORF), constituting a substantial fraction of proteomes in both prokaryotes and eukaryotes. Genome projects have led to the identification of many therapeutic targets, the putative functions of proteins and their interactions. In this review we survey the various methods. Annotation linked to structural and functional prediction of hypothetical proteins assists in the discovery of new structures and functions, serving as markers and pharmacological targets for drug design, discovery and screening. Mass spectrometry is an analytical technique for validating protein characterization; matrix-assisted laser desorption ionization–mass spectrometry (MALDI-MS) is an efficient analytical method. Microarrays and protein expression profiles help in understanding biological systems through a systems-wide study of proteins and their interactions with other proteins and non-proteinaceous molecules that control complex processes in cells, tissues and even whole organisms. Next-generation sequencing technology accelerates multiple areas of genomics research.

  6. Filtering "genic" open reading frames from genomic DNA samples for advanced annotation

    Directory of Open Access Journals (Sweden)

    Sblattero Daniele

    2011-06-01

    Full Text Available Abstract Background In order to carry out experimental gene annotation, DNA encoding open reading frames (ORFs) derived from real genes (termed "genic") in the correct frame is required. When genes are correctly assigned, isolation of genic DNA for functional annotation can be carried out by PCR. However, not all genes are correctly assigned, and even when correctly assigned, gene products are often incorrectly folded when expressed in heterologous hosts. This is a problem that can sometimes be overcome by the expression of protein fragments encoding domains, rather than full-length proteins. One possible method to isolate DNA encoding such domains would be to "filter" complex DNA (cDNA libraries, genomic and metagenomic DNA) for gene fragments that confer a selectable phenotype relying on correct folding, with all such domains present in a complex DNA sample termed the “domainome”. Results In this paper we discuss the preparation of diverse genic ORF libraries from randomly fragmented genomic DNA using β-lactamase to filter out the open reading frames. By cloning DNA fragments between leader sequences and the mature β-lactamase gene, colonies can be selected for resistance to ampicillin, conferred by correct folding of the lactamase gene. Our experiments demonstrate that the majority of surviving colonies contain genic open reading frames, suggesting that β-lactamase is acting as a selectable folding reporter. Furthermore, different leaders (Sec, TAT and SRP), normally translocating different protein classes, filter different genic fragment subsets, indicating that their use increases the fraction of the “domainome” that is accessible. Conclusions The availability of ORF libraries, obtained with the filtering method described here, combined with screening methods such as phage display and protein-protein interaction studies, or with protein structure determination projects, can lead to the identification and structural determination of
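    The frame logic behind the β-lactamase filter can be mimicked in silico: a cloned fragment only keeps the downstream reporter translatable if it is read in frame with no internal stop codon. The sketch below checks exactly that; the real selection is of course biological (ampicillin resistance via folded β-lactamase), and this toy check ignores leader sequences and vector context.

```python
# Sketch: an in-silico analogue of the beta-lactamase ORF filter. A random
# genomic fragment "passes" only if it can be read in frame with no stop
# codon, so a downstream fused reporter would still be translated.

STOPS = {"TAA", "TAG", "TGA"}

def passes_filter(fragment):
    """True if the fragment keeps a downstream reporter in frame, stop-free."""
    if len(fragment) % 3 != 0:   # would shift the reporter out of frame
        return False
    codons = [fragment[i:i + 3] for i in range(0, len(fragment), 3)]
    return not any(c in STOPS for c in codons)
```

    Random non-genic fragments fail this test roughly in proportion to their length (two of three frames are wrong, and stop codons accumulate), which is why surviving colonies are enriched for genuine gene fragments.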

  7. Genomic analyses with biofilter 2.0: knowledge driven filtering, annotation, and model development.

    Science.gov (United States)

    Pendergrass, Sarah A; Frase, Alex; Wallace, John; Wolfe, Daniel; Katiyar, Neerja; Moore, Carrie; Ritchie, Marylyn D

    2013-12-30

    tool that provides a flexible way to use the ever-expanding expert biological knowledge that exists to direct filtering, annotation, and complex predictive model development for elucidating the etiology of complex phenotypic outcomes.

  8. DNApod: DNA polymorphism annotation database from next-generation sequence read archives

    Science.gov (United States)

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924

  9. MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences.

    Science.gov (United States)

    Zhidkov, Ilia; Nagar, Tal; Mishmar, Dan; Rubin, Eitan

    2011-11-01

    The use of Next-Generation Sequencing of mitochondrial DNA is becoming widespread in biological and clinical research. This, in turn, creates a need for a convenient tool that detects and analyzes heteroplasmy. Here we present MitoBamAnnotator, a user friendly web-based tool that allows maximum flexibility and control in heteroplasmy research. MitoBamAnnotator provides the user with a comprehensively annotated overview of mitochondrial genetic variation, allowing for an in-depth analysis with no prior knowledge in programming.
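    Heteroplasmy detection of the kind such a tool reports boils down to per-position allele fractions: at a given mtDNA position, how large is the minor-allele fraction across aligned reads? The thresholds and counts below are toy values chosen for illustration, not MitoBamAnnotator's actual parameters.

```python
# Sketch: call a heteroplasmic position from per-base read counts, the kind
# of summary a heteroplasmy annotator reports. Toy thresholds and counts.

def heteroplasmy(counts, min_fraction=0.05, min_depth=100):
    """counts: dict base -> read count at one mtDNA position.
    Returns (is_heteroplasmic, minor_allele_fraction)."""
    depth = sum(counts.values())
    if depth < min_depth:
        return (False, 0.0)   # too shallow to call reliably
    major = max(counts.values())
    minor_fraction = 1 - major / depth
    return (minor_fraction >= min_fraction, round(minor_fraction, 3))

site = {"A": 880, "G": 115, "C": 3, "T": 2}   # 1000x depth, 12% minor alleles
call = heteroplasmy(site)
```

    A production caller would additionally filter on base quality and strand balance before reporting, since sequencing error alone can mimic low-level heteroplasmy.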

  10. Annotation and Deletion: Outline of a Sociology of Forms

    Directory of Open Access Journals (Sweden)

    Axel Pohn-Weidinger

    2012-05-01

    Full Text Available This text examines the graphical traces left on a corpus of social housing application forms: annotations, erasures, crossings-out and scribbled comments. The study of these traces, left in the margins of the printed administrative form's categories as it is filled in, shows invoking the law to be a problematic operation. Applicants must describe their life situation so as to establish their eligibility for a right, but very often it is impossible to translate that situation into the form's pre-established categories. The annotations and comments left on the form then attempt to open the legal categorization of situations to a consideration of the singularity of the applicant's life circumstances. They show invoking the law as a reflexive accomplishment, a work both on one's own perception of one's situation and on the perception the institution offers through the form, whose negotiation and implementation are at the heart of the production of the administrative file.

  11. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers who wish to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support for complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.
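
    The record above describes storing hierarchical annotation documents as native XML and querying them with XQuery-style predicates. As an illustration only, the following Python sketch runs comparable path-and-predicate queries over an invented, drastically simplified stand-in for an AIM-style document; the real AIM information model and the paper's XQuery extensions are far richer than this.

    ```python
    import xml.etree.ElementTree as ET

    # Hypothetical, simplified stand-in for an AIM-style annotation document.
    # Element names and attributes here are invented for illustration.
    doc = """
    <ImageAnnotation>
      <TextAnnotation label="lesion" confidence="0.92"/>
      <TextAnnotation label="calcification" confidence="0.41"/>
      <GeometricShape type="polyline">
        <Point x="10" y="12"/><Point x="40" y="18"/>
      </GeometricShape>
    </ImageAnnotation>
    """

    root = ET.fromstring(doc)

    # Path query: collect every text annotation anywhere in the document
    # (ElementTree's findall supports a subset of XPath).
    labels = [e.get("label") for e in root.findall(".//TextAnnotation")]
    print(labels)  # ['lesion', 'calcification']

    # Attribute predicate, mimicking the kind of filter an XQuery engine
    # evaluates natively over the stored XML.
    confident = [e.get("label")
                 for e in root.findall(".//TextAnnotation")
                 if float(e.get("confidence")) > 0.5]
    print(confident)  # ['lesion']
    ```

    A native XML database evaluates such predicates inside the storage engine rather than after parsing whole documents, which is what makes deep hierarchies like AIM's queryable at scale.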

  12. Geothermal wetlands: an annotated bibliography of pertinent literature

    Energy Technology Data Exchange (ETDEWEB)

    Stanley, N.E.; Thurow, T.L.; Russell, B.F.; Sullivan, J.F.

    1980-05-01

    This annotated bibliography covers the following topics: algae, wetland ecosystems; institutional aspects; macrophytes - general, production rates, and mineral absorption; trace metal absorption; wetland soils; water quality; and other aspects of marsh ecosystems. (MHR)

  13. Analysis of Annotation on Documents for Recycling Information

    Science.gov (United States)

    Nakai, Tomohiro; Kondo, Nobuyuki; Kise, Koichi; Matsumoto, Keinosuke

    In order to make collaborative business activities fruitful, it is essential to know the characteristics of organizations and persons in more detail and to gather information relevant to the activities. In this paper, we describe a notion of "information recycling" that realizes these requirements by analyzing documents. The key to recycling information is to utilize annotations on documents as clues for generating users' profiles and for weighting contents in the context of the activities. We also propose a method of extracting annotations from paper documents at the press of a single button, using camera-based document image analysis techniques. Experimental results demonstrate that the method is fundamentally capable of acquiring annotations from paper documents, provided that annotation-free electronic versions are available for processing.
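
    The extraction step described above relies on having an annotation-free electronic version of each page: anything present in the captured image but absent from the original is, in principle, an annotation. The toy sketch below illustrates that core differencing idea on tiny hand-made grayscale grids (0 = black ink, 255 = white paper); the registration and perspective-correction steps that a real camera-based system needs are omitted, and the images are assumed pre-aligned.

    ```python
    def extract_annotations(clean, annotated, threshold=50):
        """Binary mask of pixels markedly darker in the annotated image
        than in the clean original (grayscale, 0 = dark, 255 = light)."""
        h, w = len(clean), len(clean[0])
        return [[1 if clean[y][x] - annotated[y][x] > threshold else 0
                 for x in range(w)] for y in range(h)]

    # Invented 3x3 example pages.
    clean = [[255, 255, 255],
             [  0, 255, 255],   # one printed-text pixel
             [255, 255, 255]]
    annotated = [[255,  30, 255],   # pen stroke over blank paper
                 [  0, 255,  20],   # printed pixel unchanged; new stroke at right
                 [255, 255, 255]]

    mask = extract_annotations(clean, annotated)
    print(mask)  # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
    ```

    Note that the printed pixel is correctly excluded from the mask because it is equally dark in both versions; only the new strokes survive the difference.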

  14. Responsibility in Governmental-Political Communication: A Selected, Annotated Bibliography.

    Science.gov (United States)

    Johannesen, Richard L.

    This annotated bibliography lists 43 books, periodicals, and essays in the area of governmental-political communication. Topics include: social justice, lying, cheating, ethics, public duties, public policy, language, rhetorical strategies, and propaganda. (MS)

  15. An analysis on the entity annotations in biological corpora.

    Science.gov (United States)

    Neves, Mariana

    2014-01-01

    Collections of documents annotated with semantic entities and relationships are crucial resources for supporting the development and evaluation of text mining solutions in the biomedical domain. Here I present an overview of 36 corpora and an analysis of the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals, are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.

  16. An Annotated Checklist of the Fishes of Samoa

    Data.gov (United States)

    US Fish and Wildlife Service, Department of the Interior — All fishes currently known from the Samoan Islands are listed by their scientific and Samoan names. Species entries are annotated to include the initial Samoan...

  17. Freedom of Speech: A Selected, Annotated Basic Bibliography.

    Science.gov (United States)

    Tedford, Thomas L.

    This bibliography lists 36 books related to problems of freedom of speech. General sources (history, analyses, texts, and anthologies) are listed separately from those dealing with censorship of obscenity and pornography. Each entry is briefly annotated. (AA)

  18. Bayesian Framework for Automatic Image Annotation Using Visual Keywords

    Science.gov (United States)

    Agrawal, Rajeev; Wu, Changhua; Grosky, William; Fotouhi, Farshad

    In this paper, we propose a Bayesian probability-based framework, which uses visual keywords and already available text keywords to annotate images automatically. Taking a cue from document classification, an image can be considered a document and the objects present in it as words. Using this concept, we can create visual keywords by dividing an image into tiles based on a certain template size. Visual keywords are simple vector quantizations of small image tiles. We estimate the conditional probability of a text keyword in the presence of visual keywords, described by a multivariate Gaussian distribution. We demonstrate the effectiveness of our approach by comparing predicted text annotations with manual annotations, and we analyze the effect of text annotation length on performance.
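
    The "visual keyword" construction the abstract describes can be sketched in two steps: slice the image into fixed-size tiles, then assign each tile to its nearest codebook vector (vector quantization). In the minimal illustration below the codebook is hand-picked and the image is a tiny invented grid; in the paper the codebook would be learned from data, and a multivariate Gaussian would model the keyword likelihoods rather than the raw assignments shown here.

    ```python
    def tiles(image, size):
        """Yield flattened size x size tiles of a 2D grayscale image,
        scanning left to right, top to bottom."""
        for y in range(0, len(image), size):
            for x in range(0, len(image[0]), size):
                yield [image[y + dy][x + dx]
                       for dy in range(size) for dx in range(size)]

    def nearest(codebook, tile):
        """Index of the codebook vector closest to the tile (squared L2)."""
        return min(range(len(codebook)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(codebook[i], tile)))

    # Invented 4x4 image split into 2x2 tiles -> four tiles.
    image = [[0, 0, 9, 9],
             [0, 0, 9, 9],
             [9, 9, 0, 0],
             [9, 9, 0, 0]]
    codebook = [[0, 0, 0, 0], [9, 9, 9, 9]]  # "dark" / "bright" visual keywords

    visual_keywords = [nearest(codebook, t) for t in tiles(image, 2)]
    print(visual_keywords)  # [0, 1, 1, 0]
    ```

    Once every tile is mapped to a discrete visual keyword, estimating P(text keyword | visual keywords) reduces to fitting a distribution over these indices, which is the Bayesian step the paper evaluates.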

  19. Effects of dehydration on performance in man: Annotated bibliography

    Science.gov (United States)

    Greenleaf, J. E.

    1973-01-01

    A compilation of studies on the effect of dehydration on human performance and related physiological mechanisms. The annotations are listed in alphabetical order by first author and cover material through June 1973.

  20. Using Apollo to browse and edit genome annotations.

    Science.gov (United States)

    Misra, Sima; Harris, Nomi

    2006-01-01

    An annotation is any feature that can be tied to genomic sequence, such as an exon, transcript, promoter, or transposable element. As biological knowledge increases, annotations of different types need to be added and modified, and links to other sources of information need to be incorporated, to allow biologists to easily access all of the available sequence analysis data and design appropriate experiments. The Apollo genome browser and editor offers biologists these capabilities. Apollo can display many different types of computational evidence, such as alignments and similarities based on BLAST searches (UNITS 3.3 & 3.4), and enables biologists to utilize computational evidence to create and edit gene models and other genomic features, e.g., using experimental evidence to refine exon-intron structures predicted by gene prediction algorithms. This protocol describes simple ways to browse genome annotation data, as well as techniques for editing annotations and loading data from different sources.
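
    Annotations of the kind Apollo displays and edits (exons, transcripts, and other features tied to coordinates on a sequence) are commonly exchanged in the GFF3 format: one tab-separated record per feature, with parent/child links encoded in the attributes column. The sketch below parses an invented GFF3 fragment; the column layout follows the GFF3 specification, but the coordinates and identifiers are made up for illustration.

    ```python
    # Invented GFF3 fragment: a gene, its transcript, and two exons.
    GFF = """\
    chr2L\texample\tgene\t7529\t9484\t.\t+\t.\tID=gene1
    chr2L\texample\tmRNA\t7529\t9484\t.\t+\t.\tID=tx1;Parent=gene1
    chr2L\texample\texon\t7529\t8116\t.\t+\t.\tParent=tx1
    chr2L\texample\texon\t8193\t9484\t.\t+\t.\tParent=tx1
    """

    def parse_gff(text):
        """Parse GFF3 lines into dicts with typed coordinates and attributes."""
        features = []
        for line in text.strip().splitlines():
            (seqid, source, ftype, start, end,
             score, strand, phase, attrs) = line.strip().split("\t")
            attributes = dict(kv.split("=") for kv in attrs.split(";"))
            features.append({"seqid": seqid, "type": ftype,
                             "start": int(start), "end": int(end),
                             "strand": strand, "attributes": attributes})
        return features

    features = parse_gff(GFF)
    exons = [(f["start"], f["end"]) for f in features if f["type"] == "exon"]
    print(exons)  # [(7529, 8116), (8193, 9484)]
    ```

    Editing a gene model in a tool like Apollo amounts to revising records such as these, e.g. moving an exon boundary to match experimental evidence, while keeping the Parent links between exon, transcript, and gene intact.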