WorldWideScience

Sample records for public biology databases

  1. DATABASES DEVELOPED IN INDIA FOR BIOLOGICAL SCIENCES

    Directory of Open Access Journals (Sweden)

    Gitanjali Yadav

    2017-09-01

    Full Text Available The complexity of biological systems requires use of a variety of experimental methods with ever increasing sophistication to probe various cellular processes at molecular and atomic resolution. The availability of technologies for determining nucleic acid sequences of genes and atomic resolution structures of biomolecules prompted development of major biological databases like GenBank and PDB almost four decades ago. India was one of the few countries to realize early, the utility of such databases for progress in modern biology/biotechnology. Department of Biotechnology (DBT, India established Biotechnology Information System (BTIS network in late eighties. Starting with the genome sequencing revolution at the turn of the century, application of high-throughput sequencing technologies in biology and medicine for analysis of genomes, transcriptomes, epigenomes and microbiomes have generated massive volumes of sequence data. BTIS network has not only provided state of the art computational infrastructure to research institutes and universities for utilizing various biological databases developed abroad in their research, it has also actively promoted research and development (R&D projects in Bioinformatics to develop a variety of biological databases in diverse areas. It is encouraging to note that, a large number of biological databases or data driven software tools developed in India, have been published in leading peer reviewed international journals like Nucleic Acids Research, Bioinformatics, Database, BMC, PLoS and NPG series publication. Some of these databases are not only unique, they are also highly accessed as reflected in number of citations. Apart from databases developed by individual research groups, BTIS has initiated consortium projects to develop major India centric databases on Mycobacterium tuberculosis, Rice and Mango, which can potentially have practical applications in health and agriculture. Many of these biological

  2. Bibliographical database of radiation biological dosimetry and risk assessment: Part 2

    International Nuclear Information System (INIS)

    Straume, T.; Ricker, Y.; Thut, M.

    1990-09-01

    This is part 11 of a database constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and key worded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on authors, key words, title, year, journal name, or publication number. Photocopies of the publications contained in the database are maintained in a file that is numerically arranged by our publication acquisition numbers. This volume contains 1048 additional entries, which are listed in alphabetical order by author. The computer software used for the database is a simple but sophisticated relational database program that permits quick information access, high flexibility, and the creation of customized reports. This program is inexpensive and is commercially available for the Macintosh and the IBM PC. Although the database entries were made using a Macintosh computer, we have the capability to convert the files into the IBM PC version. As of this date, the database cites 2260 publications. Citations in the database are from 200 different scientific journals. There are also references to 80 books and published symposia, and 158 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed within the scientific literature, although a few journals clearly predominate. The journals publishing the largest number of relevant papers are Health Physics, with a total of 242 citations in the database, and Mutation Research, with 185 citations. Other journals with over 100 citations in the database, are Radiation Research, with 136, and International Journal of Radiation Biology, with 132

  3. SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

    Science.gov (United States)

    Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

    2014-08-15

    Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.

  4. Bibliographical database of radiation biological dosimetry and risk assessment: Part 1, through June 1988

    Energy Technology Data Exchange (ETDEWEB)

    Straume, T.; Ricker, Y.; Thut, M.

    1988-08-29

    This database was constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and key worded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on publication number, authors, key words, title, year, and journal name. Photocopies of all publications contained in the database are maintained in a file that is numerically arranged by citation number. This report of the database is provided as a useful reference and overview. It should be emphasized that the database will grow as new citations are added to it. With that in mind, we arranged this report in order of ascending citation number so that follow-up reports will simply extend this document. The database cite 1212 publications. Publications are from 119 different scientific journals, 27 of these journals are cited at least 5 times. It also contains reference to 42 books and published symposia, and 129 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed among the scientific literature, although a few journals clearly dominate. The four journals publishing the largest number of relevant papers are Health Physics, Mutation Research, Radiation Research, and International Journal of Radiation Biology. Publications in Health Physics make up almost 10% of the current database.

  5. Bibliographical database of radiation biological dosimetry and risk assessment: Part 1, through June 1988

    International Nuclear Information System (INIS)

    Straume, T.; Ricker, Y.; Thut, M.

    1988-01-01

    This database was constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and key worded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on publication number, authors, key words, title, year, and journal name. Photocopies of all publications contained in the database are maintained in a file that is numerically arranged by citation number. This report of the database is provided as a useful reference and overview. It should be emphasized that the database will grow as new citations are added to it. With that in mind, we arranged this report in order of ascending citation number so that follow-up reports will simply extend this document. The database cite 1212 publications. Publications are from 119 different scientific journals, 27 of these journals are cited at least 5 times. It also contains reference to 42 books and published symposia, and 129 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed among the scientific literature, although a few journals clearly dominate. The four journals publishing the largest number of relevant papers are Health Physics, Mutation Research, Radiation Research, and International Journal of Radiation Biology. Publications in Health Physics make up almost 10% of the current database

  6. Database Publication Practices

    DEFF Research Database (Denmark)

    Bernstein, P.A.; DeWitt, D.; Heuer, A.

    2005-01-01

    There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems.......There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems....

  7. The Importance of Biological Databases in Biological Discovery.

    Science.gov (United States)

    Baxevanis, Andreas D; Bateman, Alex

    2015-06-19

    Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.

  8. A dedicated database system for handling multi-level data in systems biology.

    Science.gov (United States)

    Pornputtapong, Natapol; Wanichthanarak, Kwanjeera; Nilsson, Avlant; Nookaew, Intawat; Nielsen, Jens

    2014-01-01

    Advances in high-throughput technologies have enabled extensive generation of multi-level omics data. These data are crucial for systems biology research, though they are complex, heterogeneous, highly dynamic, incomplete and distributed among public databases. This leads to difficulties in data accessibility and often results in errors when data are merged and integrated from varied resources. Therefore, integration and management of systems biological data remain very challenging. To overcome this, we designed and developed a dedicated database system that can serve and solve the vital issues in data management and hereby facilitate data integration, modeling and analysis in systems biology within a sole database. In addition, a yeast data repository was implemented as an integrated database environment which is operated by the database system. Two applications were implemented to demonstrate extensibility and utilization of the system. Both illustrate how the user can access the database via the web query function and implemented scripts. These scripts are specific for two sample cases: 1) Detecting the pheromone pathway in protein interaction networks; and 2) Finding metabolic reactions regulated by Snf1 kinase. In this study we present the design of database system which offers an extensible environment to efficiently capture the majority of biological entities and relations encountered in systems biology. Critical functions and control processes were designed and implemented to ensure consistent, efficient, secure and reliable transactions. The two sample cases on the yeast integrated data clearly demonstrate the value of a sole database environment for systems biology research.

  9. BioCarian: search engine for exploratory searches in heterogeneous biological databases.

    Science.gov (United States)

    Zaki, Nazar; Tennakoon, Chandana

    2017-10-02

    There are a large number of biological databases publicly available for scientists in the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in the recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in semantic web. Heterogeneous databases can be converted into Resource Description Format (RDF) and queried using SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and have additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For the advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search

  10. 24 CFR 81.72 - Public-use database and public information.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database containing...

  11. A dedicated database system for handling multi-level data in systems biology

    OpenAIRE

    Pornputtapong, Natapol; Wanichthanarak, Kwanjeera; Nilsson, Avlant; Nookaew, Intawat; Nielsen, Jens

    2014-01-01

    Background Advances in high-throughput technologies have enabled extensive generation of multi-level omics data. These data are crucial for systems biology research, though they are complex, heterogeneous, highly dynamic, incomplete and distributed among public databases. This leads to difficulties in data accessibility and often results in errors when data are merged and integrated from varied resources. Therefore, integration and management of systems biological data remain very challenging...

  12. ChemProt: a disease chemical biology database

    DEFF Research Database (Denmark)

    Taboureau, Olivier; Nielsen, Sonny Kim; Audouze, Karine Marie Laure

    2011-01-01

    Systems pharmacology is an emergent area that studies drug action across multiple scales of complexity, from molecular and cellular to tissue and organism levels. There is a critical need to develop network-based approaches to integrate the growing body of chemical biology knowledge with network...... biology. Here, we report ChemProt, a disease chemical biology database, which is based on a compilation of multiple chemical-protein annotation resources, as well as disease-associated protein-protein interactions (PPIs). We assembled more than 700 000 unique chemicals with biological annotation for 30...... evaluation of environmental chemicals, natural products and approved drugs, as well as the selection of new compounds based on their activity profile against most known biological targets, including those related to adverse drug events. Results from the disease chemical biology database associate citalopram...

  13. BioModels Database: a repository of mathematical models of biological processes.

    Science.gov (United States)

    Chelliah, Vijayalakshmi; Laibe, Camille; Le Novère, Nicolas

    2013-01-01

    BioModels Database is a public online resource that allows storing and sharing of published, peer-reviewed quantitative, dynamic models of biological processes. The model components and behaviour are thoroughly checked to correspond the original publication and manually curated to ensure reliability. Furthermore, the model elements are annotated with terms from controlled vocabularies as well as linked to relevant external data resources. This greatly helps in model interpretation and reuse. Models are stored in SBML format, accepted in SBML and CellML formats, and are available for download in various other common formats such as BioPAX, Octave, SciLab, VCML, XPP and PDF, in addition to SBML. The reaction network diagram of the models is also available in several formats. BioModels Database features a search engine, which provides simple and more advanced searches. Features such as online simulation and creation of smaller models (submodels) from the selected model elements of a larger one are provided. BioModels Database can be accessed both via a web interface and programmatically via web services. New models are available in BioModels Database at regular releases, about every 4 months.

  14. Biological Sample Monitoring Database (BSMDBS)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Biological Sample Monitoring Database System (BSMDBS) was developed for the Northeast Fisheries Regional Office and Science Center (NER/NEFSC) to record and...

  15. Answering biological questions: Querying a systems biology database for nutrigenomics

    NARCIS (Netherlands)

    Evelo, C.T.; Bochove, K. van; Saito, J.T.

    2011-01-01

    The requirement of systems biology for connecting different levels of biological research leads directly to a need for integrating vast amounts of diverse information in general and of omics data in particular. The nutritional phenotype database addresses this challenge for nutrigenomics. A

  16. Critical evaluation of the JDO API for the persistence and portability requirements of complex biological databases

    Directory of Open Access Journals (Sweden)

    Schwieger Michael

    2005-01-01

    Full Text Available Abstract Background Complex biological database systems have become key computational tools used daily by scientists and researchers. Many of these systems must be capable of executing on multiple different hardware and software configurations and are also often made available to users via the Internet. We have used the Java Data Object (JDO persistence technology to develop the database layer of such a system known as the SigPath information management system. SigPath is an example of a complex biological database that needs to store various types of information connected by many relationships. Results Using this system as an example, we perform a critical evaluation of current JDO technology; discuss the suitability of the JDO standard to achieve portability, scalability and performance. We show that JDO supports portability of the SigPath system from a relational database backend to an object database backend and achieves acceptable scalability. To answer the performance question, we have created the SigPath JDO application benchmark that we distribute under the Gnu General Public License. This benchmark can be used as an example of using JDO technology to create a complex biological database and makes it possible for vendors and users of the technology to evaluate the performance of other JDO implementations for similar applications. Conclusions The SigPath JDO benchmark and our discussion of JDO technology in the context of biological databases will be useful to bioinformaticians who design new complex biological databases and aim to create systems that can be ported easily to a variety of database backends.

  17. Database Support for Research in Public Administration

    Science.gov (United States)

    Tucker, James Cory

    2005-01-01

    This study examines the extent to which databases support student and faculty research in the area of public administration. A list of journals in public administration, public policy, political science, public budgeting and finance, and other related areas was compared to the journal content list of six business databases. These databases…

  18. Causal biological network database: a comprehensive platform of causal biological network models focused on the pulmonary and vascular systems.

    Science.gov (United States)

    Boué, Stéphanie; Talikka, Marja; Westra, Jurjen Willem; Hayes, William; Di Fabio, Anselmo; Park, Jennifer; Schlage, Walter K; Sewer, Alain; Fields, Brett; Ansari, Sam; Martin, Florian; Veljkovic, Emilija; Kenney, Renee; Peitsch, Manuel C; Hoeng, Julia

    2015-01-01

    With the wealth of publications and data available, powerful and transparent computational approaches are required to represent measured data and scientific knowledge in a computable and searchable format. We developed a set of biological network models, scripted in the Biological Expression Language, that reflect causal signaling pathways across a wide range of biological processes, including cell fate, cell stress, cell proliferation, inflammation, tissue repair and angiogenesis in the pulmonary and cardiovascular context. This comprehensive collection of networks is now freely available to the scientific community in a centralized web-based repository, the Causal Biological Network database, which is composed of over 120 manually curated and well annotated biological network models and can be accessed at http://causalbionet.com. The website accesses a MongoDB, which stores all versions of the networks as JSON objects and allows users to search for genes, proteins, biological processes, small molecules and keywords in the network descriptions to retrieve biological networks of interest. The content of the networks can be visualized and browsed. Nodes and edges can be filtered and all supporting evidence for the edges can be browsed and is linked to the original articles in PubMed. Moreover, networks may be downloaded for further visualization and evaluation. Database URL: http://causalbionet.com © The Author(s) 2015. Published by Oxford University Press.

  19. Use of Graph Database for the Integration of Heterogeneous Biological Data.

    Science.gov (United States)

    Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young

    2017-03-01

    Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.

  20. BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.

    Science.gov (United States)

    Saccone, Scott F; Quan, Jiaxi; Jones, Peter L

    2012-04-15

    Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.

  1. Saccharomyces genome database informs human biology

    OpenAIRE

    Skrzypek, Marek S; Nash, Robert S; Wong, Edith D; MacPherson, Kevin A; Hellerstedt, Sage T; Engel, Stacia R; Karra, Kalpana; Weng, Shuai; Sheppard, Travis K; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Cherry, J Michael

    2017-01-01

    Abstract The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is an expertly curated database of literature-derived functional information for the model organism budding yeast, Saccharomyces cerevisiae. SGD constantly strives to synergize new types of experimental data and bioinformatics predictions with existing data, and to organize them into a comprehensive and up-to-date information resource. The primary mission of SGD is to facilitate research into the biology of yeast and...

  2. Publications of Australian LIS Academics in Databases

    Science.gov (United States)

    Wilson, Concepcion S.; Boell, Sebastian K.; Kennan, Mary Anne; Willard, Patricia

    2011-01-01

    This paper examines aspects of journal articles published from 1967 to 2008, located in eight databases, and authored or co-authored by academics serving for at least two years in Australian LIS programs from 1959 to 2008. These aspects are: inclusion of publications in databases, publications in journals, authorship characteristics of…

  3. A public database of macromolecular diffraction experiments.

    Science.gov (United States)

    Grabowski, Marek; Langner, Karol M; Cymborowski, Marcin; Porebski, Przemyslaw J; Sroka, Piotr; Zheng, Heping; Cooper, David R; Zimmerman, Matthew D; Elsliger, Marc André; Burley, Stephen K; Minor, Wladek

    2016-11-01

    The low reproducibility of published experimental results in many scientific disciplines has recently garnered negative attention in scientific journals and the general media. Public transparency, including the availability of `raw' experimental data, will help to address growing concerns regarding scientific integrity. Macromolecular X-ray crystallography has led the way in requiring the public dissemination of atomic coordinates and a wealth of experimental data, making the field one of the most reproducible in the biological sciences. However, there remains no mandate for public disclosure of the original diffraction data. The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) has been developed to archive raw data from diffraction experiments and, equally importantly, to provide related metadata. Currently, the database of our resource contains data from 2920 macromolecular diffraction experiments (5767 data sets), accounting for around 3% of all depositions in the Protein Data Bank (PDB), with their corresponding partially curated metadata. IRRMC utilizes distributed storage implemented using a federated architecture of many independent storage servers, which provides both scalability and sustainability. The resource, which is accessible via the web portal at http://www.proteindiffraction.org, can be searched using various criteria. All data are available for unrestricted access and download. The resource serves as a proof of concept and demonstrates the feasibility of archiving raw diffraction data and associated metadata from X-ray crystallographic studies of biological macromolecules. The goal is to expand this resource and include data sets that failed to yield X-ray structures in order to facilitate collaborative efforts that will improve protein structure-determination methods and to ensure the availability of `orphan' data left behind for various reasons by individual investigators and/or extinct structural genomics

  4. Integr8: enhanced inter-operability of European molecular biology databases.

    Science.gov (United States)

    Kersey, P J; Morris, L; Hermjakob, H; Apweiler, R

    2003-01-01

    The increasing production of molecular biology data in the post-genomic era, and the proliferation of databases that store it, require the development of an integrative layer in database services to facilitate the synthesis of related information. The solution of this problem is made more difficult by the absence of universal identifiers for biological entities, and the breadth and variety of available data. Integr8 was modelled using UML (Universal Modelling Language). Integr8 is being implemented as an n-tier system using a modern object-oriented programming language (Java). An object-relational mapping tool, OJB, is being used to specify the interface between the upper layers and an underlying relational database. The European Bioinformatics Institute is launching the Integr8 project. Integr8 will be an automatically populated database in which we will maintain stable identifiers for biological entities, describe their relationships with each other (in accordance with the central dogma of biology), and store equivalences between identified entities in the source databases. Only core data will be stored in Integr8, with web links to the source databases providing further information. Integr8 will provide the integrative layer of the next generation of bioinformatics services from the EBI. Web-based interfaces will be developed to offer gene-centric views of the integrated data, presenting (where known) the links between genome, proteome and phenotype.

  5. Influencing Database Use in Public Libraries.

    Science.gov (United States)

    Tenopir, Carol

    1999-01-01

    Discusses results of a survey of factors influencing database use in public libraries. Highlights the importance of content; ease of use; and importance of instruction. Tabulates importance indications for number and location of workstations, library hours, availability of remote login, usefulness and quality of content, lack of other databases,…

  6. Mining biological databases for candidate disease genes

    Science.gov (United States)

    Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

    2001-07-01

    The publicly-funded effort to sequence the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has currently produced more than 93% of the 3 billion nucleotides of the human genome into a preliminary `draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of this data identified approximately 30,000 - 40,000 human `genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high speed cluster of Linux of workstations is used to analyze sequence and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedle Syndrome (BBS).

  7. Awareness and use of electronic databases by public library users ...

    African Journals Online (AJOL)

    The study investigated awareness, access and use of electronic database by public library users in Ibadan Oyo State in Nigeria. The purpose of this study was to determine awareness of public library users' electronic databases, find out what these users used electronic databases to do and to identify problems associated ...

  8. Analysis of commercial and public bioactivity databases.

    Science.gov (United States)

    Tiikkainen, Pekka; Franke, Lutz

    2012-02-27

    Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.

  9. Towards BioDBcore: a community-defined information specification for biological databases

    Science.gov (United States)

    Gaudet, Pascale; Bairoch, Amos; Field, Dawn; Sansone, Susanna-Assunta; Taylor, Chris; Attwood, Teresa K.; Bateman, Alex; Blake, Judith A.; Bult, Carol J.; Cherry, J. Michael; Chisholm, Rex L.; Cochrane, Guy; Cook, Charles E.; Eppig, Janan T.; Galperin, Michael Y.; Gentleman, Robert; Goble, Carole A.; Gojobori, Takashi; Hancock, John M.; Howe, Douglas G.; Imanishi, Tadashi; Kelso, Janet; Landsman, David; Lewis, Suzanna E.; Mizrachi, Ilene Karsch; Orchard, Sandra; Ouellette, B. F. Francis; Ranganathan, Shoba; Richardson, Lorna; Rocca-Serra, Philippe; Schofield, Paul N.; Smedley, Damian; Southan, Christopher; Tan, Tin Wee; Tatusova, Tatiana; Whetzel, Patricia L.; White, Owen; Yamasaki, Chisato

    2011-01-01

    The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases. PMID:21097465

  10. Biomine: predicting links between biological entities using network models of heterogeneous databases

    Directory of Open Access Journals (Sweden)

    Eronen Lauri

    2012-06-01

    Full Text Available Abstract Background Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Results Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. Conclusions The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable

  11. CeDAMar global database of abyssal biological sampling

    OpenAIRE

    Stuart, Carol T.; Arbizu, Pedro Martinez; Smith, Craig R.; Molodtsova, Tina; Brandt, Angelika; Etter, Ron J.; Escobar-briones, Elva; Fabri, Marie-claire; Rex, Michael A.

    2008-01-01

    The Census of the Diversity of Abyssal Marine Life (CeDAMar), a division of the Census of Marine Life, has compiled the first comprehensive global database of biological samples taken in the abyssal plains of the world ocean. It is an essential resource for planning future exploration of the abyss, for synthesizing patterns of biogeography and biodiversity, and for environmentally safe exploitation of natural resources. The database is described in this article, and made available to investig...

  12. Data Cleaning and Semantic Improvement in Biological Databases

    Directory of Open Access Journals (Sweden)

    Apiletti Daniele

    2006-12-01

    Full Text Available Public genomic and proteomic databases can be affected by a variety of errors. These errors may involve either the description or the meaning of data (namely, syntactic or semantic errors. We focus our analysis on the detection of semantic errors, in order to verify the accuracy of the stored information. In particular, we address the issue of data constraints and functional dependencies among attributes in a given relational database. Constraints and dependencies show semantics among attributes in a database schema and their knowledge may be exploited to improve data quality and integration in database design, and to perform query optimization and dimensional reduction.

  13. Documentation for the U.S. Geological Survey Public-Supply Database (PSDB): A database of permitted public-supply wells, surface-water intakes, and systems in the United States

    Science.gov (United States)

    Price, Curtis V.; Maupin, Molly A.

    2014-01-01

    The U.S. Geological Survey (USGS) has developed a database containing information about wells, surface-water intakes, and distribution systems that are part of public water systems across the United States, its territories, and possessions. Programs of the USGS such as the National Water Census, the National Water Use Information Program, and the National Water-Quality Assessment Program all require a complete and current inventory of public water systems, the sources of water used by those systems, and the size of populations served by the systems across the Nation. Although the U.S. Environmental Protection Agency’s Safe Drinking Water Information System (SDWIS) database already exists as the primary national Federal database for information on public water systems, the Public-Supply Database (PSDB) was developed to add value to SDWIS data with enhanced location and ancillary information, and to provide links to other databases, including the USGS’s National Water Information System (NWIS) database.

  14. A comparative cellular and molecular biology of longevity database.

    Science.gov (United States)

    Stuart, Jeffrey A; Liang, Ping; Luo, Xuemei; Page, Melissa M; Gallagher, Emily J; Christoff, Casey A; Robb, Ellen L

    2013-10-01

    Discovering key cellular and molecular traits that promote longevity is a major goal of aging and longevity research. One experimental strategy is to determine which traits have been selected during the evolution of longevity in naturally long-lived animal species. This comparative approach has been applied to lifespan research for nearly four decades, yielding hundreds of datasets describing aspects of cell and molecular biology hypothesized to relate to animal longevity. Here, we introduce a Comparative Cellular and Molecular Biology of Longevity Database, available at ( http://genomics.brocku.ca/ccmbl/ ), as a compendium of comparative cell and molecular data presented in the context of longevity. This open access database will facilitate the meta-analysis of amalgamated datasets using standardized maximum lifespan (MLSP) data (from AnAge). The first edition contains over 800 data records describing experimental measurements of cellular stress resistance, reactive oxygen species metabolism, membrane composition, protein homeostasis, and genome homeostasis as they relate to vertebrate species MLSP. The purpose of this review is to introduce the database and briefly demonstrate its use in the meta-analysis of combined datasets.

  15. Ultra-Structure database design methodology for managing systems biology data and analyses

    Directory of Open Access Journals (Sweden)

    Hemminger Bradley M

    2009-08-01

    Full Text Available Abstract Background Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogenous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping. Results We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research. Conclusion We find

  16. Relational Databases: A Transparent Framework for Encouraging Biology Students to Think Informatically

    Science.gov (United States)

    Rice, Michael; Gladstone, William; Weir, Michael

    2004-01-01

    We discuss how relational databases constitute an ideal framework for representing and analyzing large-scale genomic data sets in biology. As a case study, we describe a Drosophila splice-site database that we recently developed at Wesleyan University for use in research and teaching. The database stores data about splice sites computed by a…

  17. Prototype Food and Nutrient Database for Dietary Studies: Branded Food Products Database for Public Health Proof of Concept

    Science.gov (United States)

    The Prototype Food and Nutrient Database for Dietary Studies (Prototype FNDDS) Branded Food Products Database for Public Health is a proof of concept database. The database contains a small selection of food products which is being used to exhibit the approach for incorporation of the Branded Food ...

  18. A novel approach: chemical relational databases, and the role of the ISSCAN database on assessing chemical carcinogenicity.

    Science.gov (United States)

    Benigni, Romualdo; Bossa, Cecilia; Richard, Ann M; Yang, Chihae

    2008-01-01

    Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did not contain chemical structures. Concepts and technologies originated from the structure-activity relationships science have provided powerful tools to create new types of databases, where the effective linkage of chemical toxicity with chemical structure can facilitate and greatly enhance data gathering and hypothesis generation, by permitting: a) exploration across both chemical and biological domains; and b) structure-searchability through the data. This paper reviews the main public databases, together with the progress in the field of chemical relational databases, and presents the ISSCAN database on experimental chemical carcinogens.

  19. Academic impact of a public electronic health database: bibliometric analysis of studies using the general practice research database.

    Directory of Open Access Journals (Sweden)

    Yu-Chun Chen

    Full Text Available BACKGROUND: Studies that use electronic health databases as research material are getting popular but the influence of a single electronic health database had not been well investigated yet. The United Kingdom's General Practice Research Database (GPRD is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact by a single public health database. METHODOLOGY AND FINDINGS: A total of 749 studies published between 1995 and 2009 with 'General Practice Research Database' as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors contributed nearly half (47.9% of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of "Pharmacology and Pharmacy", "General and Internal Medicine", and "Public, Environmental and Occupational Health". The UK and United States were the two most active regions of GPRD studies. One-third of GRPD studies were internationally co-authored. CONCLUSIONS: A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research.

  20. “NaKnowBase”: A Nanomaterials Relational Database

    Science.gov (United States)

    NaKnowBase is an internal relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations...

  1. “NaKnowBase”: A Nanomaterials Relational Database

    Science.gov (United States)

    NaKnowBase is a relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations and poten...

  2. Academic Impact of a Public Electronic Health Database: Bibliometric Analysis of Studies Using the General Practice Research Database

    Science.gov (United States)

    Chen, Yu-Chun; Wu, Jau-Ching; Haschler, Ingo; Majeed, Azeem; Chen, Tzeng-Ji; Wetter, Thomas

    2011-01-01

    Background Studies that use electronic health databases as research material are getting popular but the influence of a single electronic health database had not been well investigated yet. The United Kingdom's General Practice Research Database (GPRD) is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact by a single public health database. Methodology and Findings A total of 749 studies published between 1995 and 2009 with ‘General Practice Research Database’ as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors) contributed nearly half (47.9%) of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of “Pharmacology and Pharmacy”, “General and Internal Medicine”, and “Public, Environmental and Occupational Health”. The UK and United States were the two most active regions of GPRD studies. One-third of GRPD studies were internationally co-authored. Conclusions A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research. PMID:21731733

  3. Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

    Science.gov (United States)

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293

  4. 75 FR 41180 - Notice of Order: Revisions to Enterprise Public Use Database

    Science.gov (United States)

    2010-07-15

    .... This responsibility to maintain a public use database (PUDB) for such mortgage data was transferred to... FEDERAL HOUSING FINANCE AGENCY [No. 2010-N-10] Notice of Order: Revisions to Enterprise Public Use Database AGENCY: Federal Housing Finance Agency. ACTION: Notice of order. SUMMARY: Section 1323(a)(1) of...

  5. A natural language interface plug-in for cooperative query answering in biological databases.

    Science.gov (United States)

    Jamil, Hasan M

    2012-06-11

    One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a

  6. BioMart Central Portal: an open database network for the biological community

    Science.gov (United States)

    Guberman, Jonathan M.; Ai, J.; Arnaiz, O.; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J.; Di Génova, A.; Forbes, Simon; Fujisawa, T.; Gadaleta, E.; Goodstein, D. M.; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S.; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R.; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J.; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S.; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B.; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J.; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D. T.; Wong-Erasmus, Marie; Yao, L.; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

    2011-01-01

    BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org. PMID:21930507

  7. TabSQL: a MySQL tool to facilitate mapping user data to public databases.

    Science.gov (United States)

    Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng

    2010-06-23

    With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.

  8. Implementing database system for LHCb publications page

    CERN Document Server

    Abdullayev, Fakhriddin

    2017-01-01

    The LHCb is one of the main detectors of Large Hadron Collider, where physicists and scientists work together on high precision measurements of matter-antimatter asymmetries and searches for rare and forbidden decays, with the aim of discovering new and unexpected forces. The work does not only consist of analyzing data collected from experiments but also in publishing the results of those analyses. The LHCb publications are gathered on LHCb publications page to maximize their availability to both LHCb members and to the high energy community. In this project a new database system was implemented for LHCb publications page. This will help to improve access to research papers for scientists and better integration with current CERN library website and others.

  9. Databases in Indian biology: The state of the art and prospects

    Digital Repository Service at National Institute of Oceanography (India)

    Chavan, V.S.; Chandramohan, D.

    the Indian biology and biotechnology databses and their relation to international databases on the subject. It highlights their limitations and throws more light on their potential for subject experts and information managers in the country to build...

  10. Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes.

    Science.gov (United States)

    Hassani-Pak, Keywan; Rawlings, Christopher

    2017-06-13

    Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.

  11. A publication database for optical long baseline interferometry

    Science.gov (United States)

    Malbet, Fabien; Mella, Guillaume; Lawson, Peter; Taillifet, Esther; Lafrasse, Sylvain

    2010-07-01

    Optical long baseline interferometry is a technique that has generated almost 850 refereed papers to date. The targets span a large variety of objects from planetary systems to extragalactic studies and all branches of stellar physics. We have created a database hosted by the JMMC and connected to the Optical Long Baseline Interferometry Newsletter (OLBIN) web site using MySQL and a collection of XML or PHP scripts in order to store and classify these publications. Each entry is defined by its ADS bibcode, includes basic ADS informations and metadata. The metadata are specified by tags sorted in categories: interferometric facilities, instrumentation, wavelength of operation, spectral resolution, type of measurement, target type, and paper category, for example. The whole OLBIN publication list has been processed and we present how the database is organized and can be accessed. We use this tool to generate statistical plots of interest for the community in optical long baseline interferometry.

  12. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

    Directory of Open Access Journals (Sweden)

    Alexandra M Schnoes

    2009-12-01

    Full Text Available Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families; the two other protein sequence databases (GenBank NR and TrEMBL and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

  13. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

    Science.gov (United States)

    Schnoes, Alexandra M; Brown, Shoshana D; Dodevski, Igor; Babbitt, Patricia C

    2009-12-01

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

  14. Toward an interactive article: integrating journals and biological databases

    Directory of Open Access Journals (Sweden)

    Marygold Steven J

    2011-05-01

    Full Text Available Abstract Background Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have resulted in the development of automated tools that can recognize entities within a document and link those entities to a relevant database. Unfortunately, automated tools cannot resolve ambiguities that arise from one term being used to signify entities that are quite distinct from one another. Instead, resolving these ambiguities requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is a crucial goal to making text markup a successful venture. Results We have established a journal article mark-up pipeline that links GENETICS journal articles and the model organism database (MOD WormBase. This pipeline uses a lexicon built with entities from the database as a first step. The entity markup pipeline results in links from over nine classes of objects including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC step, along with help from authors via a web form that is provided to them by the journal. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand ensuring an accurate link. This pipeline has been extended to other databases, namely Saccharomyces Genome Database (SGD and FlyBase, and has been implemented in marking up a paper with links to multiple databases. Conclusions Our semi-automated pipeline hyperlinks articles published in GENETICS to

  15. Interleukins and their signaling pathways in the Reactome biological pathway database.

    Science.gov (United States)

    Jupe, Steve; Ray, Keith; Roca, Corina Duenas; Varusai, Thawfeek; Shamovsky, Veronica; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

    2018-04-01

    There is a wealth of biological pathway information available in the scientific literature, but it is spread across many thousands of publications. Alongside publications that contain definitive experimental discoveries are many others that have been dismissed as spurious, found to be irreproducible, or are contradicted by later results and consequently now considered controversial. Many descriptions and images of pathways are incomplete stylized representations that assume the reader is an expert and familiar with the established details of the process, which are consequently not fully explained. Pathway representations in publications frequently do not represent a complete, detailed, and unambiguous description of the molecules involved; their precise posttranslational state; or a full account of the molecular events they undergo while participating in a process. Although this might be sufficient to be interpreted by an expert reader, the lack of detail makes such pathways less useful and difficult to understand for anyone unfamiliar with the area and of limited use as the basis for computational models. Reactome was established as a freely accessible knowledge base of human biological pathways. It is manually populated with interconnected molecular events that fully detail the molecular participants linked to published experimental data and background material by using a formal and open data structure that facilitates computational reuse. These data are accessible on a Web site in the form of pathway diagrams that have descriptive summaries and annotations and as downloadable data sets in several formats that can be reused with other computational tools. The entire database and all supporting software can be downloaded and reused under a Creative Commons license. Pathways are authored by expert biologists who work with Reactome curators and editorial staff to represent the consensus in the field. Pathways are represented as interactive diagrams that include as

  16. The public's belief about biology.

    Science.gov (United States)

    Wolpert, L

    2007-02-01

    This short review is concerned with a topic that has been neglected and is still very poorly understood: what the general public think and believe about biology (including health and medicine, and bioethics), and, in particular, about biotechnology.

  17. Assessment of Residential History Generation Using a Public-Record Database

    Directory of Open Access Journals (Sweden)

    David C. Wheeler

    2015-09-01

    Full Text Available In studies of disease with potential environmental risk factors, residential location is often used as a surrogate for unknown environmental exposures or as a basis for assigning environmental exposures. These studies most typically use the residential location at the time of diagnosis due to ease of collection. However, previous residential locations may be more useful for risk analysis because of population mobility and disease latency. When residential histories have not been collected in a study, it may be possible to generate them through public-record databases. In this study, we evaluated the ability of a public-records database from LexisNexis to provide residential histories for subjects in a geographically diverse cohort study. We calculated 11 performance metrics comparing study-collected addresses and two address retrieval services from LexisNexis. We found 77% and 90% match rates for city and state and 72% and 87% detailed address match rates with the basic and enhanced services, respectively. The enhanced LexisNexis service covered 86% of the time at residential addresses recorded in the study. The mean match rate for detailed address matches varied spatially over states. The results suggest that public record databases can be useful for reconstructing residential histories for subjects in epidemiologic studies.

  18. ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii.

    Science.gov (United States)

    May, Patrick; Christian, Jan-Ole; Kempa, Stefan; Walther, Dirk

    2009-05-04

    The unicellular green alga Chlamydomonas reinhardtii is an important eukaryotic model organism for the study of photosynthesis and plant growth. In the era of modern high-throughput technologies there is an imperative need to integrate large-scale data sets from high-throughput experimental techniques using computational methods and database resources to provide comprehensive information about the molecular and cellular organization of a single organism. In the framework of the German Systems Biology initiative GoFORSYS, a pathway database and web-portal for Chlamydomonas (ChlamyCyc) was established, which currently features about 250 metabolic pathways with associated genes, enzymes, and compound information. ChlamyCyc was assembled using an integrative approach combining the recently published genome sequence, bioinformatics methods, and experimental data from metabolomics and proteomics experiments. We analyzed and integrated a combination of primary and secondary database resources, such as existing genome annotations from JGI, EST collections, orthology information, and MapMan classification. ChlamyCyc provides a curated and integrated systems biology repository that will enable and assist in systematic studies of fundamental cellular processes in Chlamydomonas. The ChlamyCyc database and web-portal is freely available under http://chlamycyc.mpimp-golm.mpg.de.

  19. Public attitudes toward legally coerced biological treatments of criminals.

    Science.gov (United States)

    Berryessa, Colleen M; Chandler, Jennifer A; Reiner, Peter

    2016-12-01

    How does the public view the offer of a biological treatment in lieu of prison for criminal offenders? Using the contrastive vignette technique, we explored this issue, using mixed-methods analysis to measure concerns regarding changing the criminal's personality, the coercive nature of the offer, and the safety of the proposed treatment. Overall, we found that of the three variables, the safety of the pill had the strongest effect on public acceptance of a biological intervention. Indeed, it was notable that the public was relatively sanguine about coercive offers of biological agents, as well as changing the personality of criminals. While respondents did not fully endorse such coercive offers, neither were they outraged by the use of biological treatments of criminals in lieu of incarceration. These results are discussed in the context of the retributive and rehabilitative sentiments of the public, and legal jurisprudence in the arena of human rights law.

  20. Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity.

    Science.gov (United States)

    Machado, Helena; Silva, Susana

    2015-10-01

    The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of 'solidarity', traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  1. Data publication: towards a database of everything

    Directory of Open Access Journals (Sweden)

    Smith Vincent S

    2009-06-01

    Full Text Available Abstract The fabric of science is changing, driven by a revolution in digital technologies that facilitate the acquisition and communication of massive amounts of data. This is changing the nature of collaboration and expanding opportunities to participate in science. If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. The publication of primary data is not a universal or mandatory part of science, and despite policies and proclamations to the contrary, calls to make data publicly available have largely gone unheeded. In this short essay I consider why, and explore some of the challenges that lie ahead, as we work toward a database of everything.

  2. New tools and methods for direct programmatic access to the dbSNP relational database.

    Science.gov (United States)

    Saccone, Scott F; Quan, Jiaxi; Mehta, Gaurang; Bolze, Raphael; Thomas, Prasanth; Deelman, Ewa; Tischfield, Jay A; Rice, John P

    2011-01-01

    Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.

  3. 76 FR 60031 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

    Science.gov (United States)

    2011-09-28

    ... single-family matrix in FHFA's Public Use Database (PUDB) to include data fields for the high-cost single... Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data... amended, it is necessary to revise the single-family matrix of FHFA's Public Use Database (PUDB) by adding...

  4. A comparison of private and public secondary school biology ...

    African Journals Online (AJOL)

    This paper compares external motivation and job satisfaction in private and public secondary schools biology teachers in Education District IV of Lagos state. The sample for the study consists of 120 Biology teachers selected from ten private and ten public secondary schools. A 20-items Likert type questionnaire was ...

  5. Databases, Repositories, and Other Data Resources in Structural Biology.

    Science.gov (United States)

    Zheng, Heping; Porebski, Przemyslaw J; Grabowski, Marek; Cooper, David R; Minor, Wladek

    2017-01-01

    Structural biology, like many other areas of modern science, produces an enormous amount of primary, derived, and "meta" data with a high demand on data storage and manipulations. Primary data come from various steps of sample preparation, diffraction experiments, and functional studies. These data are not only used to obtain tangible results, like macromolecular structural models, but also to enrich and guide our analysis and interpretation of various biomedical problems. Herein we define several categories of data resources, (a) Archives, (b) Repositories, (c) Databases, and (d) Advanced Information Systems, that can accommodate primary, derived, or reference data. Data resources may be used either as web portals or internally by structural biology software. To be useful, each resource must be maintained, curated, as well as integrated with other resources. Ideally, the system of interconnected resources should evolve toward comprehensive "hubs", or Advanced Information Systems. Such systems, encompassing the PDB and UniProt, are indispensable not only for structural biology, but for many related fields of science. The categories of data resources described herein are applicable well beyond our usual scientific endeavors.

  6. ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii

    Directory of Open Access Journals (Sweden)

    Kempa Stefan

    2009-05-01

    Full Text Available Abstract Background The unicellular green alga Chlamydomonas reinhardtii is an important eukaryotic model organism for the study of photosynthesis and plant growth. In the era of modern high-throughput technologies there is an imperative need to integrate large-scale data sets from high-throughput experimental techniques using computational methods and database resources to provide comprehensive information about the molecular and cellular organization of a single organism. Results In the framework of the German Systems Biology initiative GoFORSYS, a pathway database and web-portal for Chlamydomonas (ChlamyCyc was established, which currently features about 250 metabolic pathways with associated genes, enzymes, and compound information. ChlamyCyc was assembled using an integrative approach combining the recently published genome sequence, bioinformatics methods, and experimental data from metabolomics and proteomics experiments. We analyzed and integrated a combination of primary and secondary database resources, such as existing genome annotations from JGI, EST collections, orthology information, and MapMan classification. Conclusion ChlamyCyc provides a curated and integrated systems biology repository that will enable and assist in systematic studies of fundamental cellular processes in Chlamydomonas. The ChlamyCyc database and web-portal is freely available under http://chlamycyc.mpimp-golm.mpg.de.

  7. Database Description - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name SSBD Alternative nam...ss 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan, RIKEN Quantitative Biology Center Shuichi Onami E-mail: Database... classification Other Molecular Biology Databases Database classification Dynamic databa...elegans Taxonomy ID: 6239 Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database description Systems Scie...i Onami Journal: Bioinformatics/April, 2015/Volume 31, Issue 7 External Links: Original website information Database

  8. Publication Growth in Biological Sub-Fields: Patterns, Predictability and Sustainability

    Directory of Open Access Journals (Sweden)

    Marco Pautasso

    2012-11-01

    Full Text Available Biologists are producing ever-increasing quantities of papers. The question arises of whether current rates of increase in scientific outputs are sustainable in the long term. I studied this issue using publication data from the Web of Science (1991–2010 for 18 biological sub-fields. In the majority of cases, an exponential regression explains more variation than a linear one in the number of papers published each year as a function of publication year. Exponential growth in publication numbers is clearly not sustainable. About 75% of the variation in publication growth among biological sub-fields over the two studied decades can be predicted by publication data from the first six years. Currently trendy fields such as structural biology, neuroscience and biomaterials cannot be expected to carry on growing at the current pace, because in a few decades they would produce more papers than the whole of biology combined. Synthetic and systems biology are problematic from the point of view of knowledge dissemination, because in these fields more than 80% of existing papers have been published over the last five years. The evidence presented here casts a shadow on how sustainable the recent increase in scientific publications can be in the long term.

  9. mirPub: a database for searching microRNA publications.

    Science.gov (United States)

    Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore

    2015-05-01

    Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. mirPub is freely available at http://www.microrna.gr/mirpub/. vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  10. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available Abstract Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  11. SeedStor: A Germplasm Information Management System and Public Database.

    Science.gov (United States)

    Horler, R S P; Turner, A S; Fretter, P; Ambrose, M

    2018-01-01

    SeedStor (https://www.seedstor.ac.uk) acts as the publicly available database for the seed collections held by the Germplasm Resources Unit (GRU) based at the John Innes Centre, Norwich, UK. The GRU is a national capability supported by the Biotechnology and Biological Sciences Research Council (BBSRC). The GRU curates germplasm collections of a range of temperate cereal, legume and Brassica crops and their associated wild relatives, as well as precise genetic stocks, near-isogenic lines and mapping populations. With >35,000 accessions, the GRU forms part of the UK's plant conservation contribution to the Multilateral System (MLS) of the International Treaty for Plant Genetic Resources for Food and Agriculture (ITPGRFA) for wheat, barley, oat and pea. SeedStor is a fully searchable system that allows our various collections to be browsed species by species through to complicated multipart phenotype criteria-driven queries. The results from these searches can be downloaded for later analysis or used to order germplasm via our shopping cart. The user community for SeedStor is the plant science research community, plant breeders, specialist growers, hobby farmers and amateur gardeners, and educationalists. Furthermore, SeedStor is much more than a database; it has been developed to act internally as a Germplasm Information Management System that allows team members to track and process germplasm requests, determine regeneration priorities, handle cost recovery and Material Transfer Agreement paperwork, manage the Seed Store holdings and easily report on a wide range of the aforementioned tasks. © The Author(s) 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  12. The Mouse SAGE Site: database of public mouse SAGE libraries

    Czech Academy of Sciences Publication Activity Database

    Divina, Petr; Forejt, Jiří

    2004-01-01

    Roč. 32, - (2004), s. D482-D483 ISSN 0305-1048 R&D Projects: GA MŠk LN00A079; GA ČR GV204/98/K015 Grant - others:HHMI(US) 555000306 Institutional research plan: CEZ:AV0Z5052915 Keywords : mouse SAGE libraries * web -based database Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 7.260, year: 2004

  13. 76 FR 77533 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

    Science.gov (United States)

    2011-12-13

    ..., regarding FHFA's adoption of an Order revising FHFA's Public Use Database matrices to include certain data... FEDERAL HOUSING FINANCE AGENCY [No. 2011-N-13] Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data Field...

  14. Interactive bibliographical database on color

    Science.gov (United States)

    Caivano, Jose L.

    2002-06-01

    The paper describes the methodology and results of a project under development, aimed at the elaboration of an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated in various opportunities, and now available at the Internet, with more than 2,000 entries. The interactive database will amplify that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but allowing also rearrangements or selections by author, subject and keywords.

  15. Biological knowledge bases using Wikis: combining the flexibility of Wikis with the structure of databases.

    Science.gov (United States)

    Brohée, Sylvain; Barriot, Roland; Moreau, Yves

    2010-09-01

    In recent years, the number of knowledge bases developed using Wiki technology has exploded. Unfortunately, next to their numerous advantages, classical Wikis present a critical limitation: the invaluable knowledge they gather is represented as free text, which hinders their computational exploitation. This is in sharp contrast with the current practice for biological databases where the data is made available in a structured way. Here, we present WikiOpener an extension for the classical MediaWiki engine that augments Wiki pages by allowing on-the-fly querying and formatting resources external to the Wiki. Those resources may provide data extracted from databases or DAS tracks, or even results returned by local or remote bioinformatics analysis tools. This also implies that structured data can be edited via dedicated forms. Hence, this generic resource combines the structure of biological databases with the flexibility of collaborative Wikis. The source code and its documentation are freely available on the MediaWiki website: http://www.mediawiki.org/wiki/Extension:WikiOpener.

  16. Database Constraints Applied to Metabolic Pathway Reconstruction Tools

    Directory of Open Access Journals (Sweden)

    Jordi Vilaplana

    2014-01-01

    Full Text Available Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (reannotation of proteomes, to properly identify both the individual proteins involved in the process(es of interest and their function. It also enables the sets of proteins involved in the process(es in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.

  17. Database constraints applied to metabolic pathway reconstruction tools.

    Science.gov (United States)

    Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

    2014-01-01

    Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.

  18. Numerical databases in marine biology

    Digital Repository Service at National Institute of Oceanography (India)

    Sarupria, J.S.; Bhargava, R.M.S.

    stream_size 9 stream_content_type text/plain stream_name Natl_Workshop_Database_Networking_Mar_Biol_1991_45.pdf.txt stream_source_info Natl_Workshop_Database_Networking_Mar_Biol_1991_45.pdf.txt Content-Encoding ISO-8859-1 Content-Type... text/plain; charset=ISO-8859-1 ...

  19. Accessing the public MIMIC-II intensive care relational database for clinical research.

    Science.gov (United States)

    Scott, Daniel J; Lee, Joon; Silva, Ikaro; Park, Shinhyuk; Moody, George B; Celi, Leo A; Mark, Roger G

    2013-01-10

    The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge "Predicting mortality of ICU Patients". QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database.

  20. Databases applicable to quantitative hazard/risk assessment-Towards a predictive systems toxicology

    International Nuclear Information System (INIS)

    Waters, Michael; Jackson, Marcus

    2008-01-01

    The Workshop on The Power of Aggregated Toxicity Data addressed the requirement for distributed databases to support quantitative hazard and risk assessment. The authors have conceived and constructed with federal support several databases that have been used in hazard identification and risk assessment. The first of these databases, the EPA Gene-Tox Database was developed for the EPA Office of Toxic Substances by the Oak Ridge National Laboratory, and is currently hosted by the National Library of Medicine. This public resource is based on the collaborative evaluation, by government, academia, and industry, of short-term tests for the detection of mutagens and presumptive carcinogens. The two-phased evaluation process resulted in more than 50 peer-reviewed publications on test system performance and a qualitative database on thousands of chemicals. Subsequently, the graphic and quantitative EPA/IARC Genetic Activity Profile (GAP) Database was developed in collaboration with the International Agency for Research on Cancer (IARC). A chemical database driven by consideration of the lowest effective dose, GAP has served IARC for many years in support of hazard classification of potential human carcinogens. The Toxicological Activity Profile (TAP) prototype database was patterned after GAP and utilized acute, subchronic, and chronic data from the Office of Air Quality Planning and Standards. TAP demonstrated the flexibility of the GAP format for air toxics, water pollutants and other environmental agents. The GAP format was also applied to developmental toxicants and was modified to represent quantitative results from the rodent carcinogen bioassay. More recently, the authors have constructed: 1) the NIEHS Genetic Alterations in Cancer (GAC) Database which quantifies specific mutations found in cancers induced by environmental agents, and 2) the NIEHS Chemical Effects in Biological Systems (CEBS) Knowledgebase that integrates genomic and other biological data including

  1. A Public Image Database for Benchmark of Plant Seedling Classification Algorithms

    DEFF Research Database (Denmark)

    Giselsson, Thomas Mosgaard; Nyholm Jørgensen, Rasmus; Jensen, Peter Kryger

    A database of images of approximately 960 unique plants belonging to 12 species at several growth stages is made publicly available. It comprises annotated RGB images with a physical resolution of roughly 10 pixels per mm. To standardise the evaluation of classification results obtained...

  2. Sagace: A web-based search engine for biomedical databases in Japan

    Directory of Open Access Journals (Sweden)

    Morita Mizuki

    2012-10-01

    Full Text Available Abstract Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data and biological resource banks (such as mouse models of disease and cell lines. With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/.

  3. New perspectives in toxicological information management, and the role of ISSTOX databases in assessing chemical mutagenicity and carcinogenicity.

    Science.gov (United States)

    Benigni, Romualdo; Battistelli, Chiara Laura; Bossa, Cecilia; Tcheremenskaia, Olga; Crettaz, Pierre

    2013-07-01

    Currently, the public has access to a variety of databases containing mutagenicity and carcinogenicity data. These resources are crucial for the toxicologists and regulators involved in the risk assessment of chemicals, which necessitates access to all the relevant literature, and the capability to search across toxicity databases using both biological and chemical criteria. Towards the larger goal of screening chemicals for a wide range of toxicity end points of potential interest, publicly available resources across a large spectrum of biological and chemical data space must be effectively harnessed with current and evolving information technologies (i.e. systematised, integrated and mined), if long-term screening and prediction objectives are to be achieved. A key to rapid progress in the field of chemical toxicity databases is that of combining information technology with the chemical structure as identifier of the molecules. This permits an enormous range of operations (e.g. retrieving chemicals or chemical classes, describing the content of databases, finding similar chemicals, crossing biological and chemical interrogations, etc.) that other more classical databases cannot allow. This article describes the progress in the technology of toxicity databases, including the concepts of Chemical Relational Database and Toxicological Standardized Controlled Vocabularies (Ontology). Then it describes the ISSTOX cluster of toxicological databases at the Istituto Superiore di Sanitá. It consists of freely available databases characterised by the use of modern information technologies and by curation of the quality of the biological data. Finally, this article provides examples of analyses and results made possible by ISSTOX.

  4. ChemProt-2.0: visual navigation in a disease chemical biology database

    DEFF Research Database (Denmark)

    Kjærulff, Sonny Kim; Wich, Louis; Kringelum, Jens Vindahl

    2013-01-01

    ChemProt-2.0 (http://www.cbs.dtu.dk/services/ChemProt-2.0) is a public available compilation of multiple chemical-protein annotation resources integrated with diseases and clinical outcomes information. The database has been updated to > 1.15 million compounds with 5.32 millions bioactivity measu...

  5. DATABASES AND THE SUI-GENERIS RIGHT – PROTECTION OUTSIDE THE ORIGINALITY. THE DISREGARD OF THE PUBLIC DOMAIN

    Directory of Open Access Journals (Sweden)

    Monica LUPAȘCU

    2018-05-01

    Full Text Available This study focuses on databases as they are regulated by Directive no.96/9/EC regarding the protection of databases. There are also several references to Romanian Law no.8/1996 on copyright and neighbouring rights which implements the mentioned European Directive. The study analyses certain effects that the sui-generis protection has on public domain. The study tries to demonstrate that the reglementation specific to databases neglects the interests correlated with the public domain. The effect of such a regulation is the abusive creation of some databases in which the public domain (meaning information not protected by copyright such as news, ideas, procedures, methods, systems, processes, concepts, principles, discoveries ends up being encapsulated and made available only to some private interests, the access to public domain being regulated indirectly. The study begins by explaining the sui- generis right and its origin. The first mention of databases can be found in “Green Paper on Copyright (1998,” a document that clearly shows, the database protection was thought to cover a sphere of information non-protectable from the scientific and industrial fields. Several arguments are made by the author, most of them based on the report of the Public Consultation sustained in 2014 in regards to the necessity of the sui-generis right. There are some references made to a specific case law, namely British Houseracing Board vs William Hill and Fixture Marketing Ldt. The ECJ’s decision în that case is of great importance for the support of public interest to access information corresponding to some restrictive fields that are derived as a result of the maker’s activities, because in the absence of the sui-generis right, all this information can be freely accessed and used.

  6. Microcomputer Database Management Systems that Interface with Online Public Access Catalogs.

    Science.gov (United States)

    Rice, James

    1988-01-01

    Describes a study that assessed the availability and use of microcomputer database management interfaces to online public access catalogs. The software capabilities needed to effect such an interface are identified, and available software packages are evaluated by these criteria. A directory of software vendors is provided. (4 notes with…

  7. Personal Publications Lists Serve as a Reliable Calibration Parameter to Compare Coverage in Academic Citation Databases with Scientific Social Media

    Directory of Open Access Journals (Sweden)

    Emma Hughes

    2017-03-01

    Full Text Available A Review of: Hilbert, F., Barth, J., Gremm, J., Gros, D., Haiter, J., Henkel, M., Reinhardt, W., & Stock, W.G. (2015. Coverage of academic citation databases compared with coverage of scientific social media: personal publication lists as calibration parameters. Online Information Review 39(2: 255-264. http://dx.doi.org/10.1108/OIR-07-2014-0159 Objective – The purpose of this study was to explore coverage rates of information science publications in academic citation databases and scientific social media using a new method of personal publication lists as a calibration parameter. The research questions were: How many publications are covered in different databases, which has the best coverage, and what institutions are represented and how does the language of the publication play a role? Design – Bibliometric analysis. Setting – Academic citation databases (Web of Science, Scopus, Google Scholar and scientific social media (Mendeley, CiteULike, Bibsonomy. Subjects – 1,017 library and information science publications produced by 76 information scientists at 5 German-speaking universities in Germany and Austria. Methods – Only documents which were published between 1 January 2003 and 31 December 2012 were included. In that time the 76 information scientists had produced 1,017 documents. The information scientists confirmed that their publication lists were complete and these served as the calibration parameter for the study. The citations from the publication lists were searched in three academic databases: Google Scholar, Web of Science (WoS, and Scopus; as well as three social media citation sites: Mendeley, CiteULike, and BibSonomy and the results were compared. The publications were searched for by author name and words from the title. Main results – None of the databases investigated had 100% coverage. In the academic databases, Google Scholar had the highest amount of coverage with an average of 63%, Scopus an average of 31%, and

  8. Genelab: Scientific Partnerships and an Open-Access Database to Maximize Usage of Omics Data from Space Biology Experiments

    Science.gov (United States)

    Reinsch, S. S.; Galazka, J..; Berrios, D. C; Chakravarty, K.; Fogle, H.; Lai, S.; Bokyo, V.; Timucin, L. R.; Tran, P.; Skidmore, M.

    2016-01-01

    NASA's mission includes expanding our understanding of biological systems to improve life on Earth and to enable long-duration human exploration of space. The GeneLab Data System (GLDS) is NASA's premier open-access omics data platform for biological experiments. GLDS houses standards-compliant, high-throughput sequencing and other omics data from spaceflight-relevant experiments. The GeneLab project at NASA-Ames Research Center is developing the database, and also partnering with spaceflight projects through sharing or augmentation of experiment samples to expand omics analyses on precious spaceflight samples. The partnerships ensure that the maximum amount of data is garnered from spaceflight experiments and made publically available as rapidly as possible via the GLDS. GLDS Version 1.0, went online in April 2015. Software updates and new data releases occur at least quarterly. As of October 2016, the GLDS contains 80 datasets and has search and download capabilities. Version 2.0 is slated for release in September of 2017 and will have expanded, integrated search capabilities leveraging other public omics databases (NCBI GEO, PRIDE, MG-RAST). Future versions in this multi-phase project will provide a collaborative platform for omics data analysis. Data from experiments that explore the biological effects of the spaceflight environment on a wide variety of model organisms are housed in the GLDS including data from rodents, invertebrates, plants and microbes. Human datasets are currently limited to those with anonymized data (e.g., from cultured cell lines). GeneLab ensures prompt release and open access to high-throughput genomics, transcriptomics, proteomics, and metabolomics data from spaceflight and ground-based simulations of microgravity, radiation or other space environment factors. The data are meticulously curated to assure that accurate experimental and sample processing metadata are included with each data set. GLDS download volumes indicate strong

  9. Classification of Recombinant Biologics in the EU

    DEFF Research Database (Denmark)

    Klein, Kevin; De Bruin, Marie L; Broekmans, Andre W

    2015-01-01

    BACKGROUND AND OBJECTIVE: Biological medicinal products (biologics) are subject to specific pharmacovigilance requirements to ensure that biologics are identifiable by brand name and batch number in adverse drug reaction (ADR) reports. Since Member States collect ADR data at the national level...... of biologics by national authorities responsible for ADR reporting. METHODS: A sample list of recombinant biologics from the European Medicines Agency database of European Public Assessment Reports was created to analyze five Member States (Belgium, the Netherlands, Spain, Sweden, and the UK) according...

  10. Accelerating Smith-Waterman Algorithm for Biological Database Search on CUDA-Compatible GPUs

    Science.gov (United States)

    Munekawa, Yuma; Ino, Fumihiko; Hagihara, Kenichi

    This paper presents a fast method capable of accelerating the Smith-Waterman algorithm for biological database search on a cluster of graphics processing units (GPUs). Our method is implemented using compute unified device architecture (CUDA), which is available on the nVIDIA GPU. As compared with previous methods, our method has four major contributions. (1) The method efficiently uses on-chip shared memory to reduce the data amount being transferred between off-chip video memory and processing elements in the GPU. (2) It also reduces the number of data fetches by applying a data reuse technique to query and database sequences. (3) A pipelined method is also implemented to overlap GPU execution with database access. (4) Finally, a master/worker paradigm is employed to accelerate hundreds of database searches on a cluster system. In experiments, the peak performance on a GeForce GTX 280 card reaches 8.32 giga cell updates per second (GCUPS). We also find that our method reduces the amount of data fetches to 1/140, achieving approximately three times higher performance than a previous CUDA-based method. Our 32-node cluster version is approximately 28 times faster than a single GPU version. Furthermore, the effective performance reaches 75.6 giga instructions per second (GIPS) using 32 GeForce 8800 GTX cards.

  11. Molecular signatures database (MSigDB) 3.0.

    Science.gov (United States)

    Liberzon, Arthur; Subramanian, Aravind; Pinchback, Reid; Thorvaldsdóttir, Helga; Tamayo, Pablo; Mesirov, Jill P

    2011-06-15

    Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.

  12. ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining.

    Science.gov (United States)

    Huan, Tianxiao; Sivachenko, Andrey Y; Harrison, Scott H; Chen, Jake Y

    2008-08-12

    New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed. We developed ProteoLens as a JAVA-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Languages (DDL) and Data Manipulation languages (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network represented data in standard Graph Modeling Language (GML) formats, and this enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions etc; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges

  13. BioMart Central Portal: an open database network for the biological community

    OpenAIRE

    Guberman, Jonathan M.; Ai, J.; Arnaiz, O.; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J.; Di Genova, A.; Forbes, Simon; Fujisawa, T.; Gadaleta, E.; Goodstein, D. M.

    2011-01-01

    International audience; BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common inte...

  14. Federal, provincial and territorial public health response plan for biological events.

    Science.gov (United States)

    McNeill, R; Topping, J

    2018-01-04

    The Federal/Provincial/Territorial (FPT) Public Health Response Plan for Biological Events was developed for the Public Health Network Council (PHNC). This plan outlines how the national response to public health events caused by biological agents will be conducted and coordinated, with a focus on implementation of responses led by senior-level FPT public health decision-makers. The plan was developed by an expert task group and was approved by PHNC in October, 2017. The plan describes roles, responsibilities and authorities of FPT governments for public health and emergency management, a concept of operations outlining four scalable response levels and a governance structure that aims to facilitate an efficient, timely, evidence-informed and consistent approach across jurisdictions. Improving effective engagement amongst public health, health care delivery and health emergency management authorities is a key objective of the plan.

  15. Creating databases for biological information: an introduction.

    Science.gov (United States)

    Stein, Lincoln

    2013-06-01

    The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by JohnWiley & Sons, Inc.

  16. Coverage and quality: A comparison of Web of Science and Scopus databases for reporting faculty nursing publication metrics.

    Science.gov (United States)

    Powell, Kimberly R; Peterson, Shenita R

    Web of Science and Scopus are the leading databases of scholarly impact. Recent studies outside the field of nursing report differences in journal coverage and quality. A comparative analysis of nursing publications reported impact. Journal coverage by each database for the field of nursing was compared. Additionally, publications by 2014 nursing faculty were collected in both databases and compared for overall coverage and reported quality, as modeled by Scimajo Journal Rank, peer review status, and MEDLINE inclusion. Individual author impact, modeled by the h-index, was calculated by each database for comparison. Scopus offered significantly higher journal coverage. For 2014 faculty publications, 100% of journals were found in Scopus, Web of Science offered 82%. No significant difference was found in the quality of reported journals. Author h-index was found to be higher in Scopus. When reporting faculty publications and scholarly impact, academic nursing programs may be better represented by Scopus, without compromising journal quality. Programs with strong interdisciplinary work should examine all areas of strength to ensure appropriate coverage. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Development of SRS.php, a Simple Object Access Protocol-based library for data acquisition from integrated biological databases.

    Science.gov (United States)

    Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R

    2007-12-11

    Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner that distinct information is accessed by users. The evolution of data representation from HTML to XML enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits the programmatic communication through webservices with the SRS. The interactions are possible by invoking the methods described in WSDL by exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. The inclusion of the described functions in the source of scripts written in PHP enables them as webservice clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the proposal of SRS.php library usage is to enable the data acquisition for the further warehousing tasks related to its setup and maintenance.

  18. Database Description - Plabrain DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available elopmental Biology, Department of Biophysics, Division of Biological Sciences, Gr...: Original website information Database maintenance site Laboratory for Molecular Developmental Biology Department of Biophysics

  19. Ionizing radiation induced biological response and its public health implication

    International Nuclear Information System (INIS)

    Koeteles, Gy.

    1994-01-01

    Several sources of ionizing radiation exist in natural and artificial environment of humanity. An overview of their biological effects and the biological response of man is present. Emphasize is given to the differences caused by high and low doses. The interrelation of radiology, radiation hygiene and public health is pointed out. Especially, the physical and biological effects of radiation on cells and their responses are discussed in more detail. (R.P.)

  20. Women are underrepresented in computational biology: An analysis of the scholarly literature in biology, computer science and computational biology.

    Directory of Open Access Journals (Sweden)

    Kevin S Bonham

    2017-10-01

    Full Text Available While women are generally underrepresented in STEM fields, there are noticeable differences between fields. For instance, the gender ratio in biology is more balanced than in computer science. We were interested in how this difference is reflected in the interdisciplinary field of computational/quantitative biology. To this end, we examined the proportion of female authors in publications from the PubMed and arXiv databases. There are fewer female authors on research papers in computational biology, as compared to biology in general. This is true across authorship position, year, and journal impact factor. A comparison with arXiv shows that quantitative biology papers have a higher ratio of female authors than computer science papers, placing computational biology in between its two parent fields in terms of gender representation. Both in biology and in computational biology, a female last author increases the probability of other authors on the paper being female, pointing to a potential role of female PIs in influencing the gender balance.

  1. Women are underrepresented in computational biology: An analysis of the scholarly literature in biology, computer science and computational biology.

    Science.gov (United States)

    Bonham, Kevin S; Stefan, Melanie I

    2017-10-01

    While women are generally underrepresented in STEM fields, there are noticeable differences between fields. For instance, the gender ratio in biology is more balanced than in computer science. We were interested in how this difference is reflected in the interdisciplinary field of computational/quantitative biology. To this end, we examined the proportion of female authors in publications from the PubMed and arXiv databases. There are fewer female authors on research papers in computational biology, as compared to biology in general. This is true across authorship position, year, and journal impact factor. A comparison with arXiv shows that quantitative biology papers have a higher ratio of female authors than computer science papers, placing computational biology in between its two parent fields in terms of gender representation. Both in biology and in computational biology, a female last author increases the probability of other authors on the paper being female, pointing to a potential role of female PIs in influencing the gender balance.

  2. EuroFIR-BASIS - a combined composition and biological activity database for bioactive compounds in plant-based foods

    DEFF Research Database (Denmark)

    Gry, Jørn; Black, Lucinda; Eriksen, Folmer Damsted

    2007-01-01

    Mounting evidence suggests that certain non-nutrient bioactive compounds promote optimal human health and reduce the risk of chronic disease. An Internet-deployed database, EuroFIR-BASIS, which uniquely combines food composition and biological effects data for plant-based bioactive compounds......, is being developed. The database covers multiple compound classes and 330 major food plants and their edible parts with data sourced from quality-assessed, peer-reviewed literature. The database will be a valuable resource for food regulatory and advisory bodies, risk authorities, epidemiologists...... and researchers interested in diet and health relationships, and product developers within the food industry....

  3. Database Resources of the BIG Data Center in 2018.

    Science.gov (United States)

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Biofuel Database

    Science.gov (United States)

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  5. IAEA biological reference materials

    International Nuclear Information System (INIS)

    Parr, R.M.; Schelenz, R.; Ballestra, S.

    1988-01-01

    The Analytical Quality Control Services programme of the IAEA encompasses a wide variety of intercomparisons and reference materials. This paper reviews only those aspects of the subject having to do with biological reference materials. The 1988 programme foresees 13 new intercomparison exercises, one for major, minor and trace elements, five for radionuclides, and seven for stable isotopes. Twenty-two natural matrix biological reference materials are available: twelve for major, minor and trace elements, six for radionuclides, and four for chlorinated hydrocarbons. Seven new intercomparisons and reference materials are in preparation or under active consideration. Guidelines on the correct use of reference materials are being prepared for publication in 1989 in consultation with other major international producers and users of biological reference materials. The IAEA database on available reference materials is being updated and expanded in scope, and a new publication is planned for 1989. (orig.)

  6. Data publication with the structural biology data grid supports live analysis

    NARCIS (Netherlands)

    Meyer, Peter A.; Socias, Stephanie; Key, Jason; Ransey, Elizabeth; Tjon, Emily C.; Buschiazzo, Alejandro; Lei, Ming; Botka, Chris; Withrow, James; Neau, David; Rajashankar, Kanagalaghatta; Anderson, Karen S.; Baxter, Richard H.; Blacklow, Stephen C.; Boggon, Titus J.; Bonvin, Alexandre M J J|info:eu-repo/dai/nl/113691238; Borek, Dominika; Brett, Tom J.; Caflisch, Amedeo; Chang, Chung I.; Chazin, Walter J.; Corbett, Kevin D.; Cosgrove, Michael S.; Crosson, Sean; Dhe-Paganon, Sirano; Di Cera, Enrico; Drennan, Catherine L.; Eck, Michael J.; Eichman, Brandt F.; Fan, Qing R.; Ferré-D'Amaré, Adrian R.; Fromme, J. Christopher; Garcia, K. Christopher; Gaudet, Rachelle; Gong, Peng; Harrison, Stephen C.; Heldwein, Ekaterina E.; Jia, Zongchao; Keenan, Robert J.; Kruse, Andrew C.; Kvansakul, Marc; McLellan, Jason S.; Modis, Yorgo; Nam, Yunsun; Otwinowski, Zbyszek; Pai, Emil F.; Pereira, Pedro José Barbosa; Petosa, Carlo; Raman, C. S.; Rapoport, Tom A.; Roll-Mecak, Antonina; Rosen, Michael K.; Rudenko, Gabby; Schlessinger, Joseph; Schwartz, Thomas U.; Shamoo, Yousif; Sondermann, Holger; Tao, Yizhi J.; Tolia, Niraj H.; Tsodikov, Oleg V.; Westover, Kenneth D.; Wu, Hao; Foster, Ian; Fraser, James S.; Maia, Filipe R N C; Gonen, Tamir; Kirchhausen, Tom; Diederichs, Kay; Crosas, Mercé; Sliz, Piotr

    2016-01-01

    Access to experimental X-ray diffraction image data is fundamental for validation and reproduction of macromolecular models and indispensable for development of structural biology processing methods. Here, we established a diffraction data publication and dissemination system, Structural Biology

  7. Exploiting publicly available biological and biochemical information for the discovery of novel short linear motifs.

    KAUST Repository

    Sayadi, Ahmed

    2011-07-20

    The function of proteins is often mediated by short linear segments of their amino acid sequence, called Short Linear Motifs or SLiMs, the identification of which can provide important information about a protein function. However, the short length of the motifs and their variable degree of conservation makes their identification hard since it is difficult to correctly estimate the statistical significance of their occurrence. Consequently, only a small fraction of them have been discovered so far. We describe here an approach for the discovery of SLiMs based on their occurrence in evolutionarily unrelated proteins belonging to the same biological, signalling or metabolic pathway and give specific examples of its effectiveness in both rediscovering known motifs and in discovering novel ones. An automatic implementation of the procedure, available for download, allows significant motifs to be identified, automatically annotated with functional, evolutionary and structural information and organized in a database that can be inspected and queried. An instance of the database populated with pre-computed data on seven organisms is accessible through a publicly available server and we believe it constitutes by itself a useful resource for the life sciences (http://www.biocomputing.it/modipath).

  8. Databases in the fields of toxicology, occupational and environmental health at DIMDI

    International Nuclear Information System (INIS)

    Bystrich, E.

    1993-01-01

    DIMDI, the German Institute for Medical Documentation and Information, is a governmental institute and affiliated to the Federal Ministry for Health. It was founded in 1969 in Cologne. At present DIMDI hosts about seventy international and national bibliographic and factual databases in the field of biosciences, such as medicine, public health, pharmacology, toxicology, occupational and environmental health, nutrition, biology, psychology, sociology, sports, and agricultural sciences. The most important databases with toxicological and ecotoxicological information, which contain data useful for managers of chemical and nucelar power plants are the factual databases HSDB, ECDIN, SIGEDA, RTECS, and CCRIS, and the bibliographic databases TOXALL, ENVIROLINE, SCISEARCH, MEDLINE, EMBASE, and BIOSIS PREVIEWS. (orig.)

  9. Toward computational cumulative biology by combining models of biological datasets.

    Science.gov (United States)

    Faisal, Ali; Peltonen, Jaakko; Georgii, Elisabeth; Rung, Johan; Kaski, Samuel

    2014-01-01

    A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations-for instance, between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.

  10. Development of a Publicly Available, Comprehensive Database of Fiber and Health Outcomes: Rationale and Methods.

    Directory of Open Access Journals (Sweden)

    Kara A Livingston

    Full Text Available Dietary fiber is a broad category of compounds historically defined as partially or completely indigestible plant-based carbohydrates and lignin with, more recently, the additional criteria that fibers incorporated into foods as additives should demonstrate functional human health outcomes to receive a fiber classification. Thousands of research studies have been published examining fibers and health outcomes.(1 Develop a database listing studies testing fiber and physiological health outcomes identified by experts at the Ninth Vahouny Conference; (2 Use evidence mapping methodology to summarize this body of literature. This paper summarizes the rationale, methodology, and resulting database. The database will help both scientists and policy-makers to evaluate evidence linking specific fibers with physiological health outcomes, and identify missing information.To build this database, we conducted a systematic literature search for human intervention studies published in English from 1946 to May 2015. Our search strategy included a broad definition of fiber search terms, as well as search terms for nine physiological health outcomes identified at the Ninth Vahouny Fiber Symposium. Abstracts were screened using a priori defined eligibility criteria and a low threshold for inclusion to minimize the likelihood of rejecting articles of interest. Publications then were reviewed in full text, applying additional a priori defined exclusion criteria. The database was built and published on the Systematic Review Data Repository (SRDR™, a web-based, publicly available application.A fiber database was created. This resource will reduce the unnecessary replication of effort in conducting systematic reviews by serving as both a central database archiving PICO (population, intervention, comparator, outcome data on published studies and as a searchable tool through which this data can be extracted and updated.

  11. A Public Database of Memory and Naive B-Cell Receptor Sequences.

    Directory of Open Access Journals (Sweden)

    William S DeWitt

    Full Text Available The vast diversity of B-cell receptors (BCR and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics.

  12. A Public Database of Memory and Naive B-Cell Receptor Sequences.

    Science.gov (United States)

    DeWitt, William S; Lindau, Paul; Snyder, Thomas M; Sherwood, Anna M; Vignali, Marissa; Carlson, Christopher S; Greenberg, Philip D; Duerkopp, Natalie; Emerson, Ryan O; Robins, Harlan S

    2016-01-01

    The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics.

  13. Freshwater Biological Traits Database (Traits)

    Science.gov (United States)

    The traits database was compiled for a project on climate change effects on river and stream ecosystems. The traits data, gathered from multiple sources, focused on information published or otherwise well-documented by trustworthy sources.

  14. Current Comparative Table (CCT) automates customized searches of dynamic biological databases.

    Science.gov (United States)

    Landsteiner, Benjamin R; Olson, Michael R; Rutherford, Robert

    2005-07-01

    The Current Comparative Table (CCT) software program enables working biologists to automate customized bioinformatics searches, typically of remote sequence or HMM (hidden Markov model) databases. CCT currently supports BLAST, hmmpfam and other programs useful for gene and ortholog identification. The software is web based, has a BioPerl core and can be used remotely via a browser or locally on Mac OS X or Linux machines. CCT is particularly useful to scientists who study large sets of molecules in today's evolving information landscape because it color-codes all result files by age and highlights even tiny changes in sequence or annotation. By empowering non-bioinformaticians to automate custom searches and examine current results in context at a glance, CCT allows a remote database submission in the evening to influence the next morning's bench experiment. A demonstration of CCT is available at http://orb.public.stolaf.edu/CCTdemo and the open source software is freely available from http://sourceforge.net/projects/orb-cct.

  15. The SBOL Stack: A Platform for Storing, Publishing, and Sharing Synthetic Biology Designs.

    Science.gov (United States)

    Madsen, Curtis; McLaughlin, James Alastair; Mısırlı, Göksel; Pocock, Matthew; Flanagan, Keith; Hallinan, Jennifer; Wipat, Anil

    2016-06-17

    Recently, synthetic biologists have developed the Synthetic Biology Open Language (SBOL), a data exchange standard for descriptions of genetic parts, devices, modules, and systems. The goals of this standard are to allow scientists to exchange designs of biological parts and systems, to facilitate the storage of genetic designs in repositories, and to facilitate the description of genetic designs in publications. In order to achieve these goals, the development of an infrastructure to store, retrieve, and exchange SBOL data is necessary. To address this problem, we have developed the SBOL Stack, a Resource Description Framework (RDF) database specifically designed for the storage, integration, and publication of SBOL data. This database allows users to define a library of synthetic parts and designs as a service, to share SBOL data with collaborators, and to store designs of biological systems locally. The database also allows external data sources to be integrated by mapping them to the SBOL data model. The SBOL Stack includes two Web interfaces: the SBOL Stack API and SynBioHub. While the former is designed for developers, the latter allows users to upload new SBOL biological designs, download SBOL documents, search by keyword, and visualize SBOL data. Since the SBOL Stack is based on semantic Web technology, the inherent distributed querying functionality of RDF databases can be used to allow different SBOL stack databases to be queried simultaneously, and therefore, data can be shared between different institutes, centers, or other users.

  16. Subject and authorship of records related to the Organization for Tropical Studies (OTS) in BINABITROP, a comprehensive database about Costa Rican biology.

    Science.gov (United States)

    Monge-Nájera, Julián; Nielsen-Muñoz, Vanessa; Azofeifa-Mora, Ana Beatriz

    2013-06-01

    BINABITROP is a bibliographical database of more than 38000 records about the ecosystems and organisms of Costa Rica. In contrast with commercial databases, such as Web of Knowledge and Scopus, which exclude most of the scientific journals published in tropical countries, BINABITROP is a comprehensive record of knowledge on the tropical ecosystems and organisms of Costa Rica. We analyzed its contents in three sites (La Selva, Palo Verde and Las Cruces) and recorded scientific field, taxonomic group and authorship. We found that most records dealt with ecology and systematics, and that most authors published only one article in the study period (1963-2011). Most research was published in four journals: Biotropica, Revista de Biología Tropical/ International Journal of Tropical Biology and Conservation, Zootaxa and Brenesia. This may be the first study of a such a comprehensive database for any case of tropical biology literature.

  17. Subject and authorship of records related to the Organization for Tropical Studies (OTS in BINABITROP, a comprehensive database about Costa Rican biology

    Directory of Open Access Journals (Sweden)

    Julián Monge-Nájera

    2013-06-01

    Full Text Available BINABITROP is a bibliographical database of more than 38 000 records about the ecosystems and organisms of Costa Rica. In contrast with commercial databases, such as Web of Knowledge and Scopus, which exclude most of the scientific journals published in tropical countries, BINABITROP is a comprehensive record of knowledge on the tropical ecosystems and organisms of Costa Rica. We analyzed its contents in three sites (La Selva, Palo Verde and Las Cruces and recorded scientific field, taxonomic group and authorship. We found that most records dealt with ecology and systematics, and that most authors published only one article in the study period (1963-2011. Most research was published in four journals: Biotropica, Revista de Biología Tropical/ International Journal of Tropical Biology and Conservation, Zootaxa and Brenesia. This may be the first study of a such a comprehensive database for any case of tropical biology literature.

  18. Database Description - eSOL | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name eSOL Alternative nam...eator Affiliation: The Research and Development of Biological Databases Project, National Institute of Genet...nology 4259 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501 Japan Email: Tel.: +81-45-924-5785 Database... classification Protein sequence databases - Protein properties Organism Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database...i U S A. 2009 Mar 17;106(11):4201-6. External Links: Original website information Database maintenance site

  19. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases.

    Science.gov (United States)

    Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel

    2013-04-15

    In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.

  20. Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

    Science.gov (United States)

    Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

    2013-08-01

    Abstract Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association

  1. Extracting reaction networks from databases-opening Pandora's box.

    Science.gov (United States)

    Fearnley, Liam G; Davis, Melissa J; Ragan, Mark A; Nielsen, Lars K

    2014-11-01

    Large quantities of information describing the mechanisms of biological pathways continue to be collected in publicly available databases. At the same time, experiments have increased in scale, and biologists increasingly use pathways defined in online databases to interpret the results of experiments and generate hypotheses. Emerging computational techniques that exploit the rich biological information captured in reaction systems require formal standardized descriptions of pathways to extract these reaction networks and avoid the alternative: time-consuming and largely manual literature-based network reconstruction. Here, we systematically evaluate the effects of commonly used knowledge representations on the seemingly simple task of extracting a reaction network describing signal transduction from a pathway database. We show that this process is in fact surprisingly difficult, and the pathway representations adopted by various knowledge bases have dramatic consequences for reaction network extraction, connectivity, capture of pathway crosstalk and in the modelling of cell-cell interactions. Researchers constructing computational models built from automatically extracted reaction networks must therefore consider the issues we outline in this review to maximize the value of existing pathway knowledge. © The Author 2013. Published by Oxford University Press.

  2. Fishery Biology Database (AGDBS)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Basic biological data are the foundation on which all assessments of fisheries resources are built. These include parameters such as the size and age composition of...

  3. The Mouse Tumor Biology Database: A Comprehensive Resource for Mouse Models of Human Cancer.

    Science.gov (United States)

    Krupke, Debra M; Begley, Dale A; Sundberg, John P; Richardson, Joel E; Neuhauser, Steven B; Bult, Carol J

    2017-11-01

    Research using laboratory mice has led to fundamental insights into the molecular genetic processes that govern cancer initiation, progression, and treatment response. Although thousands of scientific articles have been published about mouse models of human cancer, collating information and data for a specific model is hampered by the fact that many authors do not adhere to existing annotation standards when describing models. The interpretation of experimental results in mouse models can also be confounded when researchers do not factor in the effect of genetic background on tumor biology. The Mouse Tumor Biology (MTB) database is an expertly curated, comprehensive compendium of mouse models of human cancer. Through the enforcement of nomenclature and related annotation standards, MTB supports aggregation of data about a cancer model from diverse sources and assessment of how genetic background of a mouse strain influences the biological properties of a specific tumor type and model utility. Cancer Res; 77(21); e67-70. ©2017 AACR . ©2017 American Association for Cancer Research.

  4. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases

    Science.gov (United States)

    2013-01-01

    Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic. PMID:23586394

  5. The catfish genome database cBARBEL: an informatic platform for genome biology of ictalurid catfish.

    Science.gov (United States)

    Lu, Jianguo; Peatman, Eric; Yang, Qing; Wang, Shaolin; Hu, Zhiliang; Reecy, James; Kucuktas, Huseyin; Liu, Zhanjiang

    2011-01-01

    The catfish genome database, cBARBEL (abbreviated from catfish Breeder And Researcher Bioinformatics Entry Location) is an online open-access database for genome biology of ictalurid catfish (Ictalurus spp.). It serves as a comprehensive, integrative platform for all aspects of catfish genetics, genomics and related data resources. cBARBEL provides BLAST-based, fuzzy and specific search functions, visualization of catfish linkage, physical and integrated maps, a catfish EST contig viewer with SNP information overlay, and GBrowse-based organization of catfish genomic data based on sequence similarity with zebrafish chromosomes. Subsections of the database are tightly related, allowing a user with a sequence or search string of interest to navigate seamlessly from one area to another. As catfish genome sequencing proceeds and ongoing quantitative trait loci (QTL) projects bear fruit, cBARBEL will allow rapid data integration and dissemination within the catfish research community and to interested stakeholders. cBARBEL can be accessed at http://catfishgenome.org.

  6. Assessing the quality of life history information in publicly available databases.

    Science.gov (United States)

    Thorson, James T; Cope, Jason M; Patrick, Wesley S

    2014-01-01

    Single-species life history parameters are central to ecological research and management, including the fields of macro-ecology, fisheries science, and ecosystem modeling. However, there has been little independent evaluation of the precision and accuracy of the life history values in global and publicly available databases. We therefore develop a novel method based on a Bayesian errors-in-variables model that compares database entries with estimates from local experts, and we illustrate this process by assessing the accuracy and precision of entries in FishBase, one of the largest and oldest life history databases. This model distinguishes biases among seven life history parameters, two types of information available in FishBase (i.e., published values and those estimated from other parameters), and two taxa (i.e., bony and cartilaginous fishes) relative to values from regional experts in the United States, while accounting for additional variance caused by sex- and region-specific life history traits. For published values in FishBase, the model identifies a small positive bias in natural mortality and negative bias in maximum age, perhaps caused by unacknowledged mortality caused by fishing. For life history values calculated by FishBase, the model identified large and inconsistent biases. The model also demonstrates greatest precision for body size parameters, decreased precision for values derived from geographically distant populations, and greatest between-sex differences in age at maturity. We recommend that our bias and precision estimates be used in future errors-in-variables models as a prior on measurement errors. This approach is broadly applicable to global databases of life history traits and, if used, will encourage further development and improvements in these databases.

  7. SoyDB: a knowledge database of soybean transcription factors

    Directory of Open Access Journals (Sweden)

    Valliyodan Babu

    2010-01-01

    Full Text Available Abstract Background Transcription factors play the crucial rule of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding their functions and mechanisms. In this article, we present SoyDB, a user friendly database containing comprehensive knowledge of soybean transcription factors. Description The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB - a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB, protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.

  8. Network Biology (http://www.iaees.org/publications/journals/nb/online-version.asp

    Directory of Open Access Journals (Sweden)

    networkbiology@iaees.org

    Full Text Available Network Biology ISSN 2220-8879 URL: http://www.iaees.org/publications/journals/nb/online-version.asp RSS: http://www.iaees.org/publications/journals/nb/rss.xml E-mail: networkbiology@iaees.org Editor-in-Chief: WenJun Zhang Aims and Scope NETWORK BIOLOGY (ISSN 2220-8879; CODEN NBEICS is an open access, peer-reviewed international journal that considers scientific articles in all different areas of network biology. It is the transactions of the International Society of Network Biology. It dedicates to the latest advances in network biology. The goal of this journal is to keep a record of the state-of-the-art research and promote the research work in these fast moving areas. The topics to be covered by Network Biology include, but are not limited to: •Theories, algorithms and programs of network analysis •Innovations and applications of biological networks •Ecological networks, food webs and natural equilibrium •Co-evolution, co-extinction, biodiversity conservation •Metabolic networks, protein-protein interaction networks, biochemical reaction networks, gene networks, transcriptional regulatory networks, cell cycle networks, phylogenetic networks, network motifs •Physiological networks •Network regulation of metabolic processes, human diseases and ecological systems •Social networks, epidemiological networks •System complexity, self-organized systems, emergence of biological systems, agent-based modeling, individual-based modeling, neural network modeling, and other network-based modeling, etc. We are also interested in short communications that clearly address a specific issue or completely present a new ecological network, food web, or metabolic or gene network, etc. Authors can submit their works to the email box of this journal, networkbiology@iaees.org. All manuscripts submitted to this journal must be previously unpublished and may not be considered for publication elsewhere at any time during review period of this journal

  9. The NCBI BioSystems database.

    Science.gov (United States)

    Geer, Lewis Y; Marchler-Bauer, Aron; Geer, Renata C; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H

    2010-01-01

    The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI's Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets.

  10. Analysis of undergraduate cell biology contents in Brazilian public universities.

    Science.gov (United States)

    Mermelstein, Claudia; Costa, Manoel Luis

    2017-04-01

    The enormous amount of information available in cell biology has created a challenge in selecting the core concepts we should be teaching our undergraduates. One way to define a set of essential core ideas in cell biology is to analyze what a specific cell biology community is teaching their students. Our main objective was to analyze the cell biology content currently being taught in Brazilian universities. We collected the syllabi of cell biology courses from public universities in Brazil and analyzed the frequency of cell biology topics in each course. We also compared the Brazilian data with the contents of a major cell biology textbook. Our analysis showed that while some cell biology topics such as plasma membrane and cytoskeleton was present in ∼100% of the Brazilian curricula analyzed others such as cell signaling and cell differentiation were present in only ∼35%. The average cell biology content taught in the Brazilian universities is quite different from what is presented in the textbook. We discuss several possible explanations for these observations. We also suggest a list with essential cell biology topics for any biological or biomedical undergraduate course. The comparative discussion of cell biology topics presented here could be valuable in other educational contexts. © 2017 The Authors. Cell Biology International Published by John Wiley & Sons Ltd on behalf of International Federation of Cell Biology.

  11. Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants.

    Directory of Open Access Journals (Sweden)

    Steven C Bagley

    2016-04-01

    Full Text Available Patterns of disease co-occurrence that deviate from statistical independence may represent important constraints on biological mechanism, which sometimes can be explained by shared genetics. In this work we study the relationship between disease co-occurrence and commonly shared genetic architecture of disease. Records of pairs of diseases were combined from two different electronic medical systems (Columbia, Stanford, and compared to a large database of published disease-associated genetic variants (VARIMED; data on 35 disorders were available across all three sources, which include medical records for over 1.2 million patients and variants from over 17,000 publications. Based on the sources in which they appeared, disease pairs were categorized as having predominant clinical, genetic, or both kinds of manifestations. Confounding effects of age on disease incidence were controlled for by only comparing diseases when they fall in the same cluster of similarly shaped incidence patterns. We find that disease pairs that are overrepresented in both electronic medical record systems and in VARIMED come from two main disease classes, autoimmune and neuropsychiatric. We furthermore identify specific genes that are shared within these disease groups.

  12. A spatial national health facility database for public health sector planning in Kenya in 2008

    Directory of Open Access Journals (Sweden)

    Gething Peter W

    2009-03-01

    Full Text Available Abstract Background Efforts to tackle the enormous burden of ill-health in low-income countries are hampered by weak health information infrastructures that do not support appropriate planning and resource allocation. For health information systems to function well, a reliable inventory of health service providers is critical. The spatial referencing of service providers to allow their representation in a geographic information system is vital if the full planning potential of such data is to be realized. Methods A disparate series of contemporary lists of health service providers were used to update a public health facility database of Kenya last compiled in 2003. These new lists were derived primarily through the national distribution of antimalarial and antiretroviral commodities since 2006. A combination of methods, including global positioning systems, was used to map service providers. These spatially-referenced data were combined with high-resolution population maps to analyze disparity in geographic access to public health care. Findings The updated 2008 database contained 5,334 public health facilities (67% ministry of health; 28% mission and nongovernmental organizations; 2% local authorities; and 3% employers and other ministries. This represented an overall increase of 1,862 facilities compared to 2003. Most of the additional facilities belonged to the ministry of health (79% and the majority were dispensaries (91%. 93% of the health facilities were spatially referenced, 38% using global positioning systems compared to 21% in 2003. 89% of the population was within 5 km Euclidean distance to a public health facility in 2008 compared to 71% in 2003. Over 80% of the population outside 5 km of public health service providers was in the sparsely settled pastoralist areas of the country. Conclusion We have shown that, with concerted effort, a relatively complete inventory of mapped health services is possible with enormous potential for

  13. A spatial national health facility database for public health sector planning in Kenya in 2008.

    Science.gov (United States)

    Noor, Abdisalan M; Alegana, Victor A; Gething, Peter W; Snow, Robert W

    2009-03-06

    Efforts to tackle the enormous burden of ill-health in low-income countries are hampered by weak health information infrastructures that do not support appropriate planning and resource allocation. For health information systems to function well, a reliable inventory of health service providers is critical. The spatial referencing of service providers to allow their representation in a geographic information system is vital if the full planning potential of such data is to be realized. A disparate series of contemporary lists of health service providers were used to update a public health facility database of Kenya last compiled in 2003. These new lists were derived primarily through the national distribution of antimalarial and antiretroviral commodities since 2006. A combination of methods, including global positioning systems, was used to map service providers. These spatially-referenced data were combined with high-resolution population maps to analyze disparity in geographic access to public health care. The updated 2008 database contained 5,334 public health facilities (67% ministry of health; 28% mission and nongovernmental organizations; 2% local authorities; and 3% employers and other ministries). This represented an overall increase of 1,862 facilities compared to 2003. Most of the additional facilities belonged to the ministry of health (79%) and the majority were dispensaries (91%). 93% of the health facilities were spatially referenced, 38% using global positioning systems compared to 21% in 2003. 89% of the population was within 5 km Euclidean distance to a public health facility in 2008 compared to 71% in 2003. Over 80% of the population outside 5 km of public health service providers was in the sparsely settled pastoralist areas of the country. We have shown that, with concerted effort, a relatively complete inventory of mapped health services is possible with enormous potential for improving planning. Expansion in public health care in Kenya has

  14. From 20th century metabolic wall charts to 21st century systems biology: database of mammalian metabolic enzymes.

    Science.gov (United States)

    Corcoran, Callan C; Grady, Cameron R; Pisitkun, Trairak; Parulekar, Jaya; Knepper, Mark A

    2017-03-01

    The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database ( https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database ( Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. Copyright © 2017 the American Physiological Society.

  15. BIOSPIDA: A Relational Database Translator for NCBI.

    Science.gov (United States)

    Hagen, Matthew S; Lee, Eva K

    2010-11-13

    As the volume and availability of biological databases continue widespread growth, it has become increasingly difficult for research scientists to identify all relevant information for biological entities of interest. Details of nucleotide sequences, gene expression, molecular interactions, and three-dimensional structures are maintained across many different databases. To retrieve all necessary information requires an integrated system that can query multiple databases with minimized overhead. This paper introduces a universal parser and relational schema translator that can be utilized for all NCBI databases in Abstract Syntax Notation (ASN.1). The data models for OMIM, Entrez-Gene, Pubmed, MMDB and GenBank have been successfully converted into relational databases and all are easily linkable helping to answer complex biological questions. These tools facilitate research scientists to locally integrate databases from NCBI without significant workload or development time.

  16. Toward public volume database management: a case study of NOVA, the National Online Volumetric Archive

    Science.gov (United States)

    Fletcher, Alex; Yoo, Terry S.

    2004-04-01

    Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.

  17. Databases and their application

    NARCIS (Netherlands)

    Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.

    2013-01-01

    During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The

  18. Communicating Synthetic Biology: from the lab via the media to the broader public.

    Science.gov (United States)

    Kronberger, Nicole; Holtz, Peter; Kerbe, Wolfgang; Strasser, Ewald; Wagner, Wolfgang

    2009-12-01

    We present insights from a study on communicating Synthetic Biology conducted in 2008. Scientists were invited to write press releases on their work; the resulting texts were passed on to four journalists from major Austrian newspapers and magazines. The journalists in turn wrote articles that were used as stimulus material for eight group discussions with select members of the Austrian public. The results show that, from the lab via the media to the general public, communication is characterized by two important tendencies: first, communication becomes increasingly focused on concrete applications of Synthetic Biology; and second, biotechnology represents an important benchmark against which Synthetic Biology is being evaluated.

  19. A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology

    DEFF Research Database (Denmark)

    Herrgard, Markus; Swainston, Neil; Dobson, Paul

    2008-01-01

    and in a manner that permits automated reasoning. The reconstruction is readily available via a publicly accessible database and in the Systems Biology Markup Language (http://www.comp-sys-bio.org/yeastnet). It can be maintained as a resource that serves as a common denominator for studying the systems biology...

  20. Occupational accidents involving biological material among public health workers.

    Science.gov (United States)

    Chiodi, Mônica Bonagamba; Marziale, Maria Helena Palucci; Robazzi, Maria Lúcia do Carmo Cruz

    2007-01-01

    This descriptive research aimed to recognize the occurrence of work accidents (WA) involving exposure to biological material among health workers at Public Health Units in Ribeirão Preto-SP, Brazil. A quantitative approach was adopted. In 2004, 155 accidents were notified by means of the Work Accident Communication (WAC). Sixty-two accidents (40%) involved exposure to biological material that could cause infections like Hepatitis and Aids. The highest number of victims (42 accidents) came from the category of nursing aids and technicians. Needles were responsible for 80.6% of accidents and blood was the biological material involved in a majority of occupational exposure cases. This subject needs greater attention, so that prevention measures can be implemented, which consider the peculiarities of the activities carried out by the different professional categories.

  1. BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

    Science.gov (United States)

    Meyer, Michael J; Geske, Philip; Yu, Haiyuan

    2016-05-15

    Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency-molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data is not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE). haiyuan.yu@cornell.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Reusable data in public health data-bases-problems encountered in Danish Children's Database.

    Science.gov (United States)

    Høstgaard, Anna Marie; Pape-Haugaard, Louise

    2012-01-01

    Denmark have unique health informatics databases e.g. "The Children's Database", which since 2009 holds data on all Danish children from birth until 17 years of age. In the current set-up a number of potential sources of errors exist - both technical and human-which means that the data is flawed. This gives rise to erroneous statistics and makes the data unsuitable for research purposes. In order to make the data usable, it is necessary to develop new methods for validating the data generation process at the municipal/regional/national level. In the present ongoing research project, two research areas are combined: Public Health Informatics and Computer Science, and both ethnographic as well as system engineering research methods are used. The project is expected to generate new generic methods and knowledge about electronic data collection and transmission in different social contexts and by different social groups and thus to be of international importance, since this is sparsely documented in the Public Health Informatics perspective. This paper presents the preliminary results, which indicate that health information technology used ought to be subject for redesign, where a thorough insight into the work practices should be point of departure.

  3. Xylella fastidiosa comparative genomic database is an information resource to explore the annotation, genomic features, and biology of different strains

    Directory of Open Access Journals (Sweden)

    Alessandro M. Varani

    2012-01-01

    Full Text Available The Xylella fastidiosa comparative genomic database is a scientific resource with the aim to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high quality genomic annotation, the functional and comparative genomic analysis and the identification and mapping of prophage-like elements. It is available from web site http://www.xylella.lncc.br.

  4. VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine).

    Science.gov (United States)

    Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M

    2013-12-16

    Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis

  5. Challenges with the financial reporting of biological assets by public entities in South Africa

    Directory of Open Access Journals (Sweden)

    Deon Scott

    2016-03-01

    Full Text Available Fair value accounting of biological assets in the public sector was introduced with the adoption of the public-sector-specific accounting standard: Generally Recognised Accounting Practice (GRAP 101. The public sector currently reports on various bases of accounting. Public entities and municipalities report in terms of accrual accounting, and government departments report on the modified cash basis. The lack of a uniform basis of accounting impedes the comparability of financial information. The implementation of GRAP 101 in the public sector is important in facilitating comparability of financial information regarding biological assets. This paper is based on a content analysis of the annual reports of 10 relevant public entities in South Africa and specifically details the challenges that public entities encounter with the application of GRAP 101. These challenges, and how they were addressed by a public entity that adopted and applied GRAP 101, namely the Accelerated and Shared Growth Initiative South Africa – Eastern Cape (AsgiSA-EC, are documented in this research.

  6. NREL: U.S. Life Cycle Inventory Database - About the LCI Database Project

    Science.gov (United States)

    About the LCI Database Project The U.S. Life Cycle Inventory (LCI) Database is a publicly available database that allows users to objectively review and compare analysis results that are based on similar source of critically reviewed LCI data through its LCI Database Project. NREL's High-Performance

  7. PathSys: integrating molecular interaction graphs for systems biology

    Directory of Open Access Journals (Sweden)

    Raval Alpan

    2006-02-01

    Full Text Available Abstract Background The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately. Results Here we present PathSys, a graph-based system for creating a combined database of networks of interaction for generating integrated view of biological mechanisms. We used PathSys to integrate over 14 curated and publicly contributed data sources for the budding yeast (S. cerevisiae and Gene Ontology. A number of exploratory questions were formulated as a combination of relational and graph-based queries to the integrated database. Thus, PathSys is a general-purpose, scalable, graph-data warehouse of biological information, complete with a graph manipulation and a query language, a storage mechanism and a generic data-importing mechanism through schema-mapping. Conclusion Results from several test studies demonstrate the effectiveness of the approach in retrieving biologically interesting relations between genes and proteins, the networks connecting them, and of the utility of PathSys as a scalable graph-based warehouse for interaction-network integration and a hypothesis generator system. The PathSys's client software, named BiologicalNetworks, developed for navigation and analyses of molecular networks, is available as a Java Web Start application at http://brak.sdsc.edu/pub/BiologicalNetworks.

  8. About DNA databasing and investigative genetic analysis of externally visible characteristics: A public survey.

    Science.gov (United States)

    Zieger, Martin; Utz, Silvia

    2015-07-01

    During the last decade, DNA profiling and the use of DNA databases have become two of the most employed instruments of police investigations. This very rapid establishment of forensic genetics is yet far from being complete. In the last few years novel types of analyses have been presented to describe phenotypically a possible perpetrator. We conducted the present study among German speaking Swiss residents for two main reasons: firstly, we aimed at getting an impression of the public awareness and acceptance of the Swiss DNA database and the perception of a hypothetical DNA database containing all Swiss residents. Secondly, we wanted to get a broader picture of how people that are not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not even know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, the acceptance of a hypothetical universal database would be only as low as about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we discover a high degree of acceptance. The acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would be in favor of the moderate use of physical traits analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study

  9. Computer-aided detection of pulmonary nodules: a comparative study using the public LIDC/IDRI database

    International Nuclear Information System (INIS)

    Jacobs, Colin; Prokop, Mathias; Rikxoort, Eva M. van; Ginneken, Bram van; Murphy, Keelin; Schaefer-Prokop, Cornelia M.

    2016-01-01

    To benchmark the performance of state-of-the-art computer-aided detection (CAD) of pulmonary nodules using the largest publicly available annotated CT database (LIDC/IDRI), and to show that CAD finds lesions not identified by the LIDC's four-fold double reading process. The LIDC/IDRI database contains 888 thoracic CT scans with a section thickness of 2.5 mm or lower. We report performance of two commercial and one academic CAD system. The influence of presence of contrast, section thickness, and reconstruction kernel on CAD performance was assessed. Four radiologists independently analyzed the false positive CAD marks of the best CAD system. The updated commercial CAD system showed the best performance with a sensitivity of 82 % at an average of 3.1 false positive detections per scan. Forty-five false positive CAD marks were scored as nodules by all four radiologists in our study. On the largest publicly available reference database for lung nodule detection in chest CT, the updated commercial CAD system locates the vast majority of pulmonary nodules at a low false positive rate. Potential for CAD is substantiated by the fact that it identifies pulmonary nodules that were not marked during the extensive four-fold LIDC annotation process. (orig.)

  10. The FREGAT biobank: a clinico-biological database dedicated to esophageal and gastric cancers.

    Science.gov (United States)

    Mariette, Christophe; Renaud, Florence; Piessen, Guillaume; Gele, Patrick; Copin, Marie-Christine; Leteurtre, Emmanuelle; Delaeter, Christine; Dib, Malek; Clisant, Stéphanie; Harter, Valentin; Bonnetain, Franck; Duhamel, Alain; Christophe, Véronique; Adenis, Antoine

    2018-02-06

    While the incidence of esophageal and gastric cancers is increasing, the prognosis of these cancers remains bleak. Endoscopy and surgery are the standard treatments for localized tumors, but multimodal treatments, associated chemotherapy, targeted therapies, immunotherapy, radiotherapy, and surgery are needed for the vast majority of patients who present with locally advanced or metastatic disease at diagnosis. Although survival has improved, most patients still present with advanced disease at diagnosis. In addition, most patients exhibit a poor or incomplete response to treatment, experience early recurrence and have an impaired quality of life. Compared with several other cancers, the therapeutic approach is not personalized, and research is much less developed. It is, therefore, urgent to hasten the development of research protocols, and consequently, develop a large, ambitious and innovative tool through which future scientific questions may be answered. This research must be patient-related so that rapid feedback to the bedside is achieved and should aim to identify clinical-, biological- and tumor-related factors that are associated with treatment resistance. Finally, this research should also seek to explain epidemiological and social facets of disease behavior. The prospective FREGAT database, established by the French National Cancer Institute, is focused on adult patients with carcinomas of the esophagus and stomach and on whatever might be the tumor stage or therapeutic strategy. The database includes epidemiological, clinical, and tumor characteristics data as well as follow-up, human and social sciences quality of life data, along with a tumor and serum bank. This innovative method of research will allow for the banking of millions of data for the development of excellent basic, translational and clinical research programs for esophageal and gastric cancer. This will ultimately improve general knowledge of these diseases, therapeutic strategies and

  11. Exploring public databases to characterize urban flood risks in Amsterdam

    Science.gov (United States)

    Gaitan, Santiago; ten Veldhuis, Marie-claire; van de Giesen, Nick

    2015-04-01

    Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to decide upon investment to reduce their impacts. Obvious flooding factors affecting flood risk include sewer systems performance and urban topography. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors such as spatially distributed rainfall and socioeconomic characteristics may help to explain probability and impacts of urban flooding. Several public databases were analyzed: complaints about flooding made by citizens, rainfall depths (15 min and 100 Ha spatio-temporal resolution), grids describing number of inhabitants, income, and housing price (1Ha and 25Ha resolution); and buildings age. Data analysis was done using Python and GIS programming, and included spatial indexing of data, cluster analysis, and multivariate regression on the complaints. Complaints were used as a proxy to characterize flooding impacts. The cluster analysis, run for all the variables except the complaints, grouped part of the grid-cells of central Amsterdam into a highly differentiated group, covering 10% of the analyzed area, and accounting for 25% of registered complaints. The configuration of the analyzed variables in central Amsterdam coincides with a high complaint count. Remaining complaints were evenly dispersed along other groups. An adjusted R2 of 0.38 in the multivariate regression suggests that explaining power can improve if additional variables are considered. While rainfall intensity explained 4% of the incidence of complaints, population density and building age significantly explained around 20% each. Data mining of public databases proved to be a valuable tool to identify factors explaining variability in occurrence of urban pluvial flooding, though additional variables must be considered to fully explain flood risk variability.

  12. UbiProt: a database of ubiquitylated proteins

    Directory of Open Access Journals (Sweden)

    Kondratieva Ekaterina V

    2007-04-01

    Full Text Available Abstract Background Post-translational protein modification with ubiquitin, or ubiquitylation, is one of the hottest topics in a modern biology due to a dramatic impact on diverse metabolic pathways and involvement in pathogenesis of severe human diseases. A great number of eukaryotic proteins was found to be ubiquitylated. However, data about particular ubiquitylated proteins are rather disembodied. Description To fill a general need for collecting and systematizing experimental data concerning ubiquitylation we have developed a new resource, UbiProt Database, a knowledgebase of ubiquitylated proteins. The database contains retrievable information about overall characteristics of a particular protein, ubiquitylation features, related ubiquitylation and de-ubiquitylation machinery and literature references reflecting experimental evidence of ubiquitylation. UbiProt is available at http://ubiprot.org.ru for free. Conclusion UbiProt Database is a public resource offering comprehensive information on ubiquitylated proteins. The resource can serve as a general reference source both for researchers in ubiquitin field and those who deal with particular ubiquitylated proteins which are of their interest. Further development of the UbiProt Database is expected to be of common interest for research groups involved in studies of the ubiquitin system.

  13. Variations in clinicopathologic characteristics of thyroid cancer among racial ethnic groups: analysis of a large public city hospital and the SEER database.

    Science.gov (United States)

    Moo-Young, Tricia A; Panergo, Jessel; Wang, Chih E; Patel, Subhash; Duh, Hong Yan; Winchester, David J; Prinz, Richard A; Fogelfeld, Leon

    2013-11-01

    Clinicopathologic variables influence the treatment and prognosis of patients with thyroid cancer. A retrospective analysis of public hospital thyroid cancer database and the Surveillance, Epidemiology and End Results 17 database was conducted. Demographic, clinical, and pathologic data were compared across ethnic groups. Within the public hospital database, Hispanics versus non-Hispanic whites were younger and had more lymph node involvement (34% vs 17%, P ethnic groups. Similar findings were demonstrated within the Surveillance, Epidemiology and End Results database. African Americans aged ethnic groups. Such disparities persist within an equal-access health care system. These findings suggest that factors beyond socioeconomics may contribute to such differences. Copyright © 2013 Elsevier Inc. All rights reserved.

  14. ChemProt: A disease chemical biology database

    DEFF Research Database (Denmark)

    Taboureau, Olivier; Oprea, Tudor I.

    2013-01-01

    The integration of chemistry, biology, and informatics to study drug actions across multiple biological targets, pathways, and biological systems is an emerging paradigm in drug discovery. Rather than reducing a complex system to simplistic models, fields such as chemogenomics and translational...... informatics are seeking to build a holistic model for a better understanding of the drug pharmacology and clinical effects. Here we will present a webserver called ChemProt that can assist, in silico, the drug actions in the context of cellular and disease networks and contribute in the field of disease...... chemical biology, drug repurposing, and off-target effects prediction....

  15. RaMP: A Comprehensive Relational Database of Metabolomics Pathways for Pathway Enrichment Analysis of Genes and Metabolites.

    Science.gov (United States)

    Zhang, Bofei; Hu, Senyang; Baskin, Elizabeth; Patt, Andrew; Siddiqui, Jalal K; Mathé, Ewy A

    2018-02-22

    The value of metabolomics in translational research is undeniable, and metabolomics data are increasingly generated in large cohorts. The functional interpretation of disease-associated metabolites though is difficult, and the biological mechanisms that underlie cell type or disease-specific metabolomics profiles are oftentimes unknown. To help fully exploit metabolomics data and to aid in its interpretation, analysis of metabolomics data with other complementary omics data, including transcriptomics, is helpful. To facilitate such analyses at a pathway level, we have developed RaMP (Relational database of Metabolomics Pathways), which combines biological pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, WikiPathways, and the Human Metabolome DataBase (HMDB). To the best of our knowledge, an off-the-shelf, public database that maps genes and metabolites to biochemical/disease pathways and can readily be integrated into other existing software is currently lacking. For consistent and comprehensive analysis, RaMP enables batch and complex queries (e.g., list all metabolites involved in glycolysis and lung cancer), can readily be integrated into pathway analysis tools, and supports pathway overrepresentation analysis given a list of genes and/or metabolites of interest. For usability, we have developed a RaMP R package (https://github.com/Mathelab/RaMP-DB), including a user-friendly RShiny web application, that supports basic simple and batch queries, pathway overrepresentation analysis given a list of genes or metabolites of interest, and network visualization of gene-metabolite relationships. The package also includes the raw database file (mysql dump), thereby providing a stand-alone downloadable framework for public use and integration with other tools. In addition, the Python code needed to recreate the database on another system is also publicly available (https://github.com/Mathelab/RaMP-BackEnd). Updates for databases in RaMP will be

  16. Toward a public analysis database for LHC new physics searches using M ADA NALYSIS 5

    Science.gov (United States)

    Dumont, B.; Fuks, B.; Kraml, S.; Bein, S.; Chalons, G.; Conte, E.; Kulkarni, S.; Sengupta, D.; Wymant, C.

    2015-02-01

    We present the implementation, in the MadAnalysis 5 framework, of several ATLAS and CMS searches for supersymmetry in data recorded during the first run of the LHC. We provide extensive details on the validation of our implementations and propose to create a public analysis database within this framework.

  17. Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador.

    Science.gov (United States)

    Kosseim, Patricia; Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

    2013-01-01

    To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research.

  18. STITCH 2: an interaction network database for small molecules and proteins

    DEFF Research Database (Denmark)

    Kuhn, Michael; Szklarczyk, Damian; Franceschini, Andrea

    2010-01-01

    Over the last years, the publicly available knowledge on interactions between small molecules and proteins has been steadily increasing. To create a network of interactions, STITCH aims to integrate the data dispersed over the literature and various databases of biological pathways, drug......-target relationships and binding affinities. In STITCH 2, the number of relevant interactions is increased by incorporation of BindingDB, PharmGKB and the Comparative Toxicogenomics Database. The resulting network can be explored interactively or used as the basis for large-scale analyses. To facilitate links to other...... chemical databases, we adopt InChIKeys that allow identification of chemicals with a short, checksum-like string. STITCH 2.0 connects proteins from 630 organisms to over 74,000 different chemicals, including 2200 drugs. STITCH can be accessed at http://stitch.embl.de/....

  19. Secure Encapsulation and Publication of Biological Services in the Cloud Computing Environment

    Science.gov (United States)

    Zhang, Weizhe; Wang, Xuehui; Lu, Bo; Kim, Tai-hoon

    2013-01-01

    Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved. PMID:24078906

  20. Secure Encapsulation and Publication of Biological Services in the Cloud Computing Environment

    Directory of Open Access Journals (Sweden)

    Weizhe Zhang

    2013-01-01

    Full Text Available Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved.

  1. Secure encapsulation and publication of biological services in the cloud computing environment.

    Science.gov (United States)

    Zhang, Weizhe; Wang, Xuehui; Lu, Bo; Kim, Tai-hoon

    2013-01-01

    Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved.

  2. Mediating objects: scientific and public functions of models in nineteenth-century biology.

    Science.gov (United States)

    Ludwig, David

    2013-01-01

    The aim of this article is to examine the scientific and public functions of two- and three-dimensional models in the context of three episodes from nineteenth-century biology. I argue that these models incorporate both data and theory by presenting theoretical assumptions in the light of concrete data or organizing data through theoretical assumptions. Despite their diverse roles in scientific practice, they all can be characterized as mediators between data and theory. Furthermore, I argue that these different mediating functions often reflect their different audiences that included specialized scientists, students, and the general public. In this sense, models in nineteenth-century biology can be understood as mediators between theory, data, and their diverse audiences.

  3. Current trends and new challenges of databases and web applications for systems driven biological research

    Directory of Open Access Journals (Sweden)

    Pradeep Kumar eSreenivasaiah

    2010-12-01

    Full Text Available Dynamic and rapidly evolving nature of systems driven research imposes special requirements on the technology, approach, design and architecture of computational infrastructure including database and web application. Several solutions have been proposed to meet the expectations and novel methods have been developed to address the persisting problems of data integration. It is important for researchers to understand different technologies and approaches. Having familiarized with the pros and cons of the existing technologies, researchers can exploit its capabilities to the maximum potential for integrating data. In this review we discuss the architecture, design and key technologies underlying some of the prominent databases (DBs and web applications. We will mention their roles in integration of biological data and investigate some of the emerging design concepts and computational technologies that are likely to have a key role in the future of systems driven biomedical research.

  4. Using XML technology for the ontology-based semantic integration of life science databases.

    Science.gov (United States)

    Philippi, Stephan; Köhler, Jacob

    2004-06-01

    Several hundred internet accessible life science databases with constantly growing contents and varying areas of specialization are publicly available via the internet. Database integration, consequently, is a fundamental prerequisite to be able to answer complex biological questions. Due to the presence of syntactic, schematic, and semantic heterogeneities, large scale database integration at present takes considerable efforts. As there is a growing apprehension of extensible markup language (XML) as a means for data exchange in the life sciences, this article focuses on the impact of XML technology on database integration in this area. In detail, a general architecture for ontology-driven data integration based on XML technology is introduced, which overcomes some of the traditional problems in this area. As a proof of concept, a prototypical implementation of this architecture based on a native XML database and an expert system shell is described for the realization of a real world integration scenario.

  5. On the level of coverage and citation of publications by mechanicians of the national academy of sciences of Ukraine in the Scopus database

    Science.gov (United States)

    Guz, A. N.; Rushchitsky, J. J.

    2009-11-01

    The paper analyzes the level of coverage and citation of publications by mechanicians of the National Academy of Sciences of Ukraine (NASU) in the Scopus database. Two groups of mechanicians are considered. One group includes 66 doctors of sciences of the S. P. Timoshenko Institute of Mechanics as representatives of the oldest institute of the NASU. The other group includes 34 members (academicians and corresponding members) of the Division of Mechanics of the NASU as representatives of the authoritative community of mechanicians in Ukraine. The results are presented for each scientist in the form of two indices—the total number of publications accessible in the database as the level of coverage of the scientist's publications in this database and the h-index as the citation level of these publications. This paper may be considered to continue the papers [6-12] published in Prikladnaya Mekhanika (International Applied Mechanics) in 2005-2009

  6. FISH REPRODUCTION: BIBLIOMETRIC ANALYSIS OF WORLDWIDE AND BRAZILIAN PUBLICATIONS IN SCOPUS DATABASE

    Directory of Open Access Journals (Sweden)

    Marcella Costa RADAEL

    2015-12-01

    Full Text Available Reproduction is a fundamental part of life being and studies related to fish reproduction have been much accessed. The aim of this study was to perform a bibliometric analysis in intend to identify trends in this kind of publication. During June 2013, were performed searches on Scopus Database, using the term “fish reproduction”, being compiled and presented information related to the number of publications per year, number of publications by country, publications by author, by journal, by institution and most used keywords. Based on the study, it was possible to obtain the following results: Brazil occupies a highlight position in number of papers, being that the Brazilian participation compared to worldwide publishing production is having an exponential increase; in Brazil, there is a high concentration of articles when concerning the top 10 authors and institutions. The present study allows verifying that the term “fish reproduction” has been focused by many scientific papers, being that in Brazil there is a special research effort related to this subject, especially in the last few years. The main contribution concerns to the use of bibliometric methods to describe the growth and concentration of researches in the area of fishfarm and reproduction.

  7. Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador

    Science.gov (United States)

    Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

    2013-01-01

    Objective To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. Materials and methods This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. Discussion A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. Conclusion The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research. PMID

  8. TiGER: a database for tissue-specific gene expression and regulation.

    Science.gov (United States)

    Liu, Xiong; Yu, Xueping; Zack, Donald J; Zhu, Heng; Qian, Jiang

    2008-06-09

    Understanding how genes are expressed and regulated in different tissues is a fundamental and challenging question. However, most of currently available biological databases do not focus on tissue-specific gene regulation. The recent development of computational methods for tissue-specific combinational gene regulation, based on transcription factor binding sites, enables us to perform a large-scale analysis of tissue-specific gene regulation in human tissues. The results are stored in a web database called TiGER (Tissue-specific Gene Expression and Regulation). The database contains three types of data including tissue-specific gene expression profiles, combinatorial gene regulations, and cis-regulatory module (CRM) detections. At present the database contains expression profiles for 19,526 UniGene genes, combinatorial regulations for 7,341 transcription factor pairs and 6,232 putative CRMs for 2,130 RefSeq genes. We have developed and made publicly available a database, TiGER, which summarizes and provides large scale data sets for tissue-specific gene expression and regulation in a variety of human tissues. This resource is available at 1.

  9. TiGER: A database for tissue-specific gene expression and regulation

    Directory of Open Access Journals (Sweden)

    Zack Donald J

    2008-06-01

    Full Text Available Abstract Background Understanding how genes are expressed and regulated in different tissues is a fundamental and challenging question. However, most of currently available biological databases do not focus on tissue-specific gene regulation. Results The recent development of computational methods for tissue-specific combinational gene regulation, based on transcription factor binding sites, enables us to perform a large-scale analysis of tissue-specific gene regulation in human tissues. The results are stored in a web database called TiGER (Tissue-specific Gene Expression and Regulation. The database contains three types of data including tissue-specific gene expression profiles, combinatorial gene regulations, and cis-regulatory module (CRM detections. At present the database contains expression profiles for 19,526 UniGene genes, combinatorial regulations for 7,341 transcription factor pairs and 6,232 putative CRMs for 2,130 RefSeq genes. Conclusion We have developed and made publicly available a database, TiGER, which summarizes and provides large scale data sets for tissue-specific gene expression and regulation in a variety of human tissues. This resource is available at 1.

  10. Subject and authorship of records related to the Organization for Tropical Studies (OTS in BINABITROP, a comprehensive database about Costa Rican biology

    Directory of Open Access Journals (Sweden)

    Julián Monge-Nájera

    2013-06-01

    Full Text Available BINABITROP is a bibliographical database of more than 38 000 records about the ecosystems and organisms of Costa Rica. In contrast with commercial databases, such as Web of Knowledge and Scopus, which exclude most of the scientific journals published in tropical countries, BINABITROP is a comprehensive record of knowledge on the tropical ecosystems and organisms of Costa Rica. We analyzed its contents in three sites (La Selva, Palo Verde and Las Cruces and recorded scientific field, taxonomic group and authorship. We found that most records dealt with ecology and systematics, and that most authors published only one article in the study period (1963-2011. Most research was published in four journals: Biotropica, Revista de Biología Tropical/ International Journal of Tropical Biology and Conservation, Zootaxa and Brenesia. This may be the first study of a such a comprehensive database for any case of tropical biology literature.BINABITROP es una base de datos bibliográfica con más de 38 000 registros sobre los ecosistemas y organismos de Costa Rica. En contraste con bases de datos comerciales como Web of Knowledge y Scopus, que excluyen a la mayoría de las revistas científicas publicadas en los países tropicales, BINABITROP registra casi por completo la literatura biológica sobre Costa Rica. Analizamos los registros de La Selva, Palo Verde y Las Cruces. Hallamos que la mayoría de los registros corresponden a estudios sobre ecología y sistemática; que la mayoría de los autores sólo registraron un artículo en el período de estudio (1963-2011 y que la mayoría de la investigación formalmente publicada apareció en cuatro revistas: Biotropica, Revista de Biología Tropical/International Journal of Tropical Biology, Zootaxa y Brenesia. Este parece ser el primer estudio de una base de datos integral sobre literatura de biología tropical.

  11. Mining biological networks from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Shepherd, Adrian J

    2014-01-01

    The study of biological networks is playing an increasingly important role in the life sciences. Many different kinds of biological system can be modelled as networks; perhaps the most important examples are protein-protein interaction (PPI) networks, metabolic pathways, gene regulatory networks, and signalling networks. Although much useful information is easily accessible in publicly databases, a lot of extra relevant data lies scattered in numerous published papers. Hence there is a pressing need for automated text-mining methods capable of extracting such information from full-text articles. Here we present practical guidelines for constructing a text-mining pipeline from existing code and software components capable of extracting PPI networks from full-text articles. This approach can be adapted to tackle other types of biological network.

  12. 75 FR 61497 - Approval Pathway for Biosimilar and Interchangeable Biological Products; Public Hearing; Request...

    Science.gov (United States)

    2010-10-05

    ... Price Competition and Innovation Act of 2009 (BPCI Act) that amends the Public Health Service Act (PHS... DEPARTMENT OF HEALTH AND HUMAN SERVICES Food and Drug Administration [Docket No. FDA-2010-N-0477] Approval Pathway for Biosimilar and Interchangeable Biological Products; Public Hearing; Request for...

  13. Modeling biology using relational databases.

    Science.gov (United States)

    Peitzsch, Robert M

    2003-02-01

    There are several different methodologies that can be used for designing a database schema; no one is the best for all occasions. This unit demonstrates two different techniques for designing relational tables and discusses when each should be used. These two techniques presented are (1) traditional Entity-Relationship (E-R) modeling and (2) a hybrid method that combines aspects of data warehousing and E-R modeling. The method of choice depends on (1) how well the information and all its inherent relationships are understood, (2) what types of questions will be asked, (3) how many different types of data will be included, and (4) how much data exists.

  14. Biocuration at the Saccharomyces genome database.

    Science.gov (United States)

    Skrzypek, Marek S; Nash, Robert S

    2015-08-01

    Saccharomyces Genome Database is an online resource dedicated to managing information about the biology and genetics of the model organism, yeast (Saccharomyces cerevisiae). This information is derived primarily from scientific publications through a process of human curation that involves manual extraction of data and their organization into a comprehensive system of knowledge. This system provides a foundation for further analysis of experimental data coming from research on yeast as well as other organisms. In this review we will demonstrate how biocuration and biocurators add a key component, the biological context, to our understanding of how genes, proteins, genomes and cells function and interact. We will explain the role biocurators play in sifting through the wealth of biological data to incorporate and connect key information. We will also discuss the many ways we assist researchers with their various research needs. We hope to convince the reader that manual curation is vital in converting the flood of data into organized and interconnected knowledge, and that biocurators play an essential role in the integration of scientific information into a coherent model of the cell. © 2015 Wiley Periodicals, Inc.

  15. IntegromeDB: an integrated system and biological search engine.

    Science.gov (United States)

    Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

    2012-01-19

    With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.

  16. Compliance with minimum information guidelines in public metabolomics repositories.

    Science.gov (United States)

    Spicer, Rachel A; Salek, Reza; Steinbeck, Christoph

    2017-09-26

    The Metabolomics Standards Initiative (MSI) guidelines were first published in 2007. These guidelines provided reporting standards for all stages of metabolomics analysis: experimental design, biological context, chemical analysis and data processing. Since 2012, a series of public metabolomics databases and repositories, which accept the deposition of metabolomic datasets, have arisen. In this study, the compliance of 399 public data sets, from four major metabolomics data repositories, to the biological context MSI reporting standards was evaluated. None of the reporting standards were complied with in every publicly available study, although adherence rates varied greatly, from 0 to 97%. The plant minimum reporting standards were the most complied with and the microbial and in vitro were the least. Our results indicate the need for reassessment and revision of the existing MSI reporting standards.

  17. An overview of the challenges in designing, integrating, and delivering BARD: a public chemical biology resource and query portal across multiple organizations, locations, and disciplines

    Science.gov (United States)

    de Souza, Andrea; Bittker, Joshua; Lahr, David; Brudz, Steve; Chatwin, Simon; Oprea, Tudor I.; Waller, Anna; Yang, Jeremy; Southall, Noel; Guha, Rajarshi; Schurer, Stephan; Vempati, Uma; Southern, Mark R.; Dawson, Eric S.; Clemons, Paul A.; Chung, Thomas D.Y.

    2015-01-01

    Recent industry-academic partnerships involve collaboration across disciplines, locations, and organizations using publicly funded “open-access” and proprietary commercial data sources. These require effective integration of chemical and biological information from diverse data sources, presenting key informatics, personnel, and organizational challenges. BARD (BioAssay Research Database) was conceived to address these challenges and to serve as a community-wide resource and intuitive web portal for public-sector chemical biology data. Its initial focus is to enable scientists to more effectively use the NIH Roadmap Molecular Libraries Program (MLP) data generated from 3-year pilot and 6-year production phases of the Molecular Libraries Probe Production Centers Network (MLPCN), currently in its final year. BARD evolves the current data standards through structured assay and result annotations that leverage the BioAssay Ontology (BAO) and other industry-standard ontologies, and a core hierarchy of assay definition terms and data standards defined specifically for small-molecule assay data. We have initially focused on migrating the highest-value MLP data into BARD and bringing it up to this new standard. We review the technical and organizational challenges overcome by the inter-disciplinary BARD team, veterans of public and private sector data-integration projects, collaborating to describe (functional specifications), design (technical specifications), and implement this next-generation software solution. PMID:24441647

  18. An Overview of the Challenges in Designing, Integrating, and Delivering BARD: A Public Chemical-Biology Resource and Query Portal for Multiple Organizations, Locations, and Disciplines.

    Science.gov (United States)

    de Souza, Andrea; Bittker, Joshua A; Lahr, David L; Brudz, Steve; Chatwin, Simon; Oprea, Tudor I; Waller, Anna; Yang, Jeremy J; Southall, Noel; Guha, Rajarshi; Schürer, Stephan C; Vempati, Uma D; Southern, Mark R; Dawson, Eric S; Clemons, Paul A; Chung, Thomas D Y

    2014-06-01

    Recent industry-academic partnerships involve collaboration among disciplines, locations, and organizations using publicly funded "open-access" and proprietary commercial data sources. These require the effective integration of chemical and biological information from diverse data sources, which presents key informatics, personnel, and organizational challenges. The BioAssay Research Database (BARD) was conceived to address these challenges and serve as a community-wide resource and intuitive web portal for public-sector chemical-biology data. Its initial focus is to enable scientists to more effectively use the National Institutes of Health Roadmap Molecular Libraries Program (MLP) data generated from the 3-year pilot and 6-year production phases of the Molecular Libraries Probe Production Centers Network (MLPCN), which is currently in its final year. BARD evolves the current data standards through structured assay and result annotations that leverage BioAssay Ontology and other industry-standard ontologies, and a core hierarchy of assay definition terms and data standards defined specifically for small-molecule assay data. We initially focused on migrating the highest-value MLP data into BARD and bringing it up to this new standard. We review the technical and organizational challenges overcome by the interdisciplinary BARD team, veterans of public- and private-sector data-integration projects, who are collaborating to describe (functional specifications), design (technical specifications), and implement this next-generation software solution. © 2014 Society for Laboratory Automation and Screening.

  19. The MAR databases: development and implementation of databases specific for marine metagenomics.

    Science.gov (United States)

    Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

    2018-01-04

    We introduce the marine databases; MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incomplete sequenced prokaryotic genomes regardless level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets the visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. 21SSD: a new public 21-cm EoR database

    Science.gov (United States)

    Eames, Evan; Semelin, Benoît

    2018-05-01

    With current efforts inching closer to detecting the 21-cm signal from the Epoch of Reionization (EoR), proper preparation will require publicly available simulated models of the various forms the signal could take. In this work we present a database of such models, available at 21ssd.obspm.fr. The models are created with a fully-coupled radiative hydrodynamic simulation (LICORICE), and are created at high resolution (10243). We also begin to analyse and explore the possible 21-cm EoR signals (with Power Spectra and Pixel Distribution Functions), and study the effects of thermal noise on our ability to recover the signal out to high redshifts. Finally, we begin to explore the concepts of `distance' between different models, which represents a crucial step towards optimising parameter space sampling, training neural networks, and finally extracting parameter values from observations.

  1. Using the TIGR gene index databases for biological discovery.

    Science.gov (United States)

    Lee, Yuandan; Quackenbush, John

    2003-11-01

    The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.

  2. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of millions of nucleic acids sequences, which are currently deposited in international databases (public or private); these databases contain information of genes, RNA, ORF, proteins, intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs; which were renewed in the use of new mathematical methods, and the introduction of the use of artificial intelligence. In addition to the constant creation of supercomputing units trained to withstand the heavy workload of sequence analysis. However, it is still necessary the innovation on platforms that allow genomic analyses, faster and more effectively, with a technological understanding of all biological processes.

  3. Quality controls in integrative approaches to detect errors and inconsistencies in biological databases

    Directory of Open Access Journals (Sweden)

    Ghisalberti Giorgio

    2010-12-01

    different biological databases and integrated in the GFINDer data warehouse. By doing so, we identified in these data a variety of different types of errors and inconsistencies; this enables us to ensure good quality of the data in the GFINDer data warehouse. We reported all identified data errors and inconsistencies to the curators of the original databases from where the data were retrieved, who mainly corrected them in subsequent updating of the original database. This contributed to improve the quality of the data available, in the original databases, to the whole scientific community.

  4. Life sciences: Nuclear medicine, radiation biology, medical physics, 1980-1994. International Atomic Energy Agency Publications

    International Nuclear Information System (INIS)

    1994-11-01

    The catalogue lists all sales publications of the IAEA dealing with Life Sciences issued during the period 1980-1994. The publications are grouped in the following chapters: Nuclear Medicine (including Radiopharmaceuticals), Radiation Biology and Medical Physics (including Dosimetry)

  5. Behavioral Economics and the Public Acceptance of Synthetic Biology.

    Science.gov (United States)

    Oliver, Adam

    2018-01-01

    Different applications of synthetic biology are alike in that their possible negative consequences are highly uncertain, potentially catastrophic, and perhaps irreversible; therefore, they are also alike in that public attitudes about them are fertile ground for behavioral economic phenomena. Findings from behavioral economics suggest that people may not respond to such applications according to the normal rules of economic evaluation, by which the value of an outcome is multiplied by the mathematical probability that the outcome will occur. Possibly, then, synthetic biology applications challenge the normative postulates of the standard approach, too. I want to first consider how some of the phenomena described by behavioral economists-and behavioral scientists more broadly-might affect people's perceptions of the uncertainties associated with synthetic biology. My analysis will be far from complete, however, because behavioral economics is essentially the study of human behavior, and thus its reach is potentially vast and its development longstanding and ongoing. Nonetheless, I hope to give an indicative perspective on how some aspects of behavioral economics might affect the assessment and perceived acceptability of synthetic biology. I will then consider the issue of agency. Should policy-makers respect people's reactions to synthetic biology when those reactions are known to be driven by behavioral economic phenomena rather than following the normative postulates of rational choice theory? Or should policy-makers dismiss these reactions as inherently biased? I will argue that the normative force of these human reactions (probably) depends on phenomenon and context. © 2018 The Hastings Center.

  6. Segmentation of anatomical structures in chest radiographs using supervised methods: a comparative study on a public database

    DEFF Research Database (Denmark)

    van Ginneken, Bram; Stegmann, Mikkel Bille; Loog, Marco

    2006-01-01

    classification method that employs a multi-scale filter bank of Gaussian derivatives and a k-nearest-neighbors classifier. The methods have been tested on a publicly available database of 247 chest radiographs, in which all objects have been manually segmented by two human observers. A parameter optimization...

  7. There Is a Significant Discrepancy Between "Big Data" Database and Original Research Publications on Hip Arthroscopy Outcomes: A Systematic Review.

    Science.gov (United States)

    Sochacki, Kyle R; Jack, Robert A; Safran, Marc R; Nho, Shane J; Harris, Joshua D

    2018-06-01

    The purpose of this study was to compare (1) major complication, (2) revision, and (3) conversion to arthroplasty rates following hip arthroscopy between database studies and original research peer-reviewed publications. A systematic review was performed using PRISMA guidelines. PubMed, SCOPUS, SportDiscus, and Cochrane Central Register of Controlled Trials were searched for studies that investigated major complication (dislocation, femoral neck fracture, avascular necrosis, fluid extravasation, septic arthritis, death), revision, and hip arthroplasty conversion rates following hip arthroscopy. Major complication, revision, and conversion to hip arthroplasty rates were compared between original research (single- or multicenter therapeutic studies) and database (insurance database using ICD-9/10 and/or current procedural terminology coding terminology) publishing studies. Two hundred seven studies (201 original research publications [15,780 subjects; 54% female] and 6 database studies [20,825 subjects; 60% female]) were analyzed (mean age, 38.2 ± 11.6 years old; mean follow-up, 2.7 ± 2.9 years). The database studies had a significantly higher age (40.6 + 2.8 vs 35.4 ± 11.6), body mass index (27.4 ± 5.6 vs 24.9 ± 3.1), percentage of females (60.1% vs 53.8%), and longer follow-up (3.1 ± 1.6 vs 2.7 ± 3.0) compared with original research (P database studies (P = .029; relative risk [RR], 1.3). There was a significantly higher rate of femoral neck fracture (0.24% vs 0.03%; P database studies. Reoperations occurred at a significantly higher rate in the database studies (11.1% vs 7.3%; P database studies (8.0% vs 3.7%; P Database studies report significantly increased major complication, revision, and conversion to hip arthroplasty rates compared with original research investigations of hip arthroscopy outcomes. Level IV, systematic review of Level I-IV studies. Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights

  8. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and are ranged in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  9. Hospice palliative care article publications: An analysis of the Web of Science database from 1993 to 2013.

    Science.gov (United States)

    Chang, Hsiao-Ting; Lin, Ming-Hwai; Chen, Chun-Ku; Hwang, Shinn-Jang; Hwang, I-Hsuan; Chen, Yu-Chun

    2016-01-01

    Academic publications are important for developing a medical specialty or discipline and improvements of quality of care. As hospice palliative care medicine is a rapidly growing medical specialty in Taiwan, this study aimed to analyze the hospice palliative care-related publications from 1993 through 2013 both worldwide and in Taiwan, by using the Web of Science database. Academic articles published with topics including "hospice", "palliative care", "end of life care", and "terminal care" were retrieved and analyzed from the Web of Science database, which includes documents published in Science Citation Index-Expanded and Social Science Citation Indexed journals from 1993 to 2013. Compound annual growth rates (CAGRs) were calculated to evaluate the trends of publications. There were a total of 27,788 documents published worldwide during the years 1993 to 2013. The top five most prolific countries/areas with published documents were the United States (11,419 documents, 41.09%), England (3620 documents, 13.03%), Canada (2428 documents, 8.74%), Germany (1598 documents, 5.75%), and Australia (1580 documents, 5.69%). Three hundred and ten documents (1.12%) were published from Taiwan, which ranks second among Asian countries (after Japan, with 594 documents, 2.14%) and 16(th) in the world. During this 21-year period, the number of hospice palliative care-related article publications increased rapidly. The worldwide CAGR for hospice palliative care publications during 1993 through 2013 was 12.9%. As for Taiwan, the CAGR for publications during 1999 through 2013 was 19.4%. The majority of these documents were submitted from universities or hospitals affiliated to universities. The number of hospice palliative care-related publications increased rapidly from 1993 to 2013 in the world and in Taiwan; however, the number of publications from Taiwan is still far below those published in several other countries. Further research is needed to identify and try to reduce the

  10. Defining new criteria for selection of cell-based intestinal models using publicly available databases

    Directory of Open Access Journals (Sweden)

    Christensen Jon

    2012-06-01

    Full Text Available Abstract Background The criteria for choosing relevant cell lines among a vast panel of available intestinal-derived lines exhibiting a wide range of functional properties are still ill-defined. The objective of this study was, therefore, to establish objective criteria for choosing relevant cell lines to assess their appropriateness as tumor models as well as for drug absorption studies. Results We made use of publicly available expression signatures and cell based functional assays to delineate differences between various intestinal colon carcinoma cell lines and normal intestinal epithelium. We have compared a panel of intestinal cell lines with patient-derived normal and tumor epithelium and classified them according to traits relating to oncogenic pathway activity, epithelial-mesenchymal transition (EMT and stemness, migratory properties, proliferative activity, transporter expression profiles and chemosensitivity. For example, SW480 represent an EMT-high, migratory phenotype and scored highest in terms of signatures associated to worse overall survival and higher risk of recurrence based on patient derived databases. On the other hand, differentiated HT29 and T84 cells showed gene expression patterns closest to tumor bulk derived cells. Regarding drug absorption, we confirmed that differentiated Caco-2 cells are the model of choice for active uptake studies in the small intestine. Regarding chemosensitivity we were unable to confirm a recently proposed association of chemo-resistance with EMT traits. However, a novel signature was identified through mining of NCI60 GI50 values that allowed to rank the panel of intestinal cell lines according to their drug responsiveness to commonly used chemotherapeutics. Conclusions This study presents a straightforward strategy to exploit publicly available gene expression data to guide the choice of cell-based models. While this approach does not overcome the major limitations of such models

  11. Development of a biomarkers database for the National Children's Study

    Energy Technology Data Exchange (ETDEWEB)

    Lobdell, Danelle T [US Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Human Studies Division, Epidemiology and Biomarkers Branch, MD 58A, Research Triangle Park, NC 27711 (United States); Mendola, Pauline [US Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Human Studies Division, Epidemiology and Biomarkers Branch, MD 58A, Research Triangle Park, NC 27711 (United States)

    2005-08-07

    The National Children's Study (NCS) is a federally-sponsored, longitudinal study of environmental influences on the health and development of children across the United States (www.nationalchildrensstudy.gov). Current plans are to study approximately 100,000 children and their families beginning before birth up to age 21 years. To explore potential biomarkers that could be important measurements in the NCS, we compiled the relevant scientific literature to identify both routine or standardized biological markers as well as new and emerging biological markers. Although the search criteria encouraged examination of factors that influence the breadth of child health and development, attention was primarily focused on exposure, susceptibility, and outcome biomarkers associated with four important child health outcomes: autism and neurobehavioral disorders, injury, cancer, and asthma. The Biomarkers Database was designed to allow users to: (1) search the biomarker records compiled by type of marker (susceptibility, exposure or effect), sampling media (e.g., blood, urine, etc.), and specific marker name; (2) search the citations file; and (3) read the abstract evaluations relative to our search criteria. A searchable, user-friendly database of over 2000 articles was created and is publicly available at: http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=85844. PubMed was the primary source of references with some additional searches of Toxline, NTIS, and other reference databases. Our initial focus was on review articles, beginning as early as 1996, supplemented with searches of the recent primary research literature from 2001 to 2003. We anticipate this database will have applicability for the NCS as well as other studies of children's environmental health.

  12. Development of a biomarkers database for the National Children's Study

    International Nuclear Information System (INIS)

    Lobdell, Danelle T.; Mendola, Pauline

    2005-01-01

    The National Children's Study (NCS) is a federally-sponsored, longitudinal study of environmental influences on the health and development of children across the United States (www.nationalchildrensstudy.gov). Current plans are to study approximately 100,000 children and their families beginning before birth up to age 21 years. To explore potential biomarkers that could be important measurements in the NCS, we compiled the relevant scientific literature to identify both routine or standardized biological markers as well as new and emerging biological markers. Although the search criteria encouraged examination of factors that influence the breadth of child health and development, attention was primarily focused on exposure, susceptibility, and outcome biomarkers associated with four important child health outcomes: autism and neurobehavioral disorders, injury, cancer, and asthma. The Biomarkers Database was designed to allow users to: (1) search the biomarker records compiled by type of marker (susceptibility, exposure or effect), sampling media (e.g., blood, urine, etc.), and specific marker name; (2) search the citations file; and (3) read the abstract evaluations relative to our search criteria. A searchable, user-friendly database of over 2000 articles was created and is publicly available at: http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=85844. PubMed was the primary source of references with some additional searches of Toxline, NTIS, and other reference databases. Our initial focus was on review articles, beginning as early as 1996, supplemented with searches of the recent primary research literature from 2001 to 2003. We anticipate this database will have applicability for the NCS as well as other studies of children's environmental health

  13. Directory of IAEA databases. 4. ed.

    International Nuclear Information System (INIS)

    1997-06-01

    This fourth edition of the Directory of IAEA Databases has been prepared within the Division of NESI. ITs main objective is to describe the computerized information sources available to the public. This directory contains all publicly available databases which are produced at the IAEA. This includes databases stored on the mainframe, LAN servers and user PCs. All IAEA Division Directors have been requested to register the existence of their databases with NESI. At the data of printing, some of the information in the directory will be already obsolete. For the most up-to-date information please see the IAEA's World Wide Web site at URL: http:/www.iaea.or.at/databases/dbdir/. Refs, figs, tabs

  14. Databases of Publications and Observations as a Part of the Crimean Astronomical Virtual Observatory

    Directory of Open Access Journals (Sweden)

    Shlyapnikov A.

    2015-12-01

    Full Text Available We describe the main principles of formation of databases (DBs with information about astronomical objects and their physical characteristics derived from observations obtained at the Crimean Astrophysical Observatory (CrAO and published in the “Izvestiya of the CrAO” and elsewhere. Emphasis is placed on the DBs missing from the most complete global library of catalogs and data tables, VizieR (supported by the Center of Astronomical Data, Strasbourg. We specially consider the problem of forming a digital archive of observational data obtained at the CrAO as an interactive DB related to database objects and publications. We present examples of all our DBs as elements integrated into the Crimean Astronomical Virtual Observatory. We illustrate the work with the CrAO DBs using tools of the International Virtual Observatory: Aladin, VOPlot, VOSpec, in conjunction with the VizieR and Simbad DBs.

  15. Aviation Safety Issues Database

    Science.gov (United States)

    Morello, Samuel A.; Ricks, Wendell R.

    2009-01-01

    The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis to making the database publically accessible and writing this report.

  16. RISE OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY IN INDIA: A LOOK THROUGH PUBLICATIONS

    Directory of Open Access Journals (Sweden)

    Anjali Srivastava

    2017-09-01

    Full Text Available Computational biology and bioinformatics have been part and parcel of biomedical research for few decades now. However, the institutionalization of bioinformatics research took place with the establishment of Distributed Information Centres (DISCs in the research institutions of repute in various disciplines by the Department of Biotechnology, Government of India. Though, at initial stages, this endeavor was mainly focused on providing infrastructure for using information technology and internet based communication and tools for carrying out computational biology and in-silico assisted research in varied arena of research starting from disease biology to agricultural crops, spices, veterinary science and many more, the natural outcome of establishment of such facilities resulted into new experiments with bioinformatics tools. Thus, Biotechnology Information Systems (BTIS grew into a solid movement and a large number of publications started coming out of these centres. In the end of last century, bioinformatics started developing like a full-fledged research subject. In the last decade, a need was felt to actually make a factual estimation of the result of this endeavor of DBT which had, by then, established about two hundred centres in almost all disciplines of biomedical research. In a bid to evaluate the efforts and outcome of these centres, BTIS Centre at CSIR-CDRI, Lucknow was entrusted with collecting and collating the publications of these centres. However, when the full data was compiled, the DBT task force felt that the study must include Non-BTIS centres also so as to expand the report to have a glimpse of bioinformatics publications from the country.

  17. Ontological interpretation of biomedical database content.

    Science.gov (United States)

    Santana da Silva, Filipe; Jansen, Ludger; Freitas, Fred; Schulz, Stefan

    2017-06-26

    Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework. By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and, (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs). IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC is interpreted regarding maximally fine-grained defined subclasses under the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements. Ambiguity of biological database content is

  18. Planform: an application and database of graph-encoded planarian regenerative experiments.

    Science.gov (United States)

    Lobo, Daniel; Malone, Taylor J; Levin, Michael

    2013-04-15

    Understanding the mechanisms governing the regeneration capabilities of many organisms is a fundamental interest in biology and medicine. An ever-increasing number of manipulation and molecular experiments are attempting to discover a comprehensive model for regeneration, with the planarian flatworm being one of the most important model species. Despite much effort, no comprehensive, constructive, mechanistic models exist yet, and it is now clear that computational tools are needed to mine this huge dataset. However, until now, there is no database of regenerative experiments, and the current genotype-phenotype ontologies and databases are based on textual descriptions, which are not understandable by computers. To overcome these difficulties, we present here Planform (Planarian formalization), a manually curated database and software tool for planarian regenerative experiments, based on a mathematical graph formalism. The database contains more than a thousand experiments from the main publications in the planarian literature. The software tool provides the user with a graphical interface to easily interact with and mine the database. The presented system is a valuable resource for the regeneration community and, more importantly, will pave the way for the application of novel artificial intelligence tools to extract knowledge from this dataset. The database and software tool are freely available at http://planform.daniel-lobo.com.

  19. Data warehousing in molecular biology.

    Science.gov (United States)

    Schönbach, C; Kowalski-Saunders, P; Brusic, V

    2000-05-01

    In the business and healthcare sectors data warehousing has provided effective solutions for information usage and knowledge discovery from databases. However, data warehousing applications in the biological research and development (R&D) sector are lagging far behind. The fuzziness and complexity of biological data represent a major challenge in data warehousing for molecular biology. By combining experiences in other domains with our findings from building a model database, we have defined the requirements for data warehousing in molecular biology.

  20. The Danish fetal medicine database

    DEFF Research Database (Denmark)

    Ekelund, Charlotte Kvist; Kopp, Tine Iskov; Tabor, Ann

    2016-01-01

    trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units’Astraia databases to the central database via...... analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database...

  1. Synthetic Biology between Self-Regulation and Public Discourse: Ethical Issues and the Many Roles of the Ethicist.

    Science.gov (United States)

    Arnason, Gardar

    2017-04-01

    This article discusses the roles of ethicists in the governance of synthetic biology. I am particularly concerned with the idea of self-regulation of bioscience and its relationship to public discourse about ethical issues in bioscience. I will look at the role of philosophical ethicists at different levels and loci, from the "embedded ethicist" in the laboratory or research project, to ethicists' impact on policy and public discourse. In a democratic society, the development of governance frameworks for emerging technologies, such as synthetic biology, needs to be guided by a well-informed public discourse. In the case of synthetic biology, the public discourse has to go further than merely considering technical issues of biosafety and biosecurity, or risk management, to consider more philosophical issues concerning the meaning and value of "life" between the natural and the synthetic. I argue that ethicists have moral expertise to bring to the public arena, which consists not only in guiding the debate but also in evaluating arguments and moral positions and making normative judgments. When ethicists make normative claims or moral judgments, they must be transparent about their theoretical positions and basic moral standpoints.

  2. A Partnership for Public Health: USDA Branded Food Products Database

    Science.gov (United States)

    The importance of comprehensive food composition databases is more critical than ever in helping to address global food security. The USDA National Nutrient Database for Standard Reference is the “gold standard” for food composition databases. The presentation will include new developments in stren...

  3. Fly-DPI: database of protein interactomes for D. melanogaster in the approach of systems biology

    Directory of Open Access Journals (Sweden)

    Lin Chieh-Hua

    2006-12-01

    Full Text Available Abstract Background Proteins control and mediate many biological activities of cells by interacting with other protein partners. This work presents a statistical model to predict protein interaction networks of Drosophila melanogaster based on insight into domain interactions. Results Three high-throughput yeast two-hybrid experiments and the collection in FlyBase were used as our starting datasets. The co-occurrences of domains in these interactive events are converted into a probability score of domain-domain interaction. These scores are used to infer putative interaction among all available open reading frames (ORFs of fruit fly. Additionally, the likelihood function is used to estimate all potential protein-protein interactions. All parameters are successfully iterated and MLE is obtained for each pair of domains. Additionally, the maximized likelihood reaches its converged criteria and maintains the probability stable. The hybrid model achieves a high specificity with a loss of sensitivity, suggesting that the model may possess major features of protein-protein interactions. Several putative interactions predicted by the proposed hybrid model are supported by literatures, while experimental data with a low probability score indicate an uncertain reliability and require further proof of interaction. Fly-DPI is the online database used to present this work. It is an integrated proteomics tool with comprehensive protein annotation information from major databases as well as an effective means of predicting protein-protein interactions. As a novel search strategy, the ping-pong search is a naïve path map between two chosen proteins based on pre-computed shortest paths. Adopting effective filtering strategies will facilitate researchers in depicting the bird's eye view of the network of interest. Fly-DPI can be accessed at http://flydpi.nhri.org.tw. Conclusion This work provides two reference systems, statistical and biological, to evaluate

  4. 78 FR 53138 - South Carolina Public Service Authority; Notice of Meeting to Discuss Santee-Cooper Biological...

    Science.gov (United States)

    2013-08-28

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 199-205] South Carolina Public Service Authority; Notice of Meeting to Discuss Santee-Cooper Biological Opinion On July 15, 2010, the National Marine Fisheries Service (NMFS) filed its Biological Opinion (BO) on the relicensing of...

  5. Managing Multiuser Database Buffers Using Data Mining Techniques

    NARCIS (Netherlands)

    Feng, L.; Lu, H.J.

    2004-01-01

    In this paper, we propose a data-mining-based approach to public buffer management for a multiuser database system, where database buffers are organized into two areas – public and private. While the private buffer areas contain pages to be updated by particular users, the public

  6. Computational biology for ageing

    Science.gov (United States)

    Wieser, Daniela; Papatheodorou, Irene; Ziehm, Matthias; Thornton, Janet M.

    2011-01-01

    High-throughput genomic and proteomic technologies have generated a wealth of publicly available data on ageing. Easy access to these data, and their computational analysis, is of great importance in order to pinpoint the causes and effects of ageing. Here, we provide a description of the existing databases and computational tools on ageing that are available for researchers. We also describe the computational approaches to data interpretation in the field of ageing including gene expression, comparative and pathway analyses, and highlight the challenges for future developments. We review recent biological insights gained from applying bioinformatics methods to analyse and interpret ageing data in different organisms, tissues and conditions. PMID:21115530

  7. Biological drugs for the treatment of psoriasis in a public health system

    Directory of Open Access Journals (Sweden)

    Luciane Cruz Lopes

    2014-08-01

    Full Text Available OBJECTIVE To analyze the access and utilization profile of biological medications for psoriasis provided by the judicial system in Brazil. METHODS This is a cross-sectional study. We interviewed a total of 203 patients with psoriasis who were on biological medications obtained by the judicial system of the State of Sao Paulo, from 2004 to 2010. Sociodemographics, medical, and political-administrative characteristics were complemented with data obtained from dispensation orders that included biological medications to treat psoriasis and the legal actions involved. The data was analyzed using an electronic data base and shown as simple variable frequencies. The prescriptions contained in the lawsuits were analyzed according to legal provisions. RESULTS A total of 190 lawsuits requesting several biological drugs (adalimumab, efalizumab, etanercept, and infliximab were analyzed. Patients obtained these medications as a result of injunctions (59.5% or without having ever demanded biological medication from any health institution (86.2%, i.e., public or private health services. They used the prerogative of free legal aid (72.6%, even though they were represented by private lawyers (91.1% and treated in private facilities (69.5%. Most of the patients used a biological medication for more than 13 months (66.0%, and some patients were undergoing treatment with this medication when interviewed (44.9%. Approximately one third of the patients discontinued treatment due to worsening of their illness (26.6%, adverse drug reactions (20.5%, lack of efficacy, or because the doctor discontinued this medication (13.8%. None of the analyzed medical prescriptions matched the legal prescribing requirements. Clinical monitoring results showed that 70.3% of the patients had not undergone laboratory examinations (blood work, liver and kidney function tests for treatment control purposes. CONCLUSIONS The plaintiffs resorted to legal action to get access to biological

  8. Biological drugs for the treatment of psoriasis in a public health system

    Science.gov (United States)

    Lopes, Luciane Cruz; Silveira, Miriam Sanches do Nascimento; de Camargo, Iara Alves; Barberato, Silvio; Del Fiol, Fernando de Sá; Osorio-de-Castro, Claudia Garcia Serpa

    2014-01-01

    OBJECTIVE To analyze the access and utilization profile of biological medications for psoriasis provided by the judicial system in Brazil. METHODS This is a cross-sectional study. We interviewed a total of 203 patients with psoriasis who were on biological medications obtained by the judicial system of the State of Sao Paulo, from 2004 to 2010. Sociodemographics, medical, and political-administrative characteristics were complemented with data obtained from dispensation orders that included biological medications to treat psoriasis and the legal actions involved. The data was analyzed using an electronic data base and shown as simple variable frequencies. The prescriptions contained in the lawsuits were analyzed according to legal provisions. RESULTS A total of 190 lawsuits requesting several biological drugs (adalimumab, efalizumab, etanercept, and infliximab) were analyzed. Patients obtained these medications as a result of injunctions (59.5%) or without having ever demanded biological medication from any health institution (86.2%), i.e., public or private health services. They used the prerogative of free legal aid (72.6%), even though they were represented by private lawyers (91.1%) and treated in private facilities (69.5%). Most of the patients used a biological medication for more than 13 months (66.0%), and some patients were undergoing treatment with this medication when interviewed (44.9%). Approximately one third of the patients discontinued treatment due to worsening of their illness (26.6%), adverse drug reactions (20.5%), lack of efficacy, or because the doctor discontinued this medication (13.8%). None of the analyzed medical prescriptions matched the legal prescribing requirements. Clinical monitoring results showed that 70.3% of the patients had not undergone laboratory examinations (blood work, liver and kidney function tests) for treatment control purposes. CONCLUSIONS The plaintiffs resorted to legal action to get access to biological medications

  9. Database citation in full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  10. The Government Finance Database: A Common Resource for Quantitative Research in Public Financial Analysis.

    Science.gov (United States)

    Pierson, Kawika; Hand, Michael L; Thompson, Fred

    2015-01-01

    Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has led its adoption to be prohibitive and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967-2012, uses easy to understand natural-language variable names, and will be extended when new data is available.

  11. Communicating Synthetic Biology: from the lab via the media to the broader public

    OpenAIRE

    Kronberger, Nicole; Holtz, Peter; Kerbe, Wolfgang; Strasser, Ewald; Wagner, Wolfgang

    2009-01-01

    We present insights from a study on communicating Synthetic Biology conducted in 2008. Scientists were invited to write press releases on their work; the resulting texts were passed on to four journalists from major Austrian newspapers and magazines. The journalists in turn wrote articles that were used as stimulus material for eight group discussions with select members of the Austrian public. The results show that, from the lab via the media to the general public, communication is character...

  12. Mycobacteriophage genome database.

    Science.gov (United States)

    Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

    2011-01-01

    Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises of 6086 genes from 64 mycobacteriophages classified into 72 families based on ACLAME database. Manual curation was aided by information available from public databases which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.

  13. Sports medicine clinical trial research publications in academic medical journals between 1996 and 2005: an audit of the PubMed MEDLINE database.

    Science.gov (United States)

    Nichols, A W

    2008-11-01

    To identify sports medicine-related clinical trial research articles in the PubMed MEDLINE database published between 1996 and 2005 and conduct a review and analysis of topics of research, experimental designs, journals of publication and the internationality of authorships. Sports medicine research is international in scope with improving study methodology and an evolution of topics. Structured review of articles identified in a search of a large electronic medical database. PubMed MEDLINE database. Sports medicine-related clinical research trials published between 1996 and 2005. Review and analysis of articles that meet inclusion criteria. Articles were examined for study topics, research methods, experimental subject characteristics, journal of publication, lead authors and journal countries of origin and language of publication. The search retrieved 414 articles, of which 379 (345 English language and 34 non-English language) met the inclusion criteria. The number of publications increased steadily during the study period. Randomised clinical trials were the most common study type and the "diagnosis, management and treatment of sports-related injuries and conditions" was the most popular study topic. The knee, ankle/foot and shoulder were the most frequent anatomical sites of study. Soccer players and runners were the favourite study subjects. The American Journal of Sports Medicine had the highest number of publications and shared the greatest international diversity of authorships with the British Journal of Sports Medicine. The USA, Australia, Germany and the UK produced a good number of the lead authorships. In all, 91% of articles and 88% of journals were published in English. Sports medicine-related research is internationally diverse, clinical trial publications are increasing and the sophistication of research design may be improving.

  14. Targeted Therapy Database (TTD: a model to match patient's molecular profile with current knowledge on cancer biology.

    Directory of Open Access Journals (Sweden)

    Simone Mocellin

    Full Text Available BACKGROUND: The efficacy of current anticancer treatments is far from satisfactory and many patients still die of their disease. A general agreement exists on the urgency of developing molecularly targeted therapies, although their implementation in the clinical setting is in its infancy. In fact, despite the wealth of preclinical studies addressing these issues, the difficulty of testing each targeted therapy hypothesis in the clinical arena represents an intrinsic obstacle. As a consequence, we are witnessing a paradoxical situation where most hypotheses about the molecular and cellular biology of cancer remain clinically untested and therefore do not translate into a therapeutic benefit for patients. OBJECTIVE: To present a computational method aimed to comprehensively exploit the scientific knowledge in order to foster the development of personalized cancer treatment by matching the patient's molecular profile with the available evidence on targeted therapy. METHODS: To this aim we focused on melanoma, an increasingly diagnosed malignancy for which the need for novel therapeutic approaches is paradigmatic since no effective treatment is available in the advanced setting. Relevant data were manually extracted from peer-reviewed full-text original articles describing any type of anti-melanoma targeted therapy tested in any type of experimental or clinical model. To this purpose, Medline, Embase, Cancerlit and the Cochrane databases were searched. RESULTS AND CONCLUSIONS: We created a manually annotated database (Targeted Therapy Database, TTD where the relevant data are gathered in a formal representation that can be computationally analyzed. Dedicated algorithms were set up for the identification of the prevalent therapeutic hypotheses based on the available evidence and for ranking treatments based on the molecular profile of individual patients. In this essay we describe the principles and computational algorithms of an original method

  15. Targeted Therapy Database (TTD): a model to match patient's molecular profile with current knowledge on cancer biology.

    Science.gov (United States)

    Mocellin, Simone; Shrager, Jeff; Scolyer, Richard; Pasquali, Sandro; Verdi, Daunia; Marincola, Francesco M; Briarava, Marta; Gobbel, Randy; Rossi, Carlo; Nitti, Donato

    2010-08-10

    The efficacy of current anticancer treatments is far from satisfactory and many patients still die of their disease. A general agreement exists on the urgency of developing molecularly targeted therapies, although their implementation in the clinical setting is in its infancy. In fact, despite the wealth of preclinical studies addressing these issues, the difficulty of testing each targeted therapy hypothesis in the clinical arena represents an intrinsic obstacle. As a consequence, we are witnessing a paradoxical situation where most hypotheses about the molecular and cellular biology of cancer remain clinically untested and therefore do not translate into a therapeutic benefit for patients. To present a computational method aimed to comprehensively exploit the scientific knowledge in order to foster the development of personalized cancer treatment by matching the patient's molecular profile with the available evidence on targeted therapy. To this aim we focused on melanoma, an increasingly diagnosed malignancy for which the need for novel therapeutic approaches is paradigmatic since no effective treatment is available in the advanced setting. Relevant data were manually extracted from peer-reviewed full-text original articles describing any type of anti-melanoma targeted therapy tested in any type of experimental or clinical model. To this purpose, Medline, Embase, Cancerlit and the Cochrane databases were searched. We created a manually annotated database (Targeted Therapy Database, TTD) where the relevant data are gathered in a formal representation that can be computationally analyzed. Dedicated algorithms were set up for the identification of the prevalent therapeutic hypotheses based on the available evidence and for ranking treatments based on the molecular profile of individual patients. In this essay we describe the principles and computational algorithms of an original method developed to fully exploit the available knowledge on cancer biology with the

  16. Trends in global acupuncture publications: An analysis of the Web of Science database from 1988 to 2015.

    Science.gov (United States)

    Kung, Yen-Ying; Hwang, Shinn-Jang; Li, Tsai-Feng; Ko, Seong-Gyu; Huang, Ching-Wen; Chen, Fang-Pey

    2017-08-01

    Acupuncture is a rapidly growing medical specialty worldwide. This study aimed to analyze the acupuncture publications from 1988 to 2015 by using the Web of Science (WoS) database. Familiarity with the trend of acupuncture publications will facilitate a better understanding of existing academic research in acupuncture and its applications. Academic articles published focusing on acupuncture were retrieved and analyzed from the WoS database which included articles published in Science Citation Index-Expanded and Social Science Citation Indexed journals from 1988 to 2015. A total of 7450 articles were published in the field of acupuncture during the period of 1988-2015. Annual article publications increased from 109 in 1988 to 670 in 2015. The People's Republic of China (published 2076 articles, 27.9%), USA (published 1638 articles, 22.0%) and South Korea (published 707 articles, 9.5%) were the most abundantly prolific countries. According to the WoS subject categories, 2591 articles (34.8%) were published in the category of Integrative and Complementary Medicine, followed by Neurosciences (1147 articles, 15.4%), and General Internal Medicine (918 articles, 12.3%). Kyung Hee University (South Korea) is the most prolific organization that is the source of acupuncture publications (365 articles, 4.9%). Fields within acupuncture with the most cited articles included mechanism, clinical trials, epidemiology, and a new research method of acupuncture. Publications associated with acupuncture increased rapidly from 1988 to 2015. The different applications of acupuncture were extensive in multiple fields of medicine. It is important to maintain and even nourish a certain quantity and quality of published acupuncture papers, which can play an important role in developing a medical discipline for acupuncture. Copyright © 2017. Published by Elsevier Taiwan LLC.

  17. Deep Time Data Infrastructure: Integrating Our Current Geologic and Biologic Databases

    Science.gov (United States)

    Kolankowski, S. M.; Fox, P. A.; Ma, X.; Prabhu, A.

    2016-12-01

    As our knowledge of Earth's geologic and mineralogical history grows, we require more efficient methods of sharing immense amounts of data. Databases across numerous disciplines have been utilized to offer extensive information on very specific Epochs of Earth's history up to its current state, i.e. Fossil record, rock composition, proteins, etc. These databases could be a powerful force in identifying previously unseen correlations such as relationships between minerals and proteins. Creating a unifying site that provides a portal to these databases will aid in our ability as a collaborative scientific community to utilize our findings more effectively. The Deep-Time Data Infrastructure (DTDI) is currently being defined as part of a larger effort to accomplish this goal. DTDI will not be a new database, but an integration of existing resources. Current geologic and related databases were identified, documentation of their schema was established and will be presented as a stage by stage progression. Through conceptual modeling focused around variables from their combined records, we will determine the best way to integrate these databases using common factors. The Deep-Time Data Infrastructure will allow geoscientists to bridge gaps in data and further our understanding of our Earth's history.

  18. Case Study III: The Construction of a Nanotoxicity Database - The MOD-ENP-TOX Experience.

    Science.gov (United States)

    Vriens, Hanne; Mertens, Dominik; Regret, Renaud; Lin, Pinpin; Locquet, Jean-Pierre; Hoet, Peter

    2017-01-01

    The amount of experimental studies on the toxicity of nanomaterials is growing fast. Interpretation and comparison of these studies is a complex issue due to the high amount of variables possibly determining the toxicity of nanomaterials.Qualitative databases providing a structured combination, integration and quality evaluation of the existing data could reveal insights that cannot be seen from different studies alone. A few database initiatives are under development but in practice very little data is publicly available and collaboration between physicists, toxicologists, computer scientists and modellers is needed to further develop databases, standards and analysis tools.In this case study the process of building a database on the in vitro toxicity of amorphous silica nanoparticles (NPs) is described in detail. Experimental data were systematically collected from peer reviewed papers, manually curated and stored in a standardised format. The result is a database in ISA-Tab-Nano including 68 peer reviewed papers on the toxicity of 148 amorphous silica NPs. Both the physicochemical characterization of the particles and their biological effect (described in 230 in vitro assays) were stored in the database. A scoring system was elaborated in order to evaluate the reliability of the stored data.

  19. Using an international p53 mutation database as a foundation for an online laboratory in an upper level undergraduate biology class.

    Science.gov (United States)

    Melloy, Patricia G

    2015-01-01

    A two-part laboratory exercise was developed to enhance classroom instruction on the significance of p53 mutations in cancer development. Students were asked to mine key information from an international database of p53 genetic changes related to cancer, the IARC TP53 database. Using this database, students designed several data mining activities to look at the changes in the p53 gene from a number of perspectives, including potential cancer-causing agents leading to particular changes and the prevalence of certain p53 variations in certain cancers. In addition, students gained a global perspective on cancer prevalence in different parts of the world. Students learned how to use the database in the first part of the exercise, and then used that knowledge to search particular cancers and cancer-causing agents of their choosing in the second part of the exercise. Students also connected the information gathered from the p53 exercise to a previous laboratory exercise looking at risk factors for cancer development. The goal of the experience was to increase student knowledge of the link between p53 genetic variation and cancer. Students also were able to walk a similar path through the website as a cancer researcher using the database to enhance bench work-based experiments with complementary large-scale database p53 variation information. © 2014 The International Union of Biochemistry and Molecular Biology.

  20. HCSD: the human cancer secretome database

    DEFF Research Database (Denmark)

    Feizi, Amir; Banaei-Esfahani, Amir; Nielsen, Jens

    2015-01-01

    The human cancer secretome database (HCSD) is a comprehensive database for human cancer secretome data. The cancer secretome describes proteins secreted by cancer cells and structuring information about the cancer secretome will enable further analysis of how this is related with tumor biology...... database is limiting the ability to query the increasing community knowledge. We therefore developed the Human Cancer Secretome Database (HCSD) to fulfil this gap. HCSD contains >80 000 measurements for about 7000 nonredundant human proteins collected from up to 35 high-throughput studies on 17 cancer...

  1. Evolution and applications of plant pathway resources and databases

    DEFF Research Database (Denmark)

    Sucaet, Yves; Deva, Taru

    2011-01-01

    Plants are important sources of food and plant products are essential for modern human life. Plants are increasingly gaining importance as drug and fuel resources, bioremediation tools and as tools for recombinant technology. Considering these applications, database infrastructure for plant model...... systems deserves much more attention. Study of plant biological pathways, the interconnection between these pathways and plant systems biology on the whole has in general lagged behind human systems biology. In this article we review plant pathway databases and the resources that are currently available...

  2. Potential translational targets revealed by linking mouse grooming behavioral phenotypes to gene expression using public databases.

    Science.gov (United States)

    Roth, Andrew; Kyzar, Evan J; Cachat, Jonathan; Stewart, Adam Michael; Green, Jeremy; Gaikwad, Siddharth; O'Leary, Timothy P; Tabakoff, Boris; Brown, Richard E; Kalueff, Allan V

    2013-01-10

    Rodent self-grooming is an important, evolutionarily conserved behavior, highly sensitive to pharmacological and genetic manipulations. Mice with aberrant grooming phenotypes are currently used to model various human disorders. Therefore, it is critical to understand the biology of grooming behavior, and to assess its translational validity to humans. The present in-silico study used publicly available gene expression and behavioral data obtained from several inbred mouse strains in the open-field, light-dark box, elevated plus- and elevated zero-maze tests. As grooming duration differed between strains, our analysis revealed several candidate genes with significant correlations between gene expression in the brain and grooming duration. The Allen Brain Atlas, STRING, GoMiner and Mouse Genome Informatics databases were used to functionally map and analyze these candidate mouse genes against their human orthologs, assessing the strain ranking of their expression and the regional distribution of expression in the mouse brain. This allowed us to identify an interconnected network of candidate genes (which have expression levels that correlate with grooming behavior), display altered patterns of expression in key brain areas related to grooming, and underlie important functions in the brain. Collectively, our results demonstrate the utility of large-scale, high-throughput data-mining and in-silico modeling for linking genomic and behavioral data, as well as their potential to identify novel neural targets for complex neurobehavioral phenotypes, including grooming. Copyright © 2012 Elsevier Inc. All rights reserved.

  3. Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

    Science.gov (United States)

    Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

    2017-06-26

    Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from the GenBank to screen and correct the inconsistencies. It also supports users in detection of new novel mutation profiles and assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com . It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identifying new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and to reconcile existing annotation of HV1 582 bp sequences.

  4. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

    Directory of Open Access Journals (Sweden)

    Bányai László

    2008-08-01

    Full Text Available Abstract Background Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii co-occurrence of extracellular and nuclear domains; (iv violation of domain integrity; (v chimeras encoded by two or more genes located on different chromosomes. Results Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis and two protostome species (Caenorhabditis elegans and Drosophila melanogaster have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON

  5. Tools and Databases of the KOMICS Web Portal for Preprocessing, Mining, and Dissemination of Metabolomics Data

    Directory of Open Access Journals (Sweden)

    Nozomu Sakurai

    2014-01-01

    Full Text Available A metabolome—the collection of comprehensive quantitative data on metabolites in an organism—has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal, where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.

  6. Tools and databases of the KOMICS web portal for preprocessing, mining, and dissemination of metabolomics data.

    Science.gov (United States)

    Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke

    2014-01-01

    A metabolome--the collection of comprehensive quantitative data on metabolites in an organism--has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.

  7. Database Changes (Post-Publication). ERIC Processing Manual, Section X.

    Science.gov (United States)

    Brandhorst, Ted, Ed.

    The purpose of this section is to specify the procedure for making changes to the ERIC database after the data involved have been announced in the abstract journals RIE or CIJE. As a matter of general ERIC policy, a document or journal article is not re-announced or re-entered into the database as a new accession for the purpose of accomplishing a…

  8. The Ensembl genome database project.

    Science.gov (United States)

    Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

    2002-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

  9. ProBiS tools (algorithm, database, and web servers) for predicting and modeling of biologically interesting proteins.

    Science.gov (United States)

    Konc, Janez; Janežič, Dušanka

    2017-09-01

    ProBiS (Protein Binding Sites) Tools consist of algorithm, database, and web servers for prediction of binding sites and protein ligands based on the detection of structurally similar binding sites in the Protein Data Bank. In this article, we review the operations that ProBiS Tools perform, provide comments on the evolution of the tools, and give some implementation details. We review some of its applications to biologically interesting proteins. ProBiS Tools are freely available at http://probis.cmm.ki.si and http://probis.nih.gov. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.

    Science.gov (United States)

    Arita, Masanori; Suwa, Kazuhiro

    2008-09-17

    In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.

  11. KID - an algorithm for fast and efficient text mining used to automatically generate a database containing kinetic information of enzymes

    Directory of Open Access Journals (Sweden)

    Schomburg Dietmar

    2010-07-01

    Full Text Available Abstract Background The amount of available biological information is rapidly increasing and the focus of biological research has moved from single components to networks and even larger projects aiming at the analysis, modelling and simulation of biological networks as well as large scale comparison of cellular properties. It is therefore essential that biological knowledge is easily accessible. However, most information is contained in the written literature in an unstructured way, so that methods for the systematic extraction of knowledge directly from the primary literature have to be deployed. Description Here we present a text mining algorithm for the extraction of kinetic information such as KM, Ki, kcat etc. as well as associated information such as enzyme names, EC numbers, ligands, organisms, localisations, pH and temperatures. Using this rule- and dictionary-based approach, it was possible to extract 514,394 kinetic parameters of 13 categories (KM, Ki, kcat, kcat/KM, Vmax, IC50, S0.5, Kd, Ka, t1/2, pI, nH, specific activity, Vmax/KM from about 17 million PubMed abstracts and combine them with other data in the abstract. A manual verification of approx. 1,000 randomly chosen results yielded a recall between 51% and 84% and a precision ranging from 55% to 96%, depending of the category searched. The results were stored in a database and are available as "KID the KInetic Database" via the internet. Conclusions The presented algorithm delivers a considerable amount of information and therefore may aid to accelerate the research and the automated analysis required for today's systems biology approaches. The database obtained by analysing PubMed abstracts may be a valuable help in the field of chemical and biological kinetics. It is completely based upon text mining and therefore complements manually curated databases. The database is available at http://kid.tu-bs.de. The source code of the algorithm is provided under the GNU General Public

  12. PathwayAccess: CellDesigner plugins for pathway databases.

    Science.gov (United States)

    Van Hemert, John L; Dickerson, Julie A

    2010-09-15

    CellDesigner provides a user-friendly interface for graphical biochemical pathway description. Many pathway databases are not directly exportable to CellDesigner models. PathwayAccess is an extensible suite of CellDesigner plugins, which connect CellDesigner directly to pathway databases using respective Java application programming interfaces. The process is streamlined for creating new PathwayAccess plugins for specific pathway databases. Three PathwayAccess plugins, MetNetAccess, BioCycAccess and ReactomeAccess, directly connect CellDesigner to the pathway databases MetNetDB, BioCyc and Reactome. PathwayAccess plugins enable CellDesigner users to expose pathway data to analytical CellDesigner functions, curate their pathway databases and visually integrate pathway data from different databases using standard Systems Biology Markup Language and Systems Biology Graphical Notation. Implemented in Java, PathwayAccess plugins run with CellDesigner version 4.0.1 and were tested on Ubuntu Linux, Windows XP and 7, and MacOSX. Source code, binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv.

  13. CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

    Directory of Open Access Journals (Sweden)

    Maskell Douglas L

    2009-05-01

    Full Text Available Abstract Background The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.

  14. USAID Public-Private Partnerships Database

    Data.gov (United States)

    US Agency for International Development — This dataset brings together information collected since 2001 on PPPs that have been supported by USAID. For the purposes of this dataset a Public-Private...

  15. ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.

    Science.gov (United States)

    Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D

    2017-08-10

    Rapid generation of omics data in recent years have resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user-interface for configuring the data import and for querying the database. Queries can also be run from the command-line and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as generic, easy to use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or

  16. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 488

    Science.gov (United States)

    1999-01-01

    This report lists reports, articles and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract.

  17. Consumer Product Category Database

    Science.gov (United States)

    The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use information is compiled from multiple sources while product information is gathered from publicly available Material Safety Data Sheets (MSDS). EPA researchers are evaluating the possibility of expanding the database with additional product and use information.

  18. Comprehensive Thematic T-matrix Reference Database: a 2013-2014 Update

    Science.gov (United States)

    Mishchenko, Michael I.; Zakharova, Nadezhda T.; Khlebtsov, Nikolai G.; Wriedt, Thomas; Videen, Gorden

    2014-01-01

    This paper is the sixth update to the comprehensive thematic database of peer-reviewedT-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2013. It also lists several earlier publications not incorporated in the original database and previous updates.

  19. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 489

    Science.gov (United States)

    1999-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract.

  20. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 498

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract.

  1. Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases

    Directory of Open Access Journals (Sweden)

    Cesareni Gianni

    2011-10-01

    Full Text Available Abstract Background The vast amount of data published in the primary biomedical literature represents a challenge for the automated extraction and codification of individual data elements. Biological databases that rely solely on manual extraction by expert curators are unable to comprehensively annotate the information dispersed across the entire biomedical literature. The development of efficient tools based on natural language processing (NLP systems is essential for the selection of relevant publications, identification of data attributes and partially automated annotation. One of the tasks of the Biocreative 2010 Challenge III was devoted to the evaluation of NLP systems developed to identify articles for curation and extraction of protein-protein interaction (PPI data. Results The Biocreative 2010 competition addressed three tasks: gene normalization, article classification and interaction method identification. The BioGRID and MINT protein interaction databases both participated in the generation of the test publication set for gene normalization, annotated the development and test sets for article classification, and curated the test set for interaction method classification. These test datasets served as a gold standard for the evaluation of data extraction algorithms. Conclusion The development of efficient tools for extraction of PPI data is a necessary step to achieve full curation of the biomedical literature. NLP systems can in the first instance facilitate expert curation by refining the list of candidate publications that contain PPI data; more ambitiously, NLP approaches may be able to directly extract relevant information from full-text articles for rapid inspection by expert curators. Close collaboration between biological databases and NLP systems developers will continue to facilitate the long-term objectives of both disciplines.

  2. A Novel Approach: Chemical Relational Databases, and the Role of the ISSCAN Database on Assessing Chemical Carcinogenity

    Science.gov (United States)

    Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did no...

  3. Dealing with the Data Deluge: Handling the Multitude Of Chemical Biology Data Sources.

    Science.gov (United States)

    Guha, Rajarshi; Nguyen, Dac-Trung; Southall, Noel; Jadhav, Ajit

    2012-09-01

    Over the last 20 years, there has been an explosion in the amount and type of biological and chemical data that has been made publicly available in a variety of online databases. While this means that vast amounts of information can be found online, there is no guarantee that it can be found easily (or at all). A scientist searching for a specific piece of information is faced with a daunting task - many databases have overlapping content, use their own identifiers and, in some cases, have arcane and unintuitive user interfaces. In this overview, a variety of well known data sources for chemical and biological information are highlighted, focusing on those most useful for chemical biology research. The issue of using multiple data sources together and the associated problems such as identifier disambiguation are highlighted. A brief discussion is then provided on Tripod, a recently developed platform that supports the integration of arbitrary data sources, providing users a simple interface to search across a federated collection of resources.

  4. Complementary Value of Databases for Discovery of Scholarly Literature: A User Survey of Online Searching for Publications in Art History

    Science.gov (United States)

    Nemeth, Erik

    2010-01-01

    Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…

  5. CardioTF, a database of deconstructing transcriptional circuits in the heart system.

    Science.gov (United States)

    Zhen, Yisong

    2016-01-01

    Information on cardiovascular gene transcription is fragmented and far behind the present requirements of the systems biology field. To create a comprehensive source of data for cardiovascular gene regulation and to facilitate a deeper understanding of genomic data, the CardioTF database was constructed. The purpose of this database is to collate information on cardiovascular transcription factors (TFs), position weight matrices (PWMs), and enhancer sequences discovered using the ChIP-seq method. The Naïve-Bayes algorithm was used to classify literature and identify all PubMed abstracts on cardiovascular development. The natural language learning tool GNAT was then used to identify corresponding gene names embedded within these abstracts. Local Perl scripts were used to integrate and dump data from public databases into the MariaDB management system (MySQL). In-house R scripts were written to analyze and visualize the results. Known cardiovascular TFs from humans and human homologs from fly, Ciona, zebrafish, frog, chicken, and mouse were identified and deposited in the database. PWMs from Jaspar, hPDI, and UniPROBE databases were deposited in the database and can be retrieved using their corresponding TF names. Gene enhancer regions from various sources of ChIP-seq data were deposited into the database and were able to be visualized by graphical output. Besides biocuration, mouse homologs of the 81 core cardiac TFs were selected using a Naïve-Bayes approach and then by intersecting four independent data sources: RNA profiling, expert annotation, PubMed abstracts and phenotype. The CardioTF database can be used as a portal to construct transcriptional network of cardiac development. Database URL: http://www.cardiosignal.org/database/cardiotf.html.

  6. E-SovTox: An online database of the main publicly-available sources of toxicity data concerning REACH-relevant chemicals published in the Russian language.

    Science.gov (United States)

    Sihtmäe, Mariliis; Blinova, Irina; Aruoja, Villem; Dubourguier, Henri-Charles; Legrand, Nicolas; Kahru, Anne

    2010-08-01

    A new open-access online database, E-SovTox, is presented. E-SovTox provides toxicological data for substances relevant to the EU Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system, from publicly-available Russian language data sources. The database contains information selected mainly from scientific journals published during the Soviet Union era. The main information source for this database - the journal, Gigiena Truda i Professional'nye Zabolevania [Industrial Hygiene and Occupational Diseases], published between 1957 and 1992 - features acute, but also chronic, toxicity data for numerous industrial chemicals, e.g. for rats, mice, guinea-pigs and rabbits. The main goal of the abovementioned toxicity studies was to derive the maximum allowable concentration limits for industrial chemicals in the occupational health settings of the former Soviet Union. Thus, articles featured in the database include mostly data on LD50 values, skin and eye irritation, skin sensitisation and cumulative properties. Currently, the E-SovTox database contains toxicity data selected from more than 500 papers covering more than 600 chemicals. The user is provided with the main toxicity information, as well as abstracts of these papers in Russian and in English (given as provided in the original publication). The search engine allows cross-searching of the database by the name or CAS number of the compound, and the author of the paper. The E-SovTox database can be used as a decision-support tool by researchers and regulators for the hazard assessment of chemical substances. 2010 FRAME.

  7. Synthetic biology in the view of European public funding organisations

    Science.gov (United States)

    Pei, Lei; Gaisser, Sibylle; Schmidt, Markus

    2012-01-01

    We analysed the decisions of major European public funding organisations to fund or not to fund synthetic biology (SB) and related ethical, legal and social implication (ELSI) studies. We investigated the reaction of public organisations in six countries (Austria, France, Germany, the Netherlands, Switzerland and the UK) towards SB that may influence SB’s further development in Europe. We examined R&D and ELSI communities and their particular funding situation. Our results show that the funding situation for SB varies considerably among the analysed countries, with the UK as the only country with an established funding scheme for R&D and ELSI that successfully integrates these research communities. Elsewhere, we determined a general lack of funding (France), difficulties in funding ELSI work (Switzerland), lack of an R&D community (Austria), too small ELSI communities (France, Switzerland, Netherlands), or difficulties in linking existing communities with available funding sources (Germany), partly due to an unclear SB definition. PMID:22586841

  8. SPring-8 Structural Biology Beamlines / Current Status of Public Beamlines for Protein Crystallography at SPring-8

    International Nuclear Information System (INIS)

    Kawamoto, Masahide; Hasegawa, Kazuya; Shimizu, Nobutaka; Sakai, Hisanobu; Shimizu, Tetsuya; Nisawa, Atsushi; Yamamoto, Masaki

    2007-01-01

    SPring-8 has 2 protein crystallography beamlines for public use, BL38B1 (Structural Biology III) and BL41XU (Structural Biology I). The BL38B1 is a bending magnet beamline for routine data collection, and the BL41XU is an undulator beamline specially customized for micro beam and ultra-high resolutional experiment. The designs and the performances of each beamline are presented

  9. Metabolonote: A wiki-based database for managing hierarchical metadata of metabolome analyses

    Directory of Open Access Journals (Sweden)

    Takeshi eAra

    2015-04-01

    Full Text Available Metabolomics—technology for comprehensive detection of small molecules in an organism—lags behind the other omics in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata, existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called TogoMD, with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data, but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitates the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.

  10. A Brief Introduction to Chinese Biological Biological

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    Chinese Biological Abstracts sponsored by the Library, the Shanghai Institutes for Biological Sciences, the Biological Documentation and Information Network, all of the Chinese Academy of Sciences, commenced publication in 1987 and was initiated to provide access to the Chinese information in the field of biology.

  11. 40 CFR 1400.13 - Read-only database.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 32 2010-07-01 2010-07-01 false Read-only database. 1400.13 Section... INFORMATION Other Provisions § 1400.13 Read-only database. The Administrator is authorized to establish... public off-site consequence analysis information by means of a central database under the control of the...

  12. MIRNA-DISTILLER: a stand-alone application to compile microRNA data from databases

    Directory of Open Access Journals (Sweden)

    Jessica K. Rieger

    2011-07-01

    Full Text Available MicroRNAs (miRNA are small non-coding RNA molecules of ~22 nucleotides which regulate large numbers of genes by binding to seed sequences at the 3’-UTR of target gene transcripts. The target mRNA is then usually degraded or translation is inhibited, although thus resulting in posttranscriptional down regulation of gene expression at the mRNA and/or protein level. Due to the bioinformatic difficulties in predicting functional miRNA binding sites, several publically available databases have been developed that predict miRNA binding sites based on different algorithms. The parallel use of different databases is currently indispensable, but highly uncomfortable and time consuming, especially when working with numerous genes of interest. We have therefore developed a new stand-alone program, termed MIRNA-DISTILLER, which allows to compile miRNA data for given target genes from public databases. Currently implemented are TargetScan, microCosm, and miRDB, which may be queried independently, pairwise, or together to calculate the respective intersections. Data are stored locally for application of further analysis tools including freely definable biological parameter filters, customized output-lists for both miRNAs and target genes, and various graphical facilities. The software, a data example file and a tutorial are freely available at http://www.ikp-stuttgart.de/content/language1/html/10415.asp

  13. MIRNA-DISTILLER: A Stand-Alone Application to Compile microRNA Data from Databases.

    Science.gov (United States)

    Rieger, Jessica K; Bodan, Denis A; Zanger, Ulrich M

    2011-01-01

    MicroRNAs (miRNA) are small non-coding RNA molecules of ∼22 nucleotides which regulate large numbers of genes by binding to seed sequences at the 3'-untranslated region of target gene transcripts. The target mRNA is then usually degraded or translation is inhibited, although thus resulting in posttranscriptional down regulation of gene expression at the mRNA and/or protein level. Due to the bioinformatic difficulties in predicting functional miRNA binding sites, several publically available databases have been developed that predict miRNA binding sites based on different algorithms. The parallel use of different databases is currently indispensable, but highly uncomfortable and time consuming, especially when working with numerous genes of interest. We have therefore developed a new stand-alone program, termed MIRNA-DISTILLER, which allows to compile miRNA data for given target genes from public databases. Currently implemented are TargetScan, microCosm, and miRDB, which may be queried independently, pairwise, or together to calculate the respective intersections. Data are stored locally for application of further analysis tools including freely definable biological parameter filters, customized output-lists for both miRNAs and target genes, and various graphical facilities. The software, a data example file and a tutorial are freely available at http://www.ikp-stuttgart.de/content/language1/html/10415.asp.

  14. The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database

    Directory of Open Access Journals (Sweden)

    Okba Selama

    2013-01-01

    Full Text Available Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly, the taxonomy and the geographical area origin of isolation of the microorganism (record. These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.

  15. CBD: a biomarker database for colorectal cancer.

    Science.gov (United States)

    Zhang, Xueli; Sun, Xiao-Feng; Cao, Yang; Ye, Benchen; Peng, Qiliang; Liu, Xingyun; Shen, Bairong; Zhang, Hong

    2018-01-01

    Colorectal cancer (CRC) biomarker database (CBD) was established based on 870 identified CRC biomarkers and their relevant information from 1115 original articles in PubMed published from 1986 to 2017. In this version of the CBD, CRC biomarker data were collected, sorted, displayed and analysed. The CBD with the credible contents as a powerful and time-saving tool provide more comprehensive and accurate information for further CRC biomarker research. The CBD was constructed under MySQL server. HTML, PHP and JavaScript languages have been used to implement the web interface. The Apache was selected as HTTP server. All of these web operations were implemented under the Windows system. The CBD could provide to users the multiple individual biomarker information and categorized into the biological category, source and application of biomarkers; the experiment methods, results, authors and publication resources; the research region, the average age of cohort, gender, race, the number of tumours, tumour location and stage. We only collect data from the articles with clear and credible results to prove the biomarkers are useful in the diagnosis, treatment or prognosis of CRC. The CBD can also provide a professional platform to researchers who are interested in CRC research to communicate, exchange their research ideas and further design high-quality research in CRC. They can submit their new findings to our database via the submission page and communicate with us in the CBD.Database URL: http://sysbio.suda.edu.cn/CBD/.

  16. Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.

    Science.gov (United States)

    Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar

    2017-01-01

    Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancer of respiratory organ. It also includes the information of medicinal plants used for the treatment of various respiratory cancers with structure of its active constituents as well as pharmacological and chemical information of drug associated with various respiratory cancers. Data in RespCanDB has been manually collected from published research article and from other databases. Data has been integrated using MySQL an object-relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data into the database. The web interface of database has been built in ASP. RespCanDB is expected to contribute to the understanding of scientific community regarding respiratory cancer biology as well as developments of new way of diagnosing and treating respiratory cancer. Currently, the database consist the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.

  17. From biological anthropology to applied public health: epidemiological approaches to the study of infectious disease.

    Science.gov (United States)

    Albalak, Rachel

    2009-01-01

    This article describes two large, multisite infectious disease programs: the Tuberculosis Epidemiologic Studies Consortium (TBESC) and the Emerging Infections Programs (EIPs). The links between biological anthropology and applied public health are highlighted using these programs as examples. Funded by the Centers for Disease Control and Prevention (CDC), the TBESC and EIPs conduct applied public health research to strengthen infectious disease prevention and control efforts in the United States. They involve collaborations among CDC, public health departments, and academic and clinical institutions. Their unique role in national infectious disease work, including their links to anthropology, shared elements, key differences, strengths and challenges, is discussed.

  18. A computational platform to maintain and migrate manual functional annotations for BioCyc databases.

    Science.gov (United States)

    Walsh, Jesse R; Sen, Taner Z; Dickerson, Julie A

    2014-10-12

    BioCyc databases are an important resource for information on biological pathways and genomic data. Such databases represent the accumulation of biological data, some of which has been manually curated from literature. An essential feature of these databases is the continuing data integration as new knowledge is discovered. As functional annotations are improved, scalable methods are needed for curators to manage annotations without detailed knowledge of the specific design of the BioCyc database. We have developed CycTools, a software tool which allows curators to maintain functional annotations in a model organism database. This tool builds on existing software to improve and simplify annotation data imports of user provided data into BioCyc databases. Additionally, CycTools automatically resolves synonyms and alternate identifiers contained within the database into the appropriate internal identifiers. Automating steps in the manual data entry process can improve curation efforts for major biological databases. The functionality of CycTools is demonstrated by transferring GO term annotations from MaizeCyc to matching proteins in CornCyc, both maize metabolic pathway databases available at MaizeGDB, and by creating strain specific databases for metabolic engineering.

  19. Visualizing information across multidimensional post-genomic structured and textual databases.

    Science.gov (United States)

    Tao, Ying; Friedman, Carol; Lussier, Yves A

    2005-04-15

    Visualizing relationships among biological information to facilitate understanding is crucial to biological research during the post-genomic era. Although different systems have been developed to view gene-phenotype relationships for specific databases, very few have been designed specifically as a general flexible tool for visualizing multidimensional genotypic and phenotypic information together. Our goal is to develop a method for visualizing multidimensional genotypic and phenotypic information and a model that unifies different biological databases in order to present the integrated knowledge using a uniform interface. We developed a novel, flexible and generalizable visualization tool, called PhenoGenesviewer (PGviewer), which in this paper was used to display gene-phenotype relationships from a human-curated database (OMIM) and from an automatic method using a Natural Language Processing tool called BioMedLEE. Data obtained from multiple databases were first integrated into a uniform structure and then organized by PGviewer. PGviewer provides a flexible query interface that allows dynamic selection and ordering of any desired dimension in the databases. Based on users' queries, results can be visualized using hierarchical expandable trees that present views specified by users according to their research interests. We believe that this method, which allows users to dynamically organize and visualize multiple dimensions, is a potentially powerful and promising tool that should substantially facilitate biological research. PhenogenesViewer as well as its support and tutorial are available at http://www.dbmi.columbia.edu/pgviewer/ Lussier@dbmi.columbia.edu.

  20. Tibetan Magmatism Database

    Science.gov (United States)

    Chapman, James B.; Kapp, Paul

    2017-11-01

    A database containing previously published geochronologic, geochemical, and isotopic data on Mesozoic to Quaternary igneous rocks in the Himalayan-Tibetan orogenic system are presented. The database is intended to serve as a repository for new and existing igneous rock data and is publicly accessible through a web-based platform that includes an interactive map and data table interface with search, filtering, and download options. To illustrate the utility of the database, the age, location, and ɛHft composition of magmatism from the central Gangdese batholith in the southern Lhasa terrane are compared. The data identify three high-flux events, which peak at 93, 50, and 15 Ma. They are characterized by inboard arc migration and a temporal and spatial shift to more evolved isotopic compositions.

  1. dbPAF: an integrative database of protein phosphorylation in animals and fungi.

    Science.gov (United States)

    Ullah, Shahid; Lin, Shaofeng; Xu, Yang; Deng, Wankun; Ma, Lili; Zhang, Ying; Liu, Zexian; Xue, Yu

    2016-03-24

    Protein phosphorylation is one of the most important post-translational modifications (PTMs) and regulates a broad spectrum of biological processes. Recent progresses in phosphoproteomic identifications have generated a flood of phosphorylation sites, while the integration of these sites is an urgent need. In this work, we developed a curated database of dbPAF, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae. From the scientific literature and public databases, we totally collected and integrated 54,148 phosphoproteins with 483,001 phosphorylation sites. Multiple options were provided for accessing the data, while original references and other annotations were also present for each phosphoprotein. Based on the new data set, we computationally detected significantly over-represented sequence motifs around phosphorylation sites, predicted potential kinases that are responsible for the modification of collected phospho-sites, and evolutionarily analyzed phosphorylation conservation states across different species. Besides to be largely consistent with previous reports, our results also proposed new features of phospho-regulation. Taken together, our database can be useful for further analyses of protein phosphorylation in human and other model organisms. The dbPAF database was implemented in PHP + MySQL and freely available at http://dbpaf.biocuckoo.org.

  2. Growth Analysis of Cancer Biology Research, 2000-2011

    Directory of Open Access Journals (Sweden)

    Keshava,

    2015-09-01

    Full Text Available Methods and Material: The PubMed database was used for retrieving data on 'cancer biology.' Articles were downloaded from the years 2000 to 2011. The articles were classified chronologically and transferred to a spreadsheet application for analysis of the data as per the objectives of the study. Statistical Method: To investigate the nature of growth of articles via exponential, linear, and logistics tests. Result: The year wise analysis of the growth of articles output shows that for the years 2000 to 2005 and later there is a sudden increase in output, during the years 2006 to 2007 and 2008 to 2011. The high productivity of articles during these years may be due to their significance in cancer biology literature, having received prominence in research. Conclusion: There is an obvious need for better compilations of statistics on numbers of publications in the years from 2000 to 2011 on various disciplines on a worldwide scale, for informed critical assessments of the amount of new knowledge contributed by these publications, and for enhancements and refinements of present Scientometric techniques (citation and publication counts, so that valid measures of knowledge growth may be obtained. Only then will Scientometrics be able to provide accurate, useful descriptions and predictions of knowledge growth.

  3. National Database for Autism Research (NDAR)

    Data.gov (United States)

    U.S. Department of Health & Human Services — National Database for Autism Research (NDAR) is an extensible, scalable informatics platform for austism spectrum disorder-relevant data at all levels of biological...

  4. Reactome graph database: Efficient access to complex pathway data

    Science.gov (United States)

    Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902

  5. Reactome graph database: Efficient access to complex pathway data.

    Directory of Open Access Journals (Sweden)

    Antonio Fabregat

    2018-01-01

    Full Text Available Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j as well as the new ContentService (REST API that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

  6. Reactome graph database: Efficient access to complex pathway data.

    Science.gov (United States)

    Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

  7. A Novel Approach: Chemical Relational Databases, and the ...

    Science.gov (United States)

    Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as

  8. Follicle Online: an integrated database of follicle assembly, development and ovulation.

    Science.gov (United States)

    Hua, Juan; Xu, Bo; Yang, Yifan; Ban, Rongjun; Iqbal, Furhan; Cooke, Howard J; Zhang, Yuanwei; Shi, Qinghua

    2015-01-01

    Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental for understanding the mechanisms associated with this biological function and to cure the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we are reporting a database 'Follicle Online' that provides the experimentally validated gene/protein map of the folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos Taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43,000 published articles (till 1 March 2014). The Follicle Online database is implemented in PHP + MySQL + JavaScript and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php © The Author(s) 2015. Published by Oxford University Press.

  9. UnoViS: the MedIT public unobtrusive vital signs database.

    Science.gov (United States)

    Wartzek, Tobias; Czaplik, Michael; Antink, Christoph Hoog; Eilebrecht, Benjamin; Walocha, Rafael; Leonhardt, Steffen

    2015-01-01

    While PhysioNet is a large database for standard clinical vital signs measurements, such a database does not exist for unobtrusively measured signals. This inhibits progress in the vital area of signal processing for unobtrusive medical monitoring as not everybody owns the specific measurement systems to acquire signals. Furthermore, if no common database exists, a comparison between different signal processing approaches is not possible. This gap will be closed by our UnoViS database. It contains different recordings in various scenarios ranging from a clinical study to measurements obtained while driving a car. Currently, 145 records with a total of 16.2 h of measurement data is available, which are provided as MATLAB files or in the PhysioNet WFDB file format. In its initial state, only (multichannel) capacitive ECG and unobtrusive PPG signals are, together with a reference ECG, included. All ECG signals contain annotations by a peak detector and by a medical expert. A dataset from a clinical study contains further clinical annotations. Additionally, supplementary functions are provided, which simplify the usage of the database and thus the development and evaluation of new algorithms. The development of urgently needed methods for very robust parameter extraction or robust signal fusion in view of frequent severe motion artifacts in unobtrusive monitoring is now possible with the database.

  10. Clearance of psoriasis: the impact of private versus public insurance.

    Science.gov (United States)

    Buzney, Catherine D; Peterman, Caitlin; Saraiya, Ami; Au, Shiu-chung; Dumont, Nicole; Mansfield, Ryan; Gottlieb, Alice B

    2015-02-01

    Psoriasis treatments and therapeutic response as they relate to private versus public patient insurance in the United States have not yet been reviewed. Improved understanding could clarify factors challenging optimal psoriasis management and offer insight for dermatologists treating psoriasis within our healthcare system. 258 subjects were included from a database of psoriasis patients seen at Tufts Medical Center (Boston, MA) during 2008-2014. Insurance was classified as primarily private or public (Medicare or MassHealth/Medicaid). Patients required a minimum of two consecutive visits per treatment and at least 8 weeks within one of four treatment categories: biologics, oral systemics/ phototherapy, combined biologics and oral systemics/phototherapy, or topicals only. Primary endpoint was the Simple-Measure for Assessing Psoriasis Activity (S-MAPA) calculated by multiplying Physician Global Assessment by Body Surface Area. S-MAPAMAPA improvement from baseline, and total drugs used per treatment course (“drug-switching”). 80.2% (n=207) and 19.8% (n=51) had primarily private and public insurance, respectively. 69.6% with private insurance were prescribed biologics versus 66.7% (public insurance) (P=0.689). 54% (private) versus 49% (public) achieved clearance (P=0.514). However, S-MAPA decreased 78.35% from baseline in those with private insurance compared to 61.48% (public) (P=0.036). On average, privately insured patients used at least twice as many same-category treatments, most commonly biologics, than publicly insured individuals (P=0.003). Drug-switching was significantly associated with clearance (P=0.024). Multivariate analysis demonstrated no significant differences in prescribed treatment categories, drug efficacy, clearance, S-MAPA, or drugswitching with respect to patient age. Treatment categories were comparably prescribed between insurance subgroups. However, privately insured patients achieved significantly greater degrees of clearance and

  11. InverPep: A database of invertebrate antimicrobial peptides.

    Science.gov (United States)

    Gómez, Esteban A; Giraldo, Paula; Orduz, Sergio

    2017-03-01

    The aim of this work was to construct InverPep, a database specialised in experimentally validated antimicrobial peptides (AMPs) from invertebrates. AMP data contained in InverPep were manually curated from other databases and the scientific literature. MySQL was integrated with the development platform Laravel; this framework allows to integrate programming in PHP with HTML and was used to design the InverPep web page's interface. InverPep contains 18 separated fields, including InverPep code, phylum and species source, peptide name, sequence, peptide length, secondary structure, molar mass, charge, isoelectric point, hydrophobicity, Boman index, aliphatic index and percentage of hydrophobic amino acids. CALCAMPI, an algorithm to calculate the physicochemical properties of multiple peptides simultaneously, was programmed in PERL language. To date, InverPep contains 702 experimentally validated AMPs from invertebrate species. All of the peptides contain information associated with their source, physicochemical properties, secondary structure, biological activity and links to external literature. Most AMPs in InverPep have a length between 10 and 50 amino acids, a positive charge, a Boman index between 0 and 2 kcal/mol, and 30-50% hydrophobic amino acids. InverPep includes 33 AMPs not reported in other databases. Besides, CALCAMPI and statistical analysis of InverPep data is presented. The InverPep database is available in English and Spanish. InverPep is a useful database to study invertebrate AMPs and its information could be used for the design of new peptides. The user-friendly interface of InverPep and its information can be freely accessed via a web-based browser at http://ciencias.medellin.unal.edu.co/gruposdeinvestigacion/prospeccionydisenobiomoleculas/InverPep/public/home_en. Copyright © 2016 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.

  12. Alkamid database: Chemistry, occurrence and functionality of plant N-alkylamides.

    Science.gov (United States)

    Boonen, Jente; Bronselaer, Antoon; Nielandt, Joachim; Veryser, Lieselotte; De Tré, Guy; De Spiegeleer, Bart

    2012-08-01

    N-Alkylamides (NAAs) are a promising group of bioactive compounds, which are anticipated to act as important lead compounds for plant protection and biocidal products, functional food, cosmeceuticals and drugs in the next decennia. These molecules, currently found in more than 25 plant families and with a wide structural diversity, exert a variety of biological-pharmacological effects and are of high ethnopharmacological importance. However, information is scattered in literature, with different, often unstandardized, pharmacological methodologies being used. Therefore, a comprehensive NAA database (acronym: Alkamid) was constructed to collect the available structural and functional NAA data, linked to their occurrence in plants (family, tribe, species, genus). For loading information in the database, literature data was gathered over the period 1950-2010, by using several search engines. In order to represent the collected information about NAAs, the plants in which they occur and the functionalities for which they have been examined, a relational database is constructed and implemented on a MySQL back-end. The database is supported by describing the NAA plant-, functional- and chemical-space. The chemical space includes a NAA classification, according to their fatty acid and amine structures. The Alkamid database (publicly available on the website http://alkamid.ugent.be/) is not only a central information point, but can also function as a useful tool to prioritize the NAA choice in the evaluation of their functionality, to perform data mining leading to quantitative structure-property relationships (QSPRs), functionality comparisons, clustering, plant biochemistry and taxonomic evaluations. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  13. Reflections on a decade of research by ASEAN dental faculties: analysis of publications from ISI-WOS databases from 2000 to 2009.

    Science.gov (United States)

    Sirisinha, Stitaya; Koontongkaew, Sittichai; Phantumvanit, Prathip; Wittayawuttikul, Ruchareka

    2011-05-01

    This communication analyzed research publications in dentistry in the Institute of Scientific Information Web of Science databases of 10 dental faculties in the Association of South-East Asian Nations (ASEAN) from 2000 to 2009. The term used for the "all-document types" search was "Faculty of Dentistry/College of Dentistry." Abstracts presented at regional meetings were also included in the analysis. The Times Higher Education System QS World University Rankings showed that universities in the region fare poorly in world university rankings. Only the National University of Singapore and Nanyang Technological University appeared in the top 100 in 2009; 19 universities in the region, including Indonesia, Malaysia, the Philippines, Singapore, and Thailand, appeared in the top 500. Data from the databases showed that research publications by dental institutes in the region fall short of their Asian counterparts. Singapore and Thailand are the most active in dental research of the ASEAN countries. © 2011 Blackwell Publishing Asia Pty Ltd.

  14. STRUCTURAL BIOLOGY AND MOLECULAR MEDICINE RESEARCH PROGRAM (LSBMM)

    International Nuclear Information System (INIS)

    Eisenberg, David S.

    2008-01-01

    The UCLA-DOE Institute of Genomics and Proteomics is an organized research unit of the University of California, sponsored by the Department of Energy through the mechanism of a Cooperative Agreement. Today the Institute consists of 10 Principal Investigators and 7 Associate Members, developing and applying technologies to promote the biological and environmental missions of the Department of Energy, and 5 Core Technology Centers to sustain this work. The focus is on understanding genomes, pathways and molecular machines in organisms of interest to DOE, with special emphasis on developing enabling technologies. Since it was founded in 1947, the UCLA-DOE Institute has adapted its mission to the research needs of DOE and its progenitor agencies as these research needs have changed. The Institute started as the AEC Laboratory of Nuclear Medicine, directed by Stafford Warren, who later became the founding Dean of the UCLA School of Medicine. In this sense, the entire UCLA medical center grew out of the precursor of our Institute. In 1963, the mission of the Institute was expanded into environmental studies by Director Ray Lunt. I became the third director in 1993, and in close consultation with David Galas and John Wooley of DOE, shifted the mission of the Institute towards genomics and proteomics. Since 1993, the Principal Investigators and Core Technology Centers are entirely new, and the Institute has separated from its former division concerned with PET imaging. The UCLA-DOE Institute shares the space of Boyer Hall with the Molecular Biology Institute, and assumes responsibility for the operation of the main core facilities. Fig. 1 gives the organizational chart of the Institute. Some of the benefits to the public of research carried out at the UCLA-DOE Institute include the following: The development of publicly accessible, web-based databases, including the Database of Protein Interactions, and the ProLinks database of genomicly inferred protein function linkages

  15. Global Volcano Locations Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — NGDC maintains a database of over 1,500 volcano locations obtained from the Smithsonian Institution Global Volcanism Program, Volcanoes of the World publication. The...

  16. INIST: databases reorientation

    International Nuclear Information System (INIS)

    Bidet, J.C.

    1995-01-01

    INIST is a CNRS (Centre National de la Recherche Scientifique) laboratory devoted to the treatment of scientific and technical informations and to the management of these informations compiled in a database. Reorientation of the database content has been proposed in 1994 to increase the transfer of research towards enterprises and services, to develop more automatized accesses to the informations, and to create a quality assurance plan. The catalog of publications comprises 5800 periodical titles (1300 for fundamental research and 4500 for applied research). A science and technology multi-thematic database will be created in 1995 for the retrieval of applied and technical informations. ''Grey literature'' (reports, thesis, proceedings..) and human and social sciences data will be added to the base by the use of informations selected in the existing GRISELI and Francis databases. Strong modifications are also planned in the thematic cover of Earth sciences and will considerably reduce the geological information content. (J.S.). 1 tab

  17. Scientometrics on Public Health Research in Iran: Increase of Area Studies despite Embargoes? A Review Article.

    Science.gov (United States)

    Poreau, Brice

    2017-03-01

    Due to embargoes and sanctions from 1979 until 2015, impact on scientific research in Iran may be critical. Public health is the main example of this burning point. In this paper, the aim was to map the scientific research in public health in Iran until 2014 with area studies as well as networks of countries involved. We used bibliographic analyses using VOS viewer software for network analysis during the period 1975-2014. Two databases were used: Web of Science and PubMed. We performed analyses of journals, authors, publication years, organizations, funding companies, countries, keywords and Web of sciences Categories. We accessed 862 articles published between 1991 and 2014, the majority of published after 2008. The main countries of research were Iran, the United States of America, England, and Sweden and represented the main network collaboration. The main Web of Sciences categories was public, occupational and environmental health, medicine general internal and parasitology. We accessed 25462 publications on PubMed database from 1950 to 2014. The majority of published after 2004. The main area studies were prognosis, wounds and injuries, soil solutions and biological markers. Public health research in Iran has been developed since 2004. The chief field was emerging cardiovascular diseases and communicable diseases. Other biotechnological fields were emerging such as biological markers research. Iran provides structures to face up with its new challenges using networks of countries such as the USA, England, and Sweden. End of embargoes could provide new perspectives for public health research and more largely scientific research in Iran.

  18. Second-Tier Database for Ecosystem Focus, 2000-2001 Annual Report.

    Energy Technology Data Exchange (ETDEWEB)

    Van Holmes, Chris; Muongchanh, Christine; Anderson, James J. (University of Washington, School of Aquatic and Fishery Sciences, Seattle, WA)

    2001-11-01

    The Second-Tier Database for Ecosystem Focus (Contract 00004124) provides direct and timely public access to Columbia Basin environmental, operational, fishery and riverine data resources for federal, state, public and private entities. The Second-Tier Database known as Data Access in Realtime (DART) does not duplicate services provided by other government entities in the region. Rather, it integrates public data for effective access, consideration and application.

  19. Pathway Distiller - multisource biological pathway consolidation.

    Science.gov (United States)

    Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong

    2012-01-01

    One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow

  20. Public Domain; Public Interest; Public Funding: Focussing on the ‘three Ps’ in Scientific Research

    Directory of Open Access Journals (Sweden)

    Mags McGinley

    2005-03-01

    Full Text Available The purpose of this paper is to discuss the ‘three Ps’ of scientific research: Public Domain; Public Interest; Public Funding. This is done by examining some of the difficulties faced by scientists engaged in scientific research who may have problems working within the constraints of current copyright and database legislation, where property claims can place obstacles in the way of research, in other words, the public domain. The article then looks at perceptions of the public interest and asks whether copyright and the database right reflect understandings of how this concept should operate. Thirdly, it considers the relevance of public funding for scientific research in the context of both the public domain and of the public interest. Finally, some recent initiatives seeking to change the contours of the legal framework are be examined.

  1. Terminology of the public relations field: corpus — automatic term recognition — terminology database

    Directory of Open Access Journals (Sweden)

    Nataša Logar Berginc

    2013-12-01

    Full Text Available The article describes an analysis of automatic term recognition results performed for single- and multi-word terms with the LUIZ term extraction system. The target application of the results is a terminology database of Public Relations and the main resource the KoRP Public Relations Corpus. Our analysis is focused on two segments: (a single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and evaluate termhood on the basis of the judgements of two domain experts, and (b multi-word term candidates with verb and noun as headword. In order to better assess the performance of the system and the soundness of our approach we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The most productive patterns of multi-word terms with noun as a headword have the following structure: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.

  2. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 494

    Science.gov (United States)

    1999-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. The NASA CASI price code table, addresses of organizations, and document availability information are included before the abstract section. Two indexes--subject and author are included after the abstract section.

  3. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 504

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-2000-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. Two indexes- subject and author are included after the abstract section.

  4. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 490

    Science.gov (United States)

    1999-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. Two indexes-subject and author are included after the abstract section.

  5. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 487

    Science.gov (United States)

    1999-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. Two indexes-subject and author are included after the abstract section.

  6. Aerospace Medicine and Biology: A Continuing Bibliography With Indexes. Supplement 502

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-2000-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. Two indexes-subject and author are included after the abstract section.

  7. Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.

    Science.gov (United States)

    Ravikumar, Komandur Elayavilli; Wagholikar, Kavishwar B; Li, Dingcheng; Kocher, Jean-Pierre; Liu, Hongfang

    2015-06-06

    Advances in the next generation sequencing technology has accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains 'locked' in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on the sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems. We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3% for reconstructing protein mutation disease associations in curated database records. Discourse level analysis component of MutD contributed to a gain of more than 10% in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5%. Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating

  8. FY 1998 survey report. Examinational research on the construction of body function database; 1998 nendo chosa hokokusho. Shintai kino database no kochiku ni kansuru chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-01

    The body function database is aimed at supplying and supporting products and environment friendly to aged people by supplying the data on body function of aged people in case of planning, designing and production when companies supply the products and environment. As a method for survey, group measuring was made for measurement of visual characteristics. For the measurement of action characteristics, the moving action including posture change was studied, the experimental plan was carried out, and items of group measurement and measuring methods were finally proposed. The database structure was made public at the end of this fiscal year, through the pre-publication/evaluation after the trial evaluation conducted using pilot database. In the study of the measurement of action characteristics, the verification test was conducted for a small-size group. By this, the measurement of action characteristics was finally proposed. In the body function database system, subjects on operation were extracted/bettered by trially evaluating pilot database, and also adjustment of right relations toward publication and preparation of management methods were made. An evaluation version was made supposing its publication. (NEDO)

  9. Improving decoy databases for protein folding algorithms

    KAUST Repository

    Lindsey, Aaron

    2014-01-01

    Copyright © 2014 ACM. Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 17 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on a popular modern scoring function and show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.

  10. MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource for plant genomics

    Science.gov (United States)

    Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F. X.

    2004-01-01

    Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db. PMID:14681437

  11. DMPD: Lysophospholipid receptors: signaling and biology. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 15189145 Lysophospholipid receptors: signaling and biology. Ishii I, Fukushima N, Y...e X, Chun J. Annu Rev Biochem. 2004;73:321-54. (.png) (.svg) (.html) (.csml) Show Lysophospholipid receptors...: signaling and biology. PubmedID 15189145 Title Lysophospholipid receptors: signaling and biology. Authors

  12. Data Linkage Graph: computation, querying and knowledge discovery of life science database networks

    Directory of Open Access Journals (Sweden)

    Lange Matthias

    2007-12-01

    Full Text Available To support the interpretation of measured molecular facts, like gene expression experiments or EST sequencing, the functional or the system biological context has to be considered. Doing so, the relationship to existing biological knowledge has to be discovered. In general, biological knowledge is worldwide represented in a network of databases. In this paper we present a method for knowledge extraction in life science databases, which prevents the scientists from screen scraping and web clicking approaches.

  13. HSC-explorer: a curated database for hematopoietic stem cells.

    Science.gov (United States)

    Montrone, Corinna; Kokkaliaris, Konstantinos D; Loeffler, Dirk; Lechner, Martin; Kastenmüller, Gabi; Schroeder, Timm; Ruepp, Andreas

    2013-01-01

    HSC-Explorer (http://mips.helmholtz-muenchen.de/HSC/) is a publicly available, integrative database containing detailed information about the early steps of hematopoiesis. The resource aims at providing fast and easy access to relevant information, in particular to the complex network of interacting cell types and molecules, from the wealth of publications in the field through visualization interfaces. It provides structured information on more than 7000 experimentally validated interactions between molecules, bioprocesses and environmental factors. Information is manually derived by critical reading of the scientific literature from expert annotators. Hematopoiesis-relevant interactions are accompanied with context information such as model organisms and experimental methods for enabling assessment of reliability and relevance of experimental results. Usage of established vocabularies facilitates downstream bioinformatics applications and to convert the results into complex networks. Several predefined datasets (Selected topics) offer insights into stem cell behavior, the stem cell niche and signaling processes supporting hematopoietic stem cell maintenance. HSC-Explorer provides a versatile web-based resource for scientists entering the field of hematopoiesis enabling users to inspect the associated biological processes through interactive graphical presentation.

  14. HSC-explorer: a curated database for hematopoietic stem cells.

    Directory of Open Access Journals (Sweden)

    Corinna Montrone

    Full Text Available HSC-Explorer (http://mips.helmholtz-muenchen.de/HSC/ is a publicly available, integrative database containing detailed information about the early steps of hematopoiesis. The resource aims at providing fast and easy access to relevant information, in particular to the complex network of interacting cell types and molecules, from the wealth of publications in the field through visualization interfaces. It provides structured information on more than 7000 experimentally validated interactions between molecules, bioprocesses and environmental factors. Information is manually derived by critical reading of the scientific literature from expert annotators. Hematopoiesis-relevant interactions are accompanied with context information such as model organisms and experimental methods for enabling assessment of reliability and relevance of experimental results. Usage of established vocabularies facilitates downstream bioinformatics applications and to convert the results into complex networks. Several predefined datasets (Selected topics offer insights into stem cell behavior, the stem cell niche and signaling processes supporting hematopoietic stem cell maintenance. HSC-Explorer provides a versatile web-based resource for scientists entering the field of hematopoiesis enabling users to inspect the associated biological processes through interactive graphical presentation.

  15. International scientific seminar «Chronicle of Nature – a common database for scientific analysis and joint planning of scientific publications»

    Directory of Open Access Journals (Sweden)

    Juri P. Kurhinen

    2016-05-01

    Full Text Available Provides information about the results of the international scienti fic seminar «Сhronicle of Nature – a common database for scientific analysis and joint planning of scientific publications», held at Findland-Russian project «Linking environmental change to biodiversity change: large scale analysis оf Eurasia ecosystem».

  16. Pancreatic Expression database: a generic model for the organization, integration and mining of complex cancer datasets

    Directory of Open Access Journals (Sweden)

    Lemoine Nicholas R

    2007-11-01

    Full Text Available Abstract Background Pancreatic cancer is the 5th leading cause of cancer death in both males and females. In recent years, a wealth of gene and protein expression studies have been published broadening our understanding of pancreatic cancer biology. Due to the explosive growth in publicly available data from multiple different sources it is becoming increasingly difficult for individual researchers to integrate these into their current research programmes. The Pancreatic Expression database, a generic web-based system, is aiming to close this gap by providing the research community with an open access tool, not only to mine currently available pancreatic cancer data sets but also to include their own data in the database. Description Currently, the database holds 32 datasets comprising 7636 gene expression measurements extracted from 20 different published gene or protein expression studies from various pancreatic cancer types, pancreatic precursor lesions (PanINs and chronic pancreatitis. The pancreatic data are stored in a data management system based on the BioMart technology alongside the human genome gene and protein annotations, sequence, homologue, SNP and antibody data. Interrogation of the database can be achieved through both a web-based query interface and through web services using combined criteria from pancreatic (disease stages, regulation, differential expression, expression, platform technology, publication and/or public data (antibodies, genomic region, gene-related accessions, ontology, expression patterns, multi-species comparisons, protein data, SNPs. Thus, our database enables connections between otherwise disparate data sources and allows relatively simple navigation between all data types and annotations. Conclusion The database structure and content provides a powerful and high-speed data-mining tool for cancer research. It can be used for target discovery i.e. of biomarkers from body fluids, identification and analysis

  17. Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; Pi, Xingjun; McEntyre, Johanna R

    2015-01-01

    In this study, we present an analysis of data citation practices in full text research articles and their corresponding supplementary data files, made available in the Open Access set of articles from Europe PubMed Central. Our aim is to investigate whether supplementary data files should be considered as a source of information for integrating the literature with biomolecular databases. Using text-mining methods to identify and extract a variety of core biological database accession numbers, we found that the supplemental data files contain many more database citations than the body of the article, and that those citations often take the form of a relatively small number of articles citing large collections of accession numbers in text-based files. Moreover, citation of value-added databases derived from submission databases (such as Pfam, UniProt or Ensembl) is common, demonstrating the reuse of these resources as datasets in themselves. All the database accession numbers extracted from the supplementary data are publicly accessible from http://dx.doi.org/10.5281/zenodo.11771. Our study suggests that supplementary data should be considered when linking articles with data, in curation pipelines, and in information retrieval tasks in order to make full use of the entire research article. These observations highlight the need to improve the management of supplemental data in general, in order to make this information more discoverable and useful.

  18. Enhanced Publications Linking Publications and Research Data in Digital Repositories

    CERN Document Server

    Vernooy-Gerritsen, Marjan

    2009-01-01

    The traditional publication will be overhauled by the 'Enhanced Publication'. This is a publication that is enhanced with research data, extra materials, post publication data, and database records. It has an object-based structure with explicit l

  19. Hydroponics Database and Handbook for the Advanced Life Support Test Bed

    Science.gov (United States)

    Nash, Allen J.

    1999-01-01

    During the summer 1998, I did student assistance to Dr. Daniel J. Barta, chief plant growth expert at Johnson Space Center - NASA. We established the preliminary stages of a hydroponic crop growth database for the Advanced Life Support Systems Integration Test Bed, otherwise referred to as BIO-Plex (Biological Planetary Life Support Systems Test Complex). The database summarizes information from published technical papers by plant growth experts, and it includes bibliographical, environmental and harvest information based on plant growth under varying environmental conditions. I collected 84 lettuce entries, 14 soybean, 49 sweet potato, 16 wheat, 237 white potato, and 26 mix crop entries. The list will grow with the publication of new research. This database will be integrated with a search and systems analysis computer program that will cross-reference multiple parameters to determine optimum edible yield under varying parameters. Also, we have made preliminary effort to put together a crop handbook for BIO-Plex plant growth management. It will be a collection of information obtained from experts who provided recommendations on a particular crop's growing conditions. It includes bibliographic, environmental, nutrient solution, potential yield, harvest nutritional, and propagation procedure information. This handbook will stand as the baseline growth conditions for the first set of experiments in the BIO-Plex facility.

  20. EKPD: a hierarchical database of eukaryotic protein kinases and protein phosphatases.

    Science.gov (United States)

    Wang, Yongbo; Liu, Zexian; Cheng, Han; Gao, Tianshun; Pan, Zhicheng; Yang, Qing; Guo, Anyuan; Xue, Yu

    2014-01-01

    We present here EKPD (http://ekpd.biocuckoo.org), a hierarchical database of eukaryotic protein kinases (PKs) and protein phosphatases (PPs), the key molecules responsible for the reversible phosphorylation of proteins that are involved in almost all aspects of biological processes. As extensive experimental and computational efforts have been carried out to identify PKs and PPs, an integrative resource with detailed classification and annotation information would be of great value for both experimentalists and computational biologists. In this work, we first collected 1855 PKs and 347 PPs from the scientific literature and various public databases. Based on previously established rationales, we classified all of the known PKs and PPs into a hierarchical structure with three levels, i.e. group, family and individual PK/PP. There are 10 groups with 149 families for the PKs and 10 groups with 33 families for the PPs. We constructed 139 and 27 Hidden Markov Model profiles for PK and PP families, respectively. Then we systematically characterized ∼50,000 PKs and >10,000 PPs in eukaryotes. In addition, >500 PKs and >400 PPs were computationally identified by ortholog search. Finally, the online service of the EKPD database was implemented in PHP + MySQL + JavaScript.

  1. Protein Structure Initiative Material Repository: an open shared public resource of structural genomics plasmids for the biological community

    Science.gov (United States)

    Cormier, Catherine Y.; Mohr, Stephanie E.; Zuo, Dongmei; Hu, Yanhui; Rolfs, Andreas; Kramer, Jason; Taycher, Elena; Kelley, Fontina; Fiacco, Michael; Turnbull, Greggory; LaBaer, Joshua

    2010-01-01

    The Protein Structure Initiative Material Repository (PSI-MR; http://psimr.asu.edu) provides centralized storage and distribution for the protein expression plasmids created by PSI researchers. These plasmids are a resource that allows the research community to dissect the biological function of proteins whose structures have been identified by the PSI. The plasmid annotation, which includes the full length sequence, vector information and associated publications, is stored in a freely available, searchable database called DNASU (http://dnasu.asu.edu). Each PSI plasmid is also linked to a variety of additional resources, which facilitates cross-referencing of a particular plasmid to protein annotations and experimental data. Plasmid samples can be requested directly through the website. We have also developed a novel strategy to avoid the most common concern encountered when distributing plasmids namely, the complexity of material transfer agreement (MTA) processing and the resulting delays this causes. The Expedited Process MTA, in which we created a network of institutions that agree to the terms of transfer in advance of a material request, eliminates these delays. Our hope is that by creating a repository of expression-ready plasmids and expediting the process for receiving these plasmids, we will help accelerate the accessibility and pace of scientific discovery. PMID:19906724

  2. Comprehensive T-matrix Reference Database: A 2009-2011 Update

    Science.gov (United States)

    Zakharova, Nadezhda T.; Videen, G.; Khlebtsov, Nikolai G.

    2012-01-01

    The T-matrix method is one of the most versatile and efficient theoretical techniques widely used for the computation of electromagnetic scattering by single and composite particles, discrete random media, and particles in the vicinity of an interface separating two half-spaces with different refractive indices. This paper presents an update to the comprehensive database of peer-reviewed T-matrix publications compiled by us previously and includes the publications that appeared since 2009. It also lists several earlier publications not included in the original database.

  3. Comprehensive T-Matrix Reference Database: A 2007-2009 Update

    Science.gov (United States)

    Mishchenko, Michael I.; Zakharova, Nadia T.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

    2010-01-01

    The T-matrix method is among the most versatile, efficient, and widely used theoretical techniques for the numerically exact computation of electromagnetic scattering by homogeneous and composite particles, clusters of particles, discrete random media, and particles in the vicinity of an interface separating two half-spaces with different refractive indices. This paper presents an update to the comprehensive database of T-matrix publications compiled by us previously and includes the publications that appeared since 2007. It also lists several earlier publications not included in the original database.

  4. The PMDB Protein Model Database

    Science.gov (United States)

    Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

    2006-01-01

    The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873

  5. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans

    International Nuclear Information System (INIS)

    2011-01-01

    Purpose: The development of computer-aided diagnostic (CAD) methods for lung nodule detection, classification, and quantitative assessment can be facilitated through a well-characterized repository of computed tomography (CT) scans. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) completed such a database, establishing a publicly available reference for the medical imaging research community. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process. Methods: Seven academic centers and eight medical imaging companies collaborated to identify, address, and resolve challenging organizational, technical, and clinical issues to provide a solid foundation for a robust database. The LIDC/IDRI Database contains 1018 cases, each of which includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories (''nodule≥3 mm,''''nodule<3 mm,'' and ''non-nodule≥3 mm''). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Results: The Database contains 7371 lesions marked ''nodule'' by at least one radiologist. 2669 of these lesions were marked ''nodule≥3 mm'' by at least one radiologist, of which 928 (34.7%) received such marks from all

  6. DNA algorithms of implementing biomolecular databases on a biological computer.

    Science.gov (United States)

    Chang, Weng-Long; Vasilakos, Athanasios V

    2015-01-01

    In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases.

  7. Trend of R and D publications in pressurised heavy water reactors: A study using INIS and other databases

    International Nuclear Information System (INIS)

    Kumar, V.; Kalyane, V.L.; Prakasan, E.R.; Kumar, A.; Sagar, A.; Mohan, L.

    2004-01-01

    Digital databases INIS (1970-2002), INSPEC (1969-2002), Chemical Abstracts (1977-2002), ISMEC (1973-June 2002), Web of Sciences (1974-2002), and Science Citation Index (1982-2002), were used for comprehensive retrieval of bibliographic details of research publications on Pressurized Heavy Water Reactor (PHWR) research. Among the countries contributing to PHWR research, India (having 1737 papers) is the forerunner followed by Canada (1492), Romania (508) and Argentina (334). Collaboration of Canadian researchers with researchers of other countries resulted in 75 publications. Among the most productive researchers in this field, the first 15 are from India. Top three contributors to PHWR publications with their respective authorship credits are: H.S. Kushwaha (106), Anil Kakodkar (100) and V. Venkat Raj (76). Prominent interdomainary interactions in PHWR subfields are: Specific nuclear reactors and associated plants with General studies of nuclear reactors (481), followed by Environmental sciences (185), and Materials science (154). Number of publications dealing with Geosciences aspect of environmental sciences are 141. Romania, Argentina, India and Republic of Korea have used mostly (≥75%) non-conventional media for publications. Out of the 4851 publications, 1228 have been published in 292 distinct journals. Top most journals publishing PHWR papers are: Radiation Protection and Environment (continued from: Bulletin of Radiation Protection since 1997), India (115); Nuclear Engineering International, UK (84); and Transactions of the American Nuclear Society, USA (68). (author)

  8. Occupational Health and Safety in Aquaculture: Insights on Brazilian Public Policies.

    Science.gov (United States)

    de Oliveira, Pedro Keller; Cavalli, Richard Souto; Kunert Filho, Hiran Castagnino; Carvalho, Daiane; Benedetti, Nadine; Rotta, Marco Aurélio; Peixoto Ramos, Augusto Sávio; de Brito, Kelly Cristina Tagliari; de Brito, Benito Guimarães; da Rocha, Andréa Ferretto; Stech, Marcia Regina; Cavalli, Lissandra Souto

    2017-01-01

    Aquaculture has many occupational hazards, including those that are physical, chemical, biological, ergonomic, and mechanical. The risks in aquaculture are inherent, as this activity requires particular practices. The objective of the present study was to show the risks associated with the aquaculture sector and present a critical overview on the Brazilian public policies concerning aquaculture occupational health. Methods include online research involved web searches and electronic databases including Pubmed, Google Scholar, Scielo and government databases. We conducted a careful revision of Brazilian labor laws related to occupational health and safety, rural workers, and aquaculture. The results and conclusion support the idea that aquaculture requires specific and well-established industry programs and policies, especially in developing countries. Aquaculture still lacks scientific research, strategies, laws, and public policies to boost the sector with regard to occupational health and safety. The establishment of a safe workplace in aquaculture in developing countries remains a challenge for all involved in employer-employee relationships.

  9. Advancements in web-database applications for rabies surveillance

    Directory of Open Access Journals (Sweden)

    Bélanger Denise

    2011-08-01

    Full Text Available Abstract Background Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. Results RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB to rabies surveillance databases include 1 automatic integration of multi-agency data and diagnostic results on a daily basis; 2 a web-based data editing interface that enables authorized users to add, edit and extract data; and 3 an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies suspect animals. We also discuss how sightings data can indicate public perception to the risk of racoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. Conclusions RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from

  10. The Danish Urogynaecological Database

    DEFF Research Database (Denmark)

    Guldberg, Rikke; Brostrøm, Søren; Hansen, Jesper Kjær

    2013-01-01

    in the DugaBase from 1 January 2009 to 31 October 2010, using medical records as a reference. RESULTS: A total of 16,509 urogynaecological procedures were registered in the DugaBase by 31 December 2010. The database completeness has increased by calendar time, from 38.2 % in 2007 to 93.2 % in 2010 for public......INTRODUCTION AND HYPOTHESIS: The Danish Urogynaecological Database (DugaBase) is a nationwide clinical database established in 2006 to monitor, ensure and improve the quality of urogynaecological surgery. We aimed to describe its establishment and completeness and to validate selected variables....... This is the first study based on data from the DugaBase. METHODS: The database completeness was calculated as a comparison between urogynaecological procedures reported to the Danish National Patient Registry and to the DugaBase. Validity was assessed for selected variables from a random sample of 200 women...

  11. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  12. dBBQs: dataBase of Bacterial Quality scores

    OpenAIRE

    Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

    2017-01-01

    Background: It is well-known that genome sequencing technologies are becoming significantly cheaper and faster. As a result of this, the exponential growth in sequencing data in public databases allows us to explore ever growing large collections of genome sequences. However, it is less known that the majority of available sequenced genome sequences in public databases are not complete, drafts of varying qualities. We have calculated quality scores for around 100,000 bacterial genomes from al...

  13. Carbon dioxide (CO 2 ) utilizing strain database | Saini | African ...

    African Journals Online (AJOL)

    Culling of excess carbon dioxide from our environment is one of the major challenges to scientific communities. Many physical, chemical and biological methods have been practiced to overcome this problem. The biological means of CO2 fixation using various microorganisms is gaining importance because database of ...

  14. ARTI refrigerant database

    Energy Technology Data Exchange (ETDEWEB)

    Calm, J.M.

    1998-03-15

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to thermophysical properties, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air conditioning and refrigeration equipment. It also references documents addressing compatibility of refrigerants and lubricants with other materials.

  15. IntPath--an integrated pathway gene relationship database for model organisms and important pathogens.

    Science.gov (United States)

    Zhou, Hufeng; Jin, Jingjing; Zhang, Haojun; Yi, Bo; Wozniak, Michal; Wong, Limsoon

    2012-01-01

    Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomprehensive data from different databases. In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. Sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation are involved to get rid of errors and noises from source data (e.g., the gene ID errors in WikiPathways and

  16. Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases.

    Science.gov (United States)

    Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

    2011-07-01

    Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.

  17. DAVID Knowledgebase: a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis

    Directory of Open Access Journals (Sweden)

    Baseler Michael W

    2007-11-01

    Full Text Available Abstract Background Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases in order to integrate comprehensive annotation information for their genes, which becomes one of the bottlenecks, particularly for the analytic task associated with a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high throughput gene functional analysis. Description The DAVID Knowledgebase is built around the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of gene/protein identifiers from a variety of public genomic resources into DAVID gene clusters. The grouping of such identifiers improves the cross-reference capability, particularly across NCBI and UniProt systems, enabling more than 40 publicly available functional annotation sources to be comprehensively integrated and centralized by the DAVID gene clusters. The simple, pair-wise, text format files which make up the DAVID Knowledgebase are freely downloadable for various data analysis uses. In addition, a well organized web interface allows users to query different types of heterogeneous annotations in a high-throughput manner. Conclusion The DAVID Knowledgebase is designed to facilitate high throughput gene functional analysis. For a given gene list, it not only provides the quick accessibility to a wide range of heterogeneous annotation data in a centralized location, but also enriches the level of biological information for an individual gene. Moreover, the entire DAVID Knowledgebase is freely downloadable or searchable at http://david.abcc.ncifcrf.gov/knowledgebase/.

  18. Correlates of Access to Business Research Databases

    Science.gov (United States)

    Gottfried, John C.

    2010-01-01

    This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.…

  19. 1.15 - Structural Chemogenomics Databases to Navigate Protein–Ligand Interaction Space

    NARCIS (Netherlands)

    Kanev, G.K.; Kooistra, A.J.; de Esch, I.J.P.; de Graaf, C.

    2017-01-01

    Structural chemogenomics databases allow the integration and exploration of heterogeneous genomic, structural, chemical, and pharmacological data in order to extract useful information that is applicable for the discovery of new protein targets and biologically active molecules. Integrated databases

  20. Analysis of isotropic turbulence using a public database and the Web service model, and applications to study subgrid models

    Science.gov (United States)

    Meneveau, Charles; Yang, Yunke; Perlman, Eric; Wan, Minpin; Burns, Randal; Szalay, Alex; Chen, Shiyi; Eyink, Gregory

    2008-11-01

    A public database system archiving a direct numerical simulation (DNS) data set of isotropic, forced turbulence is used for studying basic turbulence dynamics. The data set consists of the DNS output on 1024-cubed spatial points and 1024 time-samples spanning about one large-scale turn-over timescale. This complete space-time history of turbulence is accessible to users remotely through an interface that is based on the Web-services model (see http://turbulence.pha.jhu.edu). Users may write and execute analysis programs on their host computers, while the programs make subroutine-like calls that request desired parts of the data over the network. The architecture of the database is briefly explained, as are some of the new functions such as Lagrangian particle tracking and spatial box-filtering. These tools are used to evaluate and compare subgrid stresses and models.

  1. The Sequenced Angiosperm Genomes and Genome Databases.

    Science.gov (United States)

    Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

    2018-01-01

    Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of human, animals, and the planet earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in the database reconstruction in the last few years, here we provide a comprehensive review for three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of databases and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular for specific group of researchers, while a timely-updated comprehensive database is more powerful for address of major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote the collaborative studies of important questions in plant biology.

  2. The Danish Inguinal Hernia database

    DEFF Research Database (Denmark)

    Friis-Andersen, Hans; Bisgaard, Thue

    2016-01-01

    AIM OF DATABASE: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. STUDY POPULATION: Patients ≥18 years operated for groin hernia. MAIN VARIABLES: Type and size...... access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles...... the medical management of the database. RESULTS: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). CONCLUSION: The Danish Inguinal Hernia...

  3. HEROD: a human ethnic and regional specific omics database.

    Science.gov (United States)

    Zeng, Xian; Tao, Lin; Zhang, Peng; Qin, Chu; Chen, Shangying; He, Weidong; Tan, Ying; Xia Liu, Hong; Yang, Sheng Yong; Chen, Zhe; Jiang, Yu Yang; Chen, Yu Zong

    2017-10-15

    Genetic and gene expression variations within and between populations and across geographical regions have substantial effects on the biological phenotypes, diseases, and therapeutic response. The development of precision medicines can be facilitated by the OMICS studies of the patients of specific ethnicity and geographic region. However, there is an inadequate facility for broadly and conveniently accessing the ethnic and regional specific OMICS data. Here, we introduced a new free database, HEROD, a human ethnic and regional specific OMICS database. Its first version contains the gene expression data of 53 070 patients of 169 diseases in seven ethnic populations from 193 cities/regions in 49 nations curated from the Gene Expression Omnibus (GEO), the ArrayExpress Archive of Functional Genomics Data (ArrayExpress), the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). Geographic region information of curated patients was mainly manually extracted from referenced publications of each original study. These data can be accessed and downloaded via keyword search, World map search, and menu-bar search of disease name, the international classification of disease code, geographical region, location of sample collection, ethnic population, gender, age, sample source organ, patient type (patient or healthy), sample type (disease or normal tissue) and assay type on the web interface. The HEROD database is freely accessible at http://bidd2.nus.edu.sg/herod/index.php. The database and web interface are implemented in MySQL, PHP and HTML with all major browsers supported. phacyz@nus.edu.sg. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  4. Bibliometric analysis of publications on wine tourism in the databases Scopus and WoS

    Directory of Open Access Journals (Sweden)

    Amador Durán Sánchez

    2017-01-01

    Full Text Available The aim of this study was to show the current state of scientific research regarding wine tourism, by comparing the platforms of scientific information WoS and Scopus and applying quantitative methods. For this purpose, a bibliometric study of the publications indexed in WoS and Scopus was conducted, analyzing the correlation between increases, coverage, overlap, dispersion and concentration of documents. During the search process, a set of 238 articles and 122 different journals were obtained. Based on the results of the comparative study, we conclude that WoS and Scopus databases differ in scope, data volume and coverage policies with a high degree of unique sources and articles, resulting both of them complementary and not mutually exclusive. Scopus covers the area of wine tourism better, by including a greater number of journals, papers and signatures.

  5. The PEP-II project-wide database

    International Nuclear Information System (INIS)

    Chan, A.; Calish, S.; Crane, G.; MacGregor, I.; Meyer, S.; Wong, J.

    1995-05-01

    The PEP-II Project Database is a tool for monitoring the technical and documentation aspects of this accelerator construction. It holds the PEP-II design specifications, fabrication and installation data in one integrated system. Key pieces of the database include the machine parameter list, magnet and vacuum fabrication data. CAD drawings, publications and documentation, survey and alignment data and property control. The database can be extended to contain information required for the operations phase of the accelerator and detector. Features such as viewing CAD drawing graphics from the database will be implemented in the future. This central Oracle database on a UNIX server is built using ORACLE Case tools. Users at the three collaborating laboratories (SLAC, LBL, LLNL) can access the data remotely, using various desktop computer platforms and graphical interfaces

  6. StraPep: a structure database of bioactive peptides

    Science.gov (United States)

    Wang, Jian; Yin, Tailang; Xiao, Xuwen; He, Dan; Xue, Zhidong; Jiang, Xinnong; Wang, Yan

    2018-01-01

    Abstract Bioactive peptides, with a variety of biological activities and wide distribution in nature, have attracted great research interest in biological and medical fields, especially in pharmaceutical industry. The structural information of bioactive peptide is important for the development of peptide-based drugs. Many databases have been developed cataloguing bioactive peptides. However, to our knowledge, database dedicated to collect all the bioactive peptides with known structure is not available yet. Thus, we developed StraPep, a structure database of bioactive peptides. StraPep holds 3791 bioactive peptide structures, which belong to 1312 unique bioactive peptide sequences. About 905 out of 1312 (68%) bioactive peptides in StraPep contain disulfide bonds, which is significantly higher than that (21%) of PDB. Interestingly, 150 out of 616 (24%) bioactive peptides with three or more disulfide bonds form a structural motif known as cystine knot, which confers considerable structural stability on proteins and is an attractive scaffold for drug design. Detailed information of each peptide, including the experimental structure, the location of disulfide bonds, secondary structure, classification, post-translational modification and so on, has been provided. A wide range of user-friendly tools, such as browsing, sequence and structure-based searching and so on, has been incorporated into StraPep. We hope that this database will be helpful for the research community. Database URL: http://isyslab.info/StraPep PMID:29688386

  7. Bibliometric analysis of Spanish scientific publications in the subject Construction & Building Technology in Web of Science database (1997-2008)

    OpenAIRE

    Rojas-Sola, J. I.; de San-Antonio-Gómez, C.

    2010-01-01

    In this paper the publications from Spanish institutions listed in the journals of the Construction & Building Technology subject of Web of Science database for the period 1997- 2008 are analyzed. The number of journals in whose is published is 35 and the number of articles was 760 (Article or Review). Also a bibliometric assessment has done and we propose two new parameters: Weighted Impact Factor and Relative Impact Factor; also includes the number of citations and the number documents ...

  8. Legacy2Drupal - Conversion of an existing oceanographic relational database to a semantically enabled Drupal content management system

    Science.gov (United States)

    Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.

    2009-12-01

    Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF funded Biological Chemical and Biological Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing, relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal6-powered front-end; 2) a standard SQL port (used to provide a Mapserver interface to the metadata and data; and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces will help prepare the BCO-DMO database for interoperability with other ecosystem databases.

  9. Trends in Type of Original Psoriasis Publications by Decade, 1960 to 2010.

    Science.gov (United States)

    Sako, Eric; Famenini, Shannon; Wu, Jashin J

    2016-01-01

    Research investigating psoriasis has spanned decades, and as our understanding of the disease has evolved, the focus of publications has changed. We sought to characterize the trends in original psoriasis-related research from 1960 to 2010 chronologically by decade. A literature review was performed using the keyword psoriasis in the MEDLINE database. All original psoriasis-related articles published at the beginning of each decade were searched and categorized by study type and topic. Number of articles per topic. A total of 869 original psoriasis-related articles were found. The number of publications increased 18 fold over 5 decades. The immunology and pathogenesis of psoriasis was the most frequently researched topic (36%), and retrospective studies were the most common study type (37%). Recent highly published topics included biologic therapy, genetics, and psoriasis-associated cardiovascular disease. Original psoriasis-related publications have grown substantially since 1960. Basic science research into the immunology and pathogenesis has been and continues to be the mainstay of psoriasis research. Recent research trends suggest the focus has expanded to topics such as psoriasis-associated cardiovascular disease, genetics, and biologic therapy.

  10. Use of Genomic Databases for Inquiry-Based Learning about Influenza

    Science.gov (United States)

    Ledley, Fred; Ndung'u, Eric

    2011-01-01

    The genome projects of the past decades have created extensive databases of biological information with applications in both research and education. We describe an inquiry-based exercise that uses one such database, the National Center for Biotechnology Information Influenza Virus Resource, to advance learning about influenza. This database…

  11. Aerospace Medicine and Biology: A Continuing Bibliography With Indexes. Supplement 499

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP#1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth#s atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. The NASA CASI price code table, addresses of organizations, and document availability information are included before the abstract section. Two indexes-subject and author are included after the abstract section.

  12. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 485

    Science.gov (United States)

    1999-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP-1999-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. The NASA CASI price code table, addresses of organizations, and document availability information are included before the abstract section. Two indexes-subject and author are included after the abstract section.

  13. Aerospace Medicine and Biology: A Continuing Bibliography With Indexes. Supplement 506

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP#2000-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth's atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. The NASA CASI price code table, addresses of organizations, and document availability information are included before the abstract section. Two indexes- subject and author are included after the abstract section.

  14. Aerospace Medicine and Biology: A Continuing Bibliography with Indexes. Supplement 496

    Science.gov (United States)

    2000-01-01

    This supplemental issue of Aerospace Medicine and Biology, A Continuing Bibliography with Indexes (NASA/SP#2000-7011) lists reports, articles, and other documents recently announced in the NASA STI Database. In its subject coverage, Aerospace Medicine and Biology concentrates on the biological, physiological, psychological, and environmental effects to which humans are subjected during and following simulated or actual flight in the Earth#s atmosphere or in interplanetary space. References describing similar effects on biological organisms of lower order are also included. Such related topics as sanitary problems, pharmacology, toxicology, safety and survival, life support systems, exobiology, and personnel factors receive appropriate attention. Applied research receives the most emphasis, but references to fundamental studies and theoretical principles related to experimental development also qualify for inclusion. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. The NASA CASI price code table, addresses of organizations, and document availability information are included before the abstract section. Two indexes#subject and author are included after the abstract section.

  15. HERVd: database of human endogenous retroviruses

    Czech Academy of Sciences Publication Activity Database

    Pačes, Jan; Pavlíček, Adam; Pačes, Václav

    2002-01-01

    Roč. 30, č. 1 (2002), s. 205-206 ISSN 0305-1048 R&D Projects: GA MŠk LN00A079; GA ČR GA301/99/M023 Keywords : HERV * database * human genome Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 7.051, year: 2002

  16. The Vocational Guidance Research Database: A Scientometric Approach

    Science.gov (United States)

    Flores-Buils, Raquel; Gil-Beltran, Jose Manuel; Caballer-Miedes, Antonio; Martinez-Martinez, Miguel Angel

    2012-01-01

    The scientometric study of scientific output through publications in specialized journals cannot be undertaken exclusively with the databases available today. For this reason, the objective of this article is to introduce the "Base de Datos de Investigacion en Orientacion Vocacional" [Vocational Guidance Research Database], based on the…

  17. CD-ROM-aided Databases

    Science.gov (United States)

    Masuyama, Keiichi

    CD-ROM has rapidly evolved as a new information medium with large capacity, In the U.S. it is predicted that it will become two hundred billion yen market in three years, and thus CD-ROM is strategic target of database industry. Here in Japan the movement toward its commercialization has been active since this year. Shall CD-ROM bussiness ever conquer information market as an on-disk database or electronic publication? Referring to some cases of the applications in the U.S. the author views marketability and the future trend of this new optical disk medium.

  18. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2011-02-15

    Purpose: The development of computer-aided diagnostic (CAD) methods for lung nodule detection, classification, and quantitative assessment can be facilitated through a well-characterized repository of computed tomography (CT) scans. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) completed such a database, establishing a publicly available reference for the medical imaging research community. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process. Methods: Seven academic centers and eight medical imaging companies collaborated to identify, address, and resolve challenging organizational, technical, and clinical issues to provide a solid foundation for a robust database. The LIDC/IDRI Database contains 1018 cases, each of which includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories (''nodule{>=}3 mm,''''nodule<3 mm,'' and ''non-nodule{>=}3 mm''). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Results: The Database contains 7371 lesions marked ''nodule'' by at least one radiologist. 2669 of these lesions were marked &apos

  19. Systematization of the protein sequence diversity in enzymes related to secondary metabolic pathways in plants, in the context of big data biology inspired by the KNApSAcK motorcycle database.

    Science.gov (United States)

    Ikeda, Shun; Abe, Takashi; Nakamura, Yukiko; Kibinge, Nelson; Hirai Morita, Aki; Nakatani, Atsushi; Ono, Naoaki; Ikemura, Toshimichi; Nakamura, Kensuke; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

    2013-05-01

    Biology is increasingly becoming a data-intensive science with the recent progress of the omics fields, e.g. genomics, transcriptomics, proteomics and metabolomics. The species-metabolite relationship database, KNApSAcK Core, has been widely utilized and cited in metabolomics research, and chronological analysis of that research work has helped to reveal recent trends in metabolomics research. To meet the needs of these trends, the KNApSAcK database has been extended by incorporating a secondary metabolic pathway database called Motorcycle DB. We examined the enzyme sequence diversity related to secondary metabolism by means of batch-learning self-organizing maps (BL-SOMs). Initially, we constructed a map by using a big data matrix consisting of the frequencies of all possible dipeptides in the protein sequence segments of plants and bacteria. The enzyme sequence diversity of the secondary metabolic pathways was examined by identifying clusters of segments associated with certain enzyme groups in the resulting map. The extent of diversity of 15 secondary metabolic enzyme groups is discussed. Data-intensive approaches such as BL-SOM applied to big data matrices are needed for systematizing protein sequences. Handling big data has become an inevitable part of biology.

  20. WGDB: Wood Gene Database with search interface.

    Science.gov (United States)

    Goyal, Neha; Ginwal, H S

    2014-01-01

    Wood quality can be defined in terms of particular end use with the involvement of several traits. Over the last fifteen years researchers have assessed the wood quality traits in forest trees. The wood quality was categorized as: cell wall biochemical traits, fibre properties include the microfibril angle, density and stiffness in loblolly pine [1]. The user friendly and an open-access database has been developed named Wood Gene Database (WGDB) for describing the wood genes along the information of protein and published research articles. It contains 720 wood genes from species namely Pinus, Deodar, fast growing trees namely Poplar, Eucalyptus. WGDB designed to encompass the majority of publicly accessible genes codes for cellulose, hemicellulose and lignin in tree species which are responsive to wood formation and quality. It is an interactive platform for collecting, managing and searching the specific wood genes; it also enables the data mining relate to the genomic information specifically in Arabidopsis thaliana, Populus trichocarpa, Eucalyptus grandis, Pinus taeda, Pinus radiata, Cedrus deodara, Cedrus atlantica. For user convenience, this database is cross linked with public databases namely NCBI, EMBL & Dendrome with the search engine Google for making it more informative and provides bioinformatics tools named BLAST,COBALT. The database is freely available on www.wgdb.in.

  1. Toxico-Cheminformatics: New and Expanding Public ...

    Science.gov (United States)

    High-throughput screening (HTS) technologies, along with efforts to improve public access to chemical toxicity information resources and to systematize older toxicity studies, have the potential to significantly improve information gathering efforts for chemical assessments and predictive capabilities in toxicology. Important developments include: 1) large and growing public resources that link chemical structures to biological activity and toxicity data in searchable format, and that offer more nuanced and varied representations of activity; 2) standardized relational data models that capture relevant details of chemical treatment and effects of published in vivo experiments; and 3) the generation of large amounts of new data from public efforts that are employing HTS technologies to probe a wide range of bioactivity and cellular processes across large swaths of chemical space. By annotating toxicity data with associated chemical structure information, these efforts link data across diverse study domains (e.g., ‘omics’, HTS, traditional toxicity studies), toxicity domains (carcinogenicity, developmental toxicity, neurotoxicity, immunotoxicity, etc) and database sources (EPA, FDA, NCI, DSSTox, PubChem, GEO, ArrayExpress, etc.). Public initiatives are developing systematized data models of toxicity study areas and introducing standardized templates, controlled vocabularies, hierarchical organization, and powerful relational searching capability across capt

  2. Large-scale Health Information Database and Privacy Protection*1

    OpenAIRE

    YAMAMOTO, Ryuichi

    2016-01-01

    Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law...

  3. Development of a Consumer Product Ingredient Database for ...

    Science.gov (United States)

    Consumer products are a primary source of chemical exposures, yet little structured information is available on the chemical ingredients of these products and the concentrations at which ingredients are present. To address this data gap, we created a database of chemicals in consumer products using product Material Safety Data Sheets (MSDSs) publicly provided by a large retailer. The resulting database represents 1797 unique chemicals mapped to 8921 consumer products and a hierarchy of 353 consumer product “use categories” within a total of 15 top-level categories. We examine the utility of this database and discuss ways in which it will support (i) exposure screening and prioritization, (ii) generic or framework formulations for several indoor/consumer product exposure modeling initiatives, (iii) candidate chemical selection for monitoring near field exposure from proximal sources, and (iv) as activity tracers or ubiquitous exposure sources using “chemical space” map analyses. Chemicals present at high concentrations and across multiple consumer products and use categories that hold high exposure potential are identified. Our database is publicly available to serve regulators, retailers, manufacturers, and the public for predictive screening of chemicals in new and existing consumer products on the basis of exposure and risk. The National Exposure Research Laboratory’s (NERL’s) Human Exposure and Atmospheric Sciences Division (HEASD) conducts resear

  4. The AMMA database

    Science.gov (United States)

    Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim

    2010-05-01

    The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaigns datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed as the satellite products are. Before accessing the data, any user has to sign the AMMA data and publication policy. This chart only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and the usage for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication is also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using an unique web portal. This website is composed of different modules : - Registration: forms to register, read and sign the data use chart when an user visits for the first time - Data access interface: friendly tool allowing to build a data extraction request by selecting various criteria like location, time, parameters... The request can

  5. Go Figure: Computer Database Adds the Personal Touch.

    Science.gov (United States)

    Gaffney, Jean; Crawford, Pat

    1992-01-01

    A database for recordkeeping for a summer reading club was developed for a public library system using an IBM PC and Microsoft Works. Use of the database resulted in more efficient program management, giving librarians more time to spend with patrons and enabling timely awarding of incentives. (LAE)

  6. A database of new zeolite-like materials.

    Science.gov (United States)

    Pophale, Ramdas; Cheeseman, Phillip A; Deem, Michael W

    2011-07-21

    We here describe a database of computationally predicted zeolite-like materials. These crystals were discovered by a Monte Carlo search for zeolite-like materials. Positions of Si atoms as well as unit cell, space group, density, and number of crystallographically unique atoms were explored in the construction of this database. The database contains over 2.6 M unique structures. Roughly 15% of these are within +30 kJ mol(-1) Si of α-quartz, the band in which most of the known zeolites lie. These structures have topological, geometrical, and diffraction characteristics that are similar to those of known zeolites. The database is the result of refinement by two interatomic potentials that both satisfy the Pauli exclusion principle. The database has been deposited in the publicly available PCOD database and in www.hypotheticalzeolites.net/database/deem/. This journal is © the Owner Societies 2011

  7. Legume and Lotus japonicus Databases

    DEFF Research Database (Denmark)

    Hirakawa, Hideki; Mun, Terry; Sato, Shusei

    2014-01-01

    Since the genome sequence of Lotus japonicus, a model plant of family Fabaceae, was determined in 2008 (Sato et al. 2008), the genomes of other members of the Fabaceae family, soybean (Glycine max) (Schmutz et al. 2010) and Medicago truncatula (Young et al. 2011), have been sequenced. In this sec....... In this section, we introduce representative, publicly accessible online resources related to plant materials, integrated databases containing legume genome information, and databases for genome sequence and derived marker information of legume species including L. japonicus...

  8. The Danish Depression Database

    DEFF Research Database (Denmark)

    Videbech, Poul Bror Hemming; Deleuran, Anette

    2016-01-01

    AIM OF DATABASE: The purpose of the Danish Depression Database (DDD) is to monitor and facilitate the improvement of the quality of the treatment of depression in Denmark. Furthermore, the DDD has been designed to facilitate research. STUDY POPULATION: Inpatients as well as outpatients...... with depression, aged above 18 years, and treated in the public psychiatric hospital system were enrolled. MAIN VARIABLES: Variables include whether the patient has been thoroughly somatically examined and has been interviewed about the psychopathology by a specialist in psychiatry. The Hamilton score as well...... as an evaluation of the risk of suicide are measured before and after treatment. Whether psychiatric aftercare has been scheduled for inpatients and the rate of rehospitalization are also registered. DESCRIPTIVE DATA: The database was launched in 2011. Every year since then ~5,500 inpatients and 7,500 outpatients...

  9. Preliminary surficial geologic map database of the Amboy 30 x 60 minute quadrangle, California

    Science.gov (United States)

    Bedford, David R.; Miller, David M.; Phelps, Geoffrey A.

    2006-01-01

    The surficial geologic map database of the Amboy 30x60 minute quadrangle presents characteristics of surficial materials for an area approximately 5,000 km2 in the eastern Mojave Desert of California. This map consists of new surficial mapping conducted between 2000 and 2005, as well as compilations of previous surficial mapping. Surficial geology units are mapped and described based on depositional process and age categories that reflect the mode of deposition, pedogenic effects occurring post-deposition, and, where appropriate, the lithologic nature of the material. The physical properties recorded in the database focus on those that drive hydrologic, biologic, and physical processes such as particle size distribution (PSD) and bulk density. This version of the database is distributed with point data representing locations of samples for both laboratory determined physical properties and semi-quantitative field-based information. Future publications will include the field and laboratory data as well as maps of distributed physical properties across the landscape tied to physical process models where appropriate. The database is distributed in three parts: documentation, spatial map-based data, and printable map graphics of the database. Documentation includes this file, which provides a discussion of the surficial geology and describes the format and content of the map data, a database 'readme' file, which describes the database contents, and FGDC metadata for the spatial map information. Spatial data are distributed as Arc/Info coverage in ESRI interchange (e00) format, or as tabular data in the form of DBF3-file (.DBF) file formats. Map graphics files are distributed as Postscript and Adobe Portable Document Format (PDF) files, and are appropriate for representing a view of the spatial database at the mapped scale.

  10. The TMI-2 clean-up project collection and databases

    International Nuclear Information System (INIS)

    Osif, B.A.; Conkling, T.W.

    1996-01-01

    A publicly accessible collection containing several thousand of the videotapes, photographs, slides and technical reports generated during the clean-up of the TMI-2 reactor has been established by the Pennsylvania State University Libraries. The collection is intended to serve as a technical resource for the nuclear industry as well as the interested public. Two Internet-searchable databases describing the videotapes and technical reports have been created. The development and use of these materials and databases are described in this paper. (orig.)

  11. Outputs and Growth of Primary Care Databases in the United Kingdom: Bibliometric Analysis

    Directory of Open Access Journals (Sweden)

    Zain Chaudhry

    2017-10-01

    Full Text Available Background: Electronic health database (EHD data is increasingly used by researchers. The major United Kingdom EHDs are the ‘Clinical Practice Research Datalink’ (CPRD, ‘The Health Improvement Network’ (THIN and ‘QResearch’. Over time, outputs from these databases have increased, but have not been evaluated. Objective: This study compares research outputs from CPRD, THIN and QResearch assessing growth and publication outputs over a 10-year period (2004-2013. CPRD was also reviewed separately over 20 years as a case study. Methods:  Publications from CPRD and QResearch were extracted using the Science Citation Index (SCI of the Thomson Scientific Institute for Scientific Information (Web of Science. THIN data was obtained from University College London and validated in Web of Science. All databases were analysed for growth in publications, the speciality areas and the journals in which their data have been published. Results: These databases collectively produced 1,296 publications over a ten-year period, with CPRD representing 63.6% (n=825 papers, THIN 30.4% (n=394 and QResearch 5.9% (n=77. Pharmacoepidemiology and General Medicine were the most common specialities featured. Over the 9-year period (2004-2013, publications for THIN and QResearch have slowly increased over time, whereas CPRD publications have increased substantially in last 4 years with almost 75% of CPRD publications published in the past 9 years. Conclusion: These databases are enhancing scientific research and are growing yearly, however display variability in their growth. They could become more powerful research tools if the National Health Service and general practitioners can provide accurate and comprehensive data for inclusion in these databases.

  12. PrionScan: an online database of predicted prion domains in complete proteomes.

    Science.gov (United States)

    Espinosa Angarica, Vladimir; Angulo, Alfonso; Giner, Arturo; Losilla, Guillermo; Ventura, Salvador; Sancho, Javier

    2014-02-05

    Prions are a particular type of amyloids related to a large variety of important processes in cells, but also responsible for serious diseases in mammals and humans. The number of experimentally characterized prions is still low and corresponds to a handful of examples in microorganisms and mammals. Prion aggregation is mediated by specific protein domains with a remarkable compositional bias towards glutamine/asparagine and against charged residues and prolines. These compositional features have been used to predict new prion proteins in the genomes of different organisms. Despite these efforts, there are only a few available data sources containing prion predictions at a genomic scale. Here we present PrionScan, a new database of predicted prion-like domains in complete proteomes. We have previously developed a predictive methodology to identify and score prionogenic stretches in protein sequences. In the present work, we exploit this approach to scan all the protein sequences in public databases and compile a repository containing relevant information of proteins bearing prion-like domains. The database is updated regularly alongside UniprotKB and in its present version contains approximately 28000 predictions in proteins from different functional categories in more than 3200 organisms from all the taxonomic subdivisions. PrionScan can be used in two different ways: database query and analysis of protein sequences submitted by the users. In the first mode, simple queries allow to retrieve a detailed description of the properties of a defined protein. Queries can also be combined to generate more complex and specific searching patterns. In the second mode, users can submit and analyze their own sequences. It is expected that this database would provide relevant insights on prion functions and regulation from a genome-wide perspective, allowing researches performing cross-species prion biology studies. Our database might also be useful for guiding experimentalists

  13. Development of radionuclide parameter database on internal contamination in nuclear emergencies

    International Nuclear Information System (INIS)

    Zhao Li; Xu Cuihua; Li Wenhong; Su Xu

    2010-01-01

    Objective: To develop a radionuclide parameter database on internal contamination in nuclear emergencies. Methods: By researching the radionuclides composition discharged from different nuclear emergencies, the radionuclide parameters were achieved on physical decay, absorption and metabolism in the body from ICRP publications and some other publications. The database on internal contamination for nuclear incidents was developed by using MS Visual Studio 2005 C and MS Access programming language. Results: The radionuclide parameter database on internal contamination in nuclear emergency was established. Conclusions: The database may be very convenient for searching radionuclides and radionuclide parameter data discharged from different nuclear emergencies, which would be helpful to the monitoring and assessment and assessment of internal contamination in nuclear emergencies. (authors)

  14. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    Science.gov (United States)

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-04-06

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.

  15. EPlantLIBRA: A composition and biological activity database for bioactive compounds in plant food supplements

    DEFF Research Database (Denmark)

    Plumb, J.; Lyons, J.; Nørby, Karin Kristiane

    2015-01-01

    The newly developed ePlantLIBRA database is a comprehensive and searchable database, with up-to-date coherent and validated scientific information on plant food supplement (PFS) bioactive compounds, with putative health benefits as well as adverse effects, and contaminants and residues. It is the......The newly developed ePlantLIBRA database is a comprehensive and searchable database, with up-to-date coherent and validated scientific information on plant food supplement (PFS) bioactive compounds, with putative health benefits as well as adverse effects, and contaminants and residues...

  16. CERCLIS (Superfund) ASCII Text Format - CPAD Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Comprehensive Environmental Response, Compensation and Liability Information System (CERCLIS) (Superfund) Public Access Database (CPAD) contains a selected set...

  17. Publications of the planetary biology program for 1975: A special bibliography. [on NASA programs and research projects on extraterrestrial life

    Science.gov (United States)

    Souza, K. A. (Compiler); Young, R. S. (Compiler)

    1976-01-01

    The Planetary Biology Program of the National Aeronautics and Space Administration is the first and only integrated program to methodically investigate the planetary events which may have been responsible for, or related to, the origin, evolution, and distribution of life in the universe. Research supported by this program is divided into the seven areas listed below: (1) chemical evolution, (2) organic geochemistry, (3) life detection, (4) biological adaptation, (5) bioinstrumentation, (6) planetary environments, and (7) origin of life. The arrangement of references in this bibliography follows the division of research described above. Articles are listed alphabetically by author under the research area with which they are most closely related. Only those publications which resulted from research supported by the Planetary Biology Program and which bear a 1975 publication date have been included. Abstracts and theses are not included because of the preliminary and abbreviated nature of the former and the frequent difficulty of obtaining the latter.

  18. GenderMedDB: an interactive database of sex and gender-specific medical literature.

    Science.gov (United States)

    Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera

    2014-01-01

    Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registrants are enabled to upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.

  19. Proteomics: Protein Identification Using Online Databases

    Science.gov (United States)

    Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

    2012-01-01

    Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…

  20. An analysis of factors influencing the teaching of biological evolution in Louisiana public secondary schools

    Science.gov (United States)

    Aguillard, Donald Wayne

    Louisiana public school biology teachers were surveyed to investigate their attitudes toward biological evolution. A mixed method investigation was employed using a questionnaire and open-ended interviews. Results obtained from 64 percent of the sample receiving the questionnaire indicate that although teachers endorse the study of evolution as important, instructional time allocated to evolution is disproportionate with its status as a unifying concept of science. Two variables, number of college courses specifically devoted to evolution and number of semester credit hours in biology, produced a significant correlation with emphasis placed on evolution. The data suggest that teachers' knowledge base emerged as the most significant factor in determining degree of classroom emphasis on evolution. The data suggest a need for substantive changes in the training of biology teachers. Thirty-five percent of teachers reported pursuing fewer than 20 semester credit hours in biology and 68 percent reported fewer than three college courses in which evolution was specifically discussed. Fifty percent reported a willingness to undergo additional training about evolution. In spite of the fact that evolution has been identified as a major conceptual theme across all of the sciences, there is strong evidence that Louisiana biology teachers de-emphasize evolutionary theory. Even when biology teachers allocate instructional time to evolutionary theory, many avoid discussion of human evolution. The research data show that only ten percent of teachers reported allocating more than sixty minutes of instructional time to human evolution. Louisiana biology teachers were found to hold extreme views on the subject of creationism as a component of the biology curriculum. Twenty-nine percent indicated that creationism should be taught in high school biology and 25--35 percent allocated instructional time to discussions of creationism. Contributing to the de-emphasis of evolutionary theory

  1. ARTI Refrigerant Database

    Energy Technology Data Exchange (ETDEWEB)

    Cain, J.M. (Calm (James M.), Great Falls, VA (United States))

    1993-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  2. ARTI refrigerant database

    Energy Technology Data Exchange (ETDEWEB)

    Calm, J.M.

    1997-02-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alterative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  3. Protected Areas Database for New Mexico

    Data.gov (United States)

    Earth Data Analysis Center, University of New Mexico — The Protected Areas Database of the United States (PAD-US) is a geodatabase, managed by USGS GAP, that illustrates and describes public land ownership, management...

  4. Pacific Northwest Salmon Habitat Project Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — In the Pacific Northwest Salmon Habitat Project Database Across the Pacific Northwest, both public and private agents are working to improve riverine habitat for a...

  5. Construction of a Linux based chemical and biological information system.

    Science.gov (United States)

    Molnár, László; Vágó, István; Fehér, András

    2003-01-01

    A chemical and biological information system with a Web-based easy-to-use interface and corresponding databases has been developed. The constructed system incorporates all chemical, numerical and textual data related to the chemical compounds, including numerical biological screen results. Users can search the database by traditional textual/numerical and/or substructure or similarity queries through the web interface. To build our chemical database management system, we utilized existing IT components such as ORACLE or Tripos SYBYL for database management and Zope application server for the web interface. We chose Linux as the main platform, however, almost every component can be used under various operating systems.

  6. Human Ageing Genomic Resources: new and updated databases

    Science.gov (United States)

    Tacutu, Robi; Thornton, Daniel; Johnson, Emily; Budovsky, Arie; Barardo, Diogo; Craig, Thomas; Diana, Eugene; Lehmann, Gilad; Toren, Dmitri; Wang, Jingwei; Fraifeld, Vadim E

    2018-01-01

    Abstract In spite of a growing body of research and data, human ageing remains a poorly understood process. Over 10 years ago we developed the Human Ageing Genomic Resources (HAGR), a collection of databases and tools for studying the biology and genetics of ageing. Here, we present HAGR’s main functionalities, highlighting new additions and improvements. HAGR consists of six core databases: (i) the GenAge database of ageing-related genes, in turn composed of a dataset of >300 human ageing-related genes and a dataset with >2000 genes associated with ageing or longevity in model organisms; (ii) the AnAge database of animal ageing and longevity, featuring >4000 species; (iii) the GenDR database with >200 genes associated with the life-extending effects of dietary restriction; (iv) the LongevityMap database of human genetic association studies of longevity with >500 entries; (v) the DrugAge database with >400 ageing or longevity-associated drugs or compounds; (vi) the CellAge database with >200 genes associated with cell senescence. All our databases are manually curated by experts and regularly updated to ensure a high quality data. Cross-links across our databases and to external resources help researchers locate and integrate relevant information. HAGR is freely available online (http://genomics.senescence.info/). PMID:29121237

  7. The STRING database in 2017

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Morris, John H; Cook, Helen

    2017-01-01

    A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organi......A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number...... of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known...... pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer...

  8. MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource based on the first complete plant genome

    Science.gov (United States)

    Schoof, Heiko; Zaccaria, Paolo; Gundlach, Heidrun; Lemcke, Kai; Rudd, Stephen; Kolesov, Grigory; Arnold, Roland; Mewes, H. W.; Mayer, Klaus F. X.

    2002-01-01

    Arabidopsis thaliana is the first plant for which the complete genome has been sequenced and published. Annotation of complex eukaryotic genomes requires more than the assignment of genetic elements to the sequence. Besides completing the list of genes, we need to discover their cellular roles, their regulation and their interactions in order to understand the workings of the whole plant. The MIPS Arabidopsis thaliana Database (MAtDB; http://mips.gsf.de/proj/thal/db) started out as a repository for genome sequence data in the European Scientists Sequencing Arabidopsis (ESSA) project and the Arabidopsis Genome Initiative. Our aim is to transform MAtDB into an integrated biological knowledge resource by integrating diverse data, tools, query and visualization capabilities and by creating a comprehensive resource for Arabidopsis as a reference model for other species, including crop plants. PMID:11752263

  9. Power source roadmaps using bibliometrics and database tomography

    International Nuclear Information System (INIS)

    Kostoff, R.N.; Tshiteya, R.; Pfeil, K.M.; Humenik, J.A.; Karypis, G.

    2005-01-01

    Database Tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multi-word phrase frequencies and phrase proximities (physical closeness of the multi-word technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT was used to derive technical intelligence from a Power Sources database derived from the Science Citation Index. Phrase frequency analysis by the technical domain experts provided the pervasive technical themes of the Power Sources database, and the phrase proximity analysis provided the relationships among the pervasive technical themes. Bibliometric analysis of the Power Sources literature supplemented the DT results with author/journal/institution/country publication and citation data

  10. The DExH/D protein family database.

    Science.gov (United States)

    Jankowsky, E; Jankowsky, A

    2000-01-01

    DExH/D proteins are essential for all aspects of cellular RNA metabolism and processing, in the replication of many viruses and in DNA replication. DExH/D proteins are subject to current biological, biochemical and biophysical research which provides a continuous wealth of data. The DExH/D protein family database compiles this information and makes it available over the WWW (http://www.columbia.edu/ ej67/dbhome.htm ). The database can be fully searched by text based queries, facilitating fast access to specific information about this important class of enzymes.

  11. Access to DNA and protein databases on the Internet.

    Science.gov (United States)

    Harper, R

    1994-02-01

    During the past year, the number of biological databases that can be queried via Internet has dramatically increased. This increase has resulted from the introduction of networking tools, such as Gopher and WAIS, that make it easy for research workers to index databases and make them available for on-line browsing. Biocomputing in the nineties will see the advent of more client/server options for the solution of problems in bioinformatics.

  12. The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility.

    Science.gov (United States)

    Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid

    2014-03-01

    National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The Database is designed and implemented on a three-tier model using Mysql relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful to improve researches in human molecular genetics, disease diagnoses and design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma.

  13. DB-PABP: a database of polyanion-binding proteins.

    Science.gov (United States)

    Fang, Jianwen; Dong, Yinghua; Salamat-Miller, Nazila; Middaugh, C Russell

    2008-01-01

    The interactions between polyanions (PAs) and polyanion-binding proteins (PABPs) have been found to play significant roles in many essential biological processes including intracellular organization, transport and protein folding. Furthermore, many neurodegenerative disease-related proteins are PABPs. Thus, a better understanding of PA/PABP interactions may not only enhance our understandings of biological systems but also provide new clues to these deadly diseases. The literature in this field is widely scattered, suggesting the need for a comprehensive and searchable database of PABPs. The DB-PABP is a comprehensive, manually curated and searchable database of experimentally characterized PABPs. It is freely available and can be accessed online at http://pabp.bcf.ku.edu/DB_PABP/. The DB-PABP was implemented as a MySQL relational database. An interactive web interface was created using Java Server Pages (JSP). The search page of the database is organized into a main search form and a section for utilities. The main search form enables custom searches via four menus: protein names, polyanion names, the source species of the proteins and the methods used to discover the interactions. Available utilities include a commonality matrix, a function of listing PABPs by the number of interacting polyanions and a string search for author surnames. The DB-PABP is maintained at the University of Kansas. We encourage users to provide feedback and submit new data and references.

  14. Public Use Airports, Geographic WGS84, BTS (2006) [public_use_airports_BTS_2006

    Data.gov (United States)

    Louisiana Geographic Information Center — The Public Use Airports database is a geographic point database of aircraft landing facilities in the United States and U.S. Territories. Attribute data is provided...

  15. CORE-Hom: a powerful and exhaustive database of clinical trials in homeopathy.

    Science.gov (United States)

    Clausen, Jürgen; Moss, Sian; Tournier, Alexander; Lüdtke, Rainer; Albrecht, Henning

    2014-10-01

    The CORE-Hom database was created to answer the need for a reliable and publicly available source of information in the field of clinical research in homeopathy. As of May 2014 it held 1048 entries of clinical trials, observational studies and surveys in the field of homeopathy, including second publications and re-analyses. 352 of the trials referenced in the database were published in peer reviewed journals, 198 of which were randomised controlled trials. The most often used remedies were Arnica montana (n = 103) and Traumeel(®) (n = 40). The most studied medical conditions were respiratory tract infections (n = 126) and traumatic injuries (n = 110). The aim of this article is to introduce the database to the public, describing and explaining the interface, features and content of the CORE-Hom database. Copyright © 2014 The Faculty of Homeopathy. Published by Elsevier Ltd. All rights reserved.

  16. ARTI Refrigerant Database

    Energy Technology Data Exchange (ETDEWEB)

    Calm, J.M.

    1992-11-09

    The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air- conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R- 717 (ammonia), ethers, and others as well as azeotropic and zeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.

  17. Databases for marine biologists and biotechnologists: The state-of-the art and prospects

    Digital Repository Service at National Institute of Oceanography (India)

    Chavan, V.S.

    Only 1% of the presently available 5000 database titles are relevant to marine biology and biotechnology. Nearly 60% of these are bibliographic in nature. There are almost no textural and numeric databases, which are the prime need of researchers...

  18. HERVd: the Human Endogenous Retrovirus Database: update

    Czech Academy of Sciences Publication Activity Database

    Pačes, Jan; Pavlíček, A.; Zíka, Radek; Jurka, J.; Pačes, Václav

    2004-01-01

    Roč. 32, č. 1 (2004), s. 50-50 ISSN 0305-1048 R&D Projects: GA MŠk LN00A079 Institutional research plan: CEZ:AV0Z5052915 Keywords : human * endogenous retrovirus * database Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 7.260, year: 2004

  19. Enhanced Biological Sampling Data

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This is a database of a variety of biological, reproductive, and energetic data collected from fish on the continental shelf in the northwest Atlantic Ocean. Species...

  20. The HITRAN 2008 molecular spectroscopic database

    International Nuclear Information System (INIS)

    Rothman, L.S.; Gordon, I.E.; Barbe, A.; Benner, D.Chris; Bernath, P.F.; Birk, M.; Boudon, V.; Brown, L.R.; Campargue, A.; Champion, J.-P.; Chance, K.; Coudert, L.H.; Dana, V.; Devi, V.M.; Fally, S.; Flaud, J.-M.

    2009-01-01

    This paper describes the status of the 2008 edition of the HITRAN molecular spectroscopic database. The new edition is the first official public release since the 2004 edition, although a number of crucial updates had been made available online since 2004. The HITRAN compilation consists of several components that serve as input for radiative-transfer calculation codes: individual line parameters for the microwave through visible spectra of molecules in the gas phase; absorption cross-sections for molecules having dense spectral features, i.e. spectra in which the individual lines are not resolved; individual line parameters and absorption cross-sections for bands in the ultraviolet; refractive indices of aerosols, tables and files of general properties associated with the database; and database management software. The line-by-line portion of the database contains spectroscopic parameters for 42 molecules including many of their isotopologues.

  1. TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

    Directory of Open Access Journals (Sweden)

    Chen Hsin-Hsi

    2008-10-01

    Full Text Available Abstract Background Traditional Chinese Medicine (TCM, a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal medicines were found to exhibit a variety of effects through regulating a wide range of gene expressions or protein activities. As available TCM data continue to accumulate rapidly, an urgent need for exploring these resources systematically is imperative, so as to effectively utilize the large volume of literature. Methods TCM, gene, disease, biological pathway and protein-protein interaction information were collected from public databases. For association discovery, the TCM names, gene names, disease names, TCM ingredients and effects were used to annotate the literature corpus obtained from PubMed. The concept to mine entity associations was based on hypothesis testing and collocation analysis. The annotated corpus was processed with natural language processing tools and rule-based approaches were applied to the sentences for extracting the relations between TCM effecters and effects. Results We developed a database, TCMGeneDIT, to provide association information about TCMs, genes, diseases, TCM effects and TCM ingredients mined from vast amount of biomedical literature. Integrated protein-protein interaction and biological pathways information are also available for exploring the regulations of genes associated with TCM curative effects. In addition, the transitive relationships among genes, TCMs and diseases could be inferred through the shared intermediates. Furthermore, TCMGeneDIT is useful in understanding the possible therapeutic mechanisms of TCMs via gene regulations and deducing synergistic or antagonistic contributions of the prescription components to the overall therapeutic effects. The database is now available at http://tcm.lifescience.ntu.edu.tw/. Conclusion TCMGeneDIT is a unique database

  2. MultitaskProtDB: a database of multitasking proteins.

    Science.gov (United States)

    Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, JosepAntoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique

    2014-01-01

    We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking or moonlighting is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are experimentally revealed by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there is no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a friendly user web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins such as frequencies of the different pairs of functions, phylogenetic conservation and so forth.

  3. The Molecular Signatures Database (MSigDB) hallmark gene set collection.

    Science.gov (United States)

    Liberzon, Arthur; Birger, Chet; Thorvaldsdóttir, Helga; Ghandi, Mahmoud; Mesirov, Jill P; Tamayo, Pablo

    2015-12-23

    The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.

  4. Biologic interventions for fatigue in rheumatoid arthritis

    DEFF Research Database (Denmark)

    Almeida, Celia; Choy, Ernest H S; Hewlett, Sarah

    2016-01-01

    BACKGROUND: Fatigue is a common and potentially distressing symptom for patients with rheumatoid arthritis (RA), with no accepted evidence-based management guidelines. Evidence suggests that biologic interventions improve symptoms and signs in RA as well as reducing joint damage. OBJECTIVES......: To evaluate the effect of biologic interventions on fatigue in rheumatoid arthritis. SEARCH METHODS: We searched the following electronic databases up to 1 April 2014: Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE, EMBASE, Cochrane Database of Systematic Reviews, Current Controlled Trials...... and contacted key authors. SELECTION CRITERIA: We included randomised controlled trials if they evaluated a biologic intervention in people with rheumatoid arthritis and had self reported fatigue as an outcome measure. DATA COLLECTION AND ANALYSIS: Two reviewers selected relevant trials, assessed methodological...

  5. DESTAF: A database of text-mined associations for reproductive toxins potentially affecting human fertility

    KAUST Repository

    Dawe, Adam Sean; Radovanovic, Aleksandar; Kaur, Mandeep; Sagar, Sunil; Seshadri, Sundararajan Vijayaraghava; Schaefer, Ulf; Kamau, Allan; Christoffels, Alan G.; Bajic, Vladimir B.

    2012-01-01

    The Dragon Exploration System for Toxicants and Fertility (DESTAF) is a publicly available resource which enables researchers to efficiently explore both known and potentially novel information and associations in the field of reproductive toxicology. To create DESTAF we used data from the literature (including over 10. 500 PubMed abstracts), several publicly available biomedical repositories, and specialized, curated dictionaries. DESTAF has an interface designed to facilitate rapid assessment of the key associations between relevant concepts, allowing for a more in-depth exploration of information based on different gene/protein-, enzyme/metabolite-, toxin/chemical-, disease- or anatomically centric perspectives. As a special feature, DESTAF allows for the creation and initial testing of potentially new association hypotheses that suggest links between biological entities identified through the database.DESTAF, along with a PDF manual, can be found at http://cbrc.kaust.edu.sa/destaf. It is free to academic and non-commercial users and will be updated quarterly. © 2011 Elsevier Inc.

  6. BIOPEP database and other programs for processing bioactive peptide sequences.

    Science.gov (United States)

    Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata

    2008-01-01

    This review presents the potential for application of computational tools in peptide science based on a sample BIOPEP database and program as well as other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.

  7. Publication rates of public health theses in international and national peer-review journals in Turkey.

    Science.gov (United States)

    Sipahi, H; Durusoy, R; Ergin, I; Hassoy, H; Davas, A; Karababa, Ao

    2012-01-01

    Thesis is an important part of specialisation and doctorate education and requires intense work. The aim of this study was to investigate the publication rates of Turkish Public Health Doctorate Theses (PHDT) and Public Health Specialization (PHST) theses in international and Turkish national peer-review journals and to analyze the distribution of research areas. List of all theses upto 30 September 2009 were retrieved from theses database of the Council of Higher Education of the Republic of Turkey. The publication rates of these theses were found by searching PubMed, Science Citation Index-Expanded, Turkish Academic Network and Information Center (ULAKBIM) Turkish Medical Database, and Turkish Medline databases for the names of thesis author and mentor. The theses which were published in journals indexed either in PubMed or SCI-E were considered as international publications. Our search yielded a total of 538 theses (243 PHDT, 295 PHST). It was found that the overall publication rate in Turkish national journals was 18%. The overall publication rate in international journals was 11.9%. Overall the most common research area was occupational health. Publication rates of Turkish PHDT and PHST are low. A better understanding of factors affecting this publication rate is important for public health issues where national data is vital for better intervention programs and develop better public health policies.

  8. Gene Name Thesaurus - Gene Name Thesaurus | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available 08/lsdba.nbdc00966-001 Description of data contents Curators who have expertize in biological research edit ...onym information fields in various gene/genome databases. 2. The curators who have expertise in biological research

  9. The COMPADRE Plant Matrix Database

    DEFF Research Database (Denmark)

    2014-01-01

    COMPADRE contains demographic information on hundreds of plant species. The data in COMPADRE are in the form of matrix population models and our goal is to make these publicly available to facilitate their use for research and teaching purposes. COMPADRE is an open-access database. We only request...

  10. Banking biological collections: data warehousing, data mining, and data dilemmas in genomics and global health policy.

    Science.gov (United States)

    Blatt, R J R

    2000-01-01

    While DNA databases may offer the opportunity to (1) assess population-based prevalence of specific genes and variants, (2) simplify the search for molecular markers, (3) improve targeted drug discovery and development for disease management, (4) refine strategies for disease prevention, and (5) provide the data necessary for evidence-based decision-making, serious scientific and social questions remain. Whether samples are identified, coded, or anonymous, biological banking raises profound ethical and legal issues pertaining to access, informed consent, privacy and confidentiality of genomic information, civil liberties, patenting, and proprietary rights. This paper provides an overview of key policy issues and questions pertaining to biological banking, with a focus on developments in specimen collection, transnational distribution, and public health and academic-industry research alliances. It highlights the challenges posed by the commercialization of genomics, and proposes the need for harmonization of biological banking policies.

  11. Conceptual design of nuclear power plants database system

    International Nuclear Information System (INIS)

    Ishikawa, Masaaki; Izumi, Fumio; Sudoh, Takashi.

    1984-03-01

    This report is the result of the joint study on the developments of the nuclear power plants database system. The present conceptual design of the database system, which includes Japanese character processing and image processing, has been made on the data of safety design parameters mainly found in the application documents for reactor construction permit made available to the public. (author)

  12. Comprehensive T-Matrix Reference Database: A 2012 - 2013 Update

    Science.gov (United States)

    Mishchenko, Michael I.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

    2013-01-01

    The T-matrix method is one of the most versatile, efficient, and accurate theoretical techniques widely used for numerically exact computer calculations of electromagnetic scattering by single and composite particles, discrete random media, and particles imbedded in complex environments. This paper presents the fifth update to the comprehensive database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2012. It also lists several earlier publications not incorporated in the original database, including Peter Waterman's reports from the 1960s illustrating the history of the T-matrix approach and demonstrating that John Fikioris and Peter Waterman were the true pioneers of the multi-sphere method otherwise known as the generalized Lorenz - Mie theory.

  13. Accidents with biological material among undergraduate nursing students in a public Brazilian university.

    Science.gov (United States)

    Reis, Renata Karina; Gir, Elucir; Canini, Silvia Rita M S

    2004-02-01

    During their academic activities, undergraduate nursing students are exposed to contamination by bloodborne pathogens, as well as by others found in body fluids, among which are the Human Immunodeficiency (HIV), Hepatitis B and C viruses. We developed a profile of victimized students, characterizing accidents with biological material occurring among undergraduate nursing students at a public university in São Paulo State, Brazil. We identified the main causes and evaluated the conduct adopted by students and their reactions and thoughts concerning the accidents. Seventy-two accidents were identified, of which 17% involved potentially contaminated biological material. Needles were the predominant cause of accidents. The most frequently involved topographic areas were the fingers. Only five students reported the accidents and sought medical care. Among these, two students were advised to begin prophylactic treatment against HIV infection by means of antiretroviral drugs. It was found that the risk of accidents is underestimated and that strategies such as formal teaching and continual training are necessary in order to make students aware of biosafety measures.

  14. Accidents with biological material among undergraduate nursing students in a public Brazilian university

    Directory of Open Access Journals (Sweden)

    Renata Karina Reis

    Full Text Available During their academic activities, undergraduate nursing students are exposed to contamination by bloodborne pathogens, as well as by others found in body fluids, among which are the Human Immunodeficiency (HIV, Hepatitis B and C viruses. We developed a profile of victimized students, characterizing accidents with biological material occurring among undergraduate nursing students at a public university in São Paulo State, Brazil. We identified the main causes and evaluated the conduct adopted by students and their reactions and thoughts concerning the accidents. Seventy-two accidents were identified, of which 17% involved potentially contaminated biological material. Needles were the predominant cause of accidents. The most frequently involved topographic areas were the fingers. Only five students reported the accidents and sought medical care. Among these, two students were advised to begin prophylactic treatment against HIV infection by means of antiretroviral drugs. It was found that the risk of accidents is underestimated and that strategies such as formal teaching and continual training are necessary in order to make students aware of biosafety measures.

  15. USING THE INTERNATIONAL SCIENTOMETRIC DATABASES OF OPEN ACCESS IN SCIENTIFIC RESEARCH

    Directory of Open Access Journals (Sweden)

    O. Galchevska

    2015-05-01

    Full Text Available In the article the problem of the use of international scientometric databases in research activities as web-oriented resources and services that are the means of publication and dissemination of research results is considered. Selection criteria of scientometric platforms of open access in conducting scientific researches (coverage Ukrainian scientific periodicals and publications, data accuracy, general characteristics of international scientometrics database, technical, functional characteristics and their indexes are emphasized. The review of the most popular scientometric databases of open access Google Scholar, Russian Scientific Citation Index (RSCI, Scholarometer, Index Copernicus (IC, Microsoft Academic Search is made. Advantages of usage of International Scientometrics database Google Scholar in conducting scientific researches and prospects of research that are in the separation of cloud information and analytical services of the system are determined.

  16. MareyMap Online: A User-Friendly Web Application and Database Service for Estimating Recombination Rates Using Physical and Genetic Maps.

    Science.gov (United States)

    Siberchicot, Aurélie; Bessy, Adrien; Guéguen, Laurent; Marais, Gabriel A B

    2017-10-01

    Given the importance of meiotic recombination in biology, there is a need to develop robust methods to estimate meiotic recombination rates. A popular approach, called the Marey map approach, relies on comparing genetic and physical maps of a chromosome to estimate local recombination rates. In the past, we have implemented this approach in an R package called MareyMap, which includes many functionalities useful to get reliable recombination rate estimates in a semi-automated way. MareyMap has been used repeatedly in studies looking at the effect of recombination on genome evolution. Here, we propose a simpler user-friendly web service version of MareyMap, called MareyMap Online, which allows a user to get recombination rates from her/his own data or from a publicly available database that we offer in a few clicks. When the analysis is done, the user is asked whether her/his curated data can be placed in the database and shared with other users, which we hope will make meta-analysis on recombination rates including many species easy in the future. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists.

    OpenAIRE

    Sven Heinicke; Michael S Livstone; Charles Lu; Rose Oughtred; Fan Kang; Samuel V Angiuoli; Owen White; David Botstein; Kara Dolinski

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic r...

  18. Neurovascular Network Explorer 1.0: a database of 2-photon single-vessel diameter measurements with MATLAB(®) graphical user interface.

    Science.gov (United States)

    Sridhar, Vishnu B; Tian, Peifang; Dale, Anders M; Devor, Anna; Saisan, Payam A

    2014-01-01

    We present a database client software-Neurovascular Network Explorer 1.0 (NNE 1.0)-that uses MATLAB(®) based Graphical User Interface (GUI) for interaction with a database of 2-photon single-vessel diameter measurements from our previous publication (Tian et al., 2010). These data are of particular interest for modeling the hemodynamic response. NNE 1.0 is downloaded by the user and then runs either as a MATLAB script or as a standalone program on a Windows platform. The GUI allows browsing the database according to parameters specified by the user, simple manipulation and visualization of the retrieved records (such as averaging and peak-normalization), and export of the results. Further, we provide NNE 1.0 source code. With this source code, the user can database their own experimental results, given the appropriate data structure and naming conventions, and thus share their data in a user-friendly format with other investigators. NNE 1.0 provides an example of seamless and low-cost solution for sharing of experimental data by a regular size neuroscience laboratory and may serve as a general template, facilitating dissemination of biological results and accelerating data-driven modeling approaches.

  19. Databases of the marine metagenomics

    KAUST Repository

    Mineta, Katsuhiko

    2015-10-28

    The metagenomic data obtained from marine environments is significantly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach of metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping a diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates accumulation of the metagenome data as well as increase of data complexity. Moreover, when metagenomic approach is used for monitoring a time change of marine environments at multiple locations of the seawater, accumulation of metagenomics data will become tremendous with an enormous speed. Because this kind of situation has started becoming of reality at many marine research institutions and stations all over the world, it looks obvious that the data management and analysis will be confronted by the so-called Big Data issues such as how the database can be constructed in an efficient way and how useful knowledge should be extracted from a vast amount of the data. In this review, we summarize the outline of all the major databases of marine metagenome that are currently publically available, noting that database exclusively on marine metagenome is none but the number of metagenome databases including marine metagenome data are six, unexpectedly still small. We also extend our explanation to the databases, as reference database we call, that will be useful for constructing a marine metagenome database as well as complementing important information with the database. Then, we would point out a number of challenges to be conquered in constructing the marine metagenome database.

  20. Mining Bug Databases for Unidentified Software Vulnerabilities

    Energy Technology Data Exchange (ETDEWEB)

    Dumidu Wijayasekara; Milos Manic; Jason Wright; Miles McQueen

    2012-06-01

    Identifying software vulnerabilities is becoming more important as critical and sensitive systems increasingly rely on complex software systems. It has been suggested in previous work that some bugs are only identified as vulnerabilities long after the bug has been made public. These vulnerabilities are known as hidden impact vulnerabilities. This paper discusses the feasibility and necessity to mine common publicly available bug databases for vulnerabilities that are yet to be identified. We present bug database analysis of two well known and frequently used software packages, namely Linux kernel and MySQL. It is shown that for both Linux and MySQL, a significant portion of vulnerabilities that were discovered for the time period from January 2006 to April 2011 were hidden impact vulnerabilities. It is also shown that the percentage of hidden impact vulnerabilities has increased in the last two years, for both software packages. We then propose an improved hidden impact vulnerability identification methodology based on text mining bug databases, and conclude by discussing a few potential problems faced by such a classifier.

  1. Functional role of bacteriophage transfer RNAs: codon usage analysis of genomic sequences stored in the GENBANK/EMBL/DDBJ databases

    Directory of Open Access Journals (Sweden)

    T Kunisawa

    2006-01-01

    Full Text Available Complete genomic sequence data are stored in the public GenBank/EMBL/DDBJ databases so that any investigator can make use of the data. This report describes a comparative analysis of codon usage that is impossible without such a public and open data system. A limited number of bacteriophages harbor their own transfer RNAs. Based on a comparison between T4 phage-encoded tRNA species and the relative cellular amounts of host Escherichia coli tRNAs, it is hypothesized that T4 tRNAs could serve to supplement host isoacceptor tRNA species that are present in minor amounts and thus enhance the translational efficiency of phage proteins. When compared to their respective host bacteria, the codon usage data of bacteriophages D3, φC31, HP1, D29 and 933W all show an increased frequency of synonymous codons or amino acids that correspond to phage tRNA species, suggesting their supplemental role in the efficient production of phage proteins. The data-analysis presents an example in which the availability of an open and fully accessible database system would allow one to obtain comprehensive insights into a fundamental problem in molecular biology.

  2. Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Database Description General information of database Database name Trypanosomes Database...stitute of Genetics Research Organization of Information and Systems Yata 1111, Mishima, Shizuoka 411-8540, JAPAN E mail: Database...y Name: Trypanosoma Taxonomy ID: 5690 Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description The... Article title: Author name(s): Journal: External Links: Original website information Database maintenance s...DB (Protein Data Bank) KEGG PATHWAY Database DrugPort Entry list Available Query search Available Web servic

  3. A high-energy nuclear database proposal

    International Nuclear Information System (INIS)

    Brown, D.A.; Vogt, R.; UC Davis, CA

    2006-01-01

    We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from the Bevalac, AGS and SPS to RHIC and LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews. (author)

  4. Israel Marine Bio-geographic Database (ISRAMAR-BIO)

    Science.gov (United States)

    Greengrass, Eyal; Krivenko, Yevgeniya; Ozer, Tal; Ben Yosef, Dafna; Tom, Moshe; Gertman, Isaac

    2015-04-01

    The knowledge of the space/time variations of species is the basis for any ecological investigations. While historical observations containing integral concentrations of biological parameters (chlorophyll, abundance, biomass…) are organized partly in ISRAMAR Cast Database, the taxon-specific data collected in Israel has not been sufficiently organized. This has been hindered by the lack of standards, variability of methods and complexity of biological data formalization. The ISRAMAR-BIO DB was developed to store various types of historical and future available information related to marine species observations and related metadata. Currently the DB allows to store biological data acquired by the following sampling devices such as: van veer grab, box corer, sampling bottles, nets (plankton, trawls and fish), quadrates, and cameras. The DB's logical unit is information regarding a specimen (taxa name, barcode, image), related attributes (abundance, size, age, contaminants…), habitat description, sampling device and method, time and space of sampling, responsible organization and scientist, source of information (cruise, project and publication). The following standardization of specimen and attributes naming were implemented: Taxonomy according to World Register of Marine Species (WoRMS: http://www.marinespecies.org). Habitat description according to Coastal and Marine Ecological Classification Standards (CMECS: http://www.cmecscatalog.org) Parameter name; Unit; Device name; Developmental stage; Institution name; Country name; Marine region according to SeaDataNet Vocabularies (http://www.seadatanet.org/Standards-Software/Common-Vocabularies). This system supports two types of data submission procedures, which support the above stated data structure. The first is a downloadable excel file with drop-down fields based on the ISRAMAR-BIO vocabularies. The file is filled and uploaded online by the data contributor. Alternatively, the same dataset can be assembled by

  5. Rationale and uses of a public HIV drug-resistance database.

    Science.gov (United States)

    Shafer, Robert W

    2006-09-15

    Knowledge regarding the drug resistance of human immunodeficiency virus (HIV) is critical for surveillance of drug resistance, development of antiretroviral drugs, and management of infections with drug-resistant viruses. Such knowledge is derived from studies that correlate genetic variation in the targets of therapy with the antiretroviral treatments received by persons from whom the variant was obtained (genotype-treatment), with drug-susceptibility data on genetic variants (genotype-phenotype), and with virological and clinical response to a new treatment regimen (genotype-outcome). An HIV drug-resistance database is required to represent, store, and analyze the diverse forms of data underlying our knowledge of drug resistance and to make these data available to the broad community of researchers studying drug resistance in HIV and clinicians using HIV drug-resistance tests. Such genotype-treatment, genotype-phenotype, and genotype-outcome correlations are contained in the Stanford HIV RT and Protease Sequence Database and have specific usefulness.

  6. Large-scale Health Information Database and Privacy Protection.

    Science.gov (United States)

    Yamamoto, Ryuichi

    2016-09-01

    Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients' medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy

  7. A database application for the Naval Command Physical Readiness Testing Program

    OpenAIRE

    Quinones, Frances M.

    1998-01-01

    Approved for public release; distribution is unlimited 1T21 envisions a Navy with tandardized, state-of-art computer systems. Based on this vision, Naval database management systems will also need to become standardized among Naval commands. Today most commercial off the shelf (COTS) database management systems provide a graphical user interface. Among the many Naval database systems currently in use, the Navy's Physical Readiness Program database has continued to exist at the command leve...

  8. NCEI Standard Product: World Ocean Database (WOD)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The World Ocean Database (WOD) is the world's largest publicly available uniform format quality controlled ocean profile dataset. Ocean profile data are sets of...

  9. Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us SKIP Stemcell Database Database Description General information of database Database name SKIP Stemcell Database...rsity Journal Search: Contact address http://www.skip.med.keio.ac.jp/en/contact/ Database classification Human Genes and Diseases Dat...abase classification Stemcell Article Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database...ks: Original website information Database maintenance site Center for Medical Genetics, School of medicine, ...lable Web services Not available URL of Web services - Need for user registration Not available About This Database Database

  10. Database Description - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Arabidopsis Phenome Database Database Description General information of database Database n... BioResource Center Hiroshi Masuya Database classification Plant databases - Arabidopsis thaliana Organism T...axonomy Name: Arabidopsis thaliana Taxonomy ID: 3702 Database description The Arabidopsis thaliana phenome i...heir effective application. We developed the new Arabidopsis Phenome Database integrating two novel database...seful materials for their experimental research. The other, the “Database of Curated Plant Phenome” focusing

  11. dbMDEGA: a database for meta-analysis of differentially expressed genes in autism spectrum disorder.

    Science.gov (United States)

    Zhang, Shuyun; Deng, Libin; Jia, Qiyue; Huang, Shaoting; Gu, Junwang; Zhou, Fankun; Gao, Meng; Sun, Xinyi; Feng, Chang; Fan, Guangqin

    2017-11-16

    Autism spectrum disorders (ASD) are hereditary, heterogeneous and biologically complex neurodevelopmental disorders. Individual studies on gene expression in ASD cannot provide clear consensus conclusions. Therefore, a systematic review to synthesize the current findings from brain tissues and a search tool to share the meta-analysis results are urgently needed. Here, we conducted a meta-analysis of brain gene expression profiles in the current reported human ASD expression datasets (with 84 frozen male cortex samples, 17 female cortex samples, 32 cerebellum samples and 4 formalin fixed samples) and knock-out mouse ASD model expression datasets (with 80 collective brain samples). Then, we applied R language software and developed an interactive shared and updated database (dbMDEGA) displaying the results of meta-analysis of data from ASD studies regarding differentially expressed genes (DEGs) in the brain. This database, dbMDEGA ( https://dbmdega.shinyapps.io/dbMDEGA/ ), is a publicly available web-portal for manual annotation and visualization of DEGs in the brain from data from ASD studies. This database uniquely presents meta-analysis values and homologous forest plots of DEGs in brain tissues. Gene entries are annotated with meta-values, statistical values and forest plots of DEGs in brain samples. This database aims to provide searchable meta-analysis results based on the current reported brain gene expression datasets of ASD to help detect candidate genes underlying this disorder. This new analytical tool may provide valuable assistance in the discovery of DEGs and the elucidation of the molecular pathogenicity of ASD. This database model may be replicated to study other disorders.

  12. dBBQs: dataBase of Bacterial Quality scores.

    Science.gov (United States)

    Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

    2017-12-28

    It is well-known that genome sequencing technologies are becoming significantly cheaper and faster. As a result of this, the exponential growth in sequencing data in public databases allows us to explore ever growing large collections of genome sequences. However, it is less known that the majority of available sequenced genome sequences in public databases are not complete, drafts of varying qualities. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated by four different measurements: assembly quality, number of rRNA and tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features which the result can be used for further analysis. In addition, the search results are shown in interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases find that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web-interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome, which can potentially be used for further analysis. dBBQs will be updated regularly and is freely use for non-commercial purpose.

  13. Respiratory cancer database: An open access database of respiratory cancer gene and miRNA

    Directory of Open Access Journals (Sweden)

    Jyotsna Choubey

    2017-01-01

    Results and Conclusions: RespCanDB is expected to contribute to the understanding of scientific community regarding respiratory cancer biology as well as developments of new way of diagnosing and treating respiratory cancer. Currently, the database consist the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.

  14. Nonlinear dimensionality reduction methods for synthetic biology biobricks' visualization.

    Science.gov (United States)

    Yang, Jiaoyun; Wang, Haipeng; Ding, Huitong; An, Ning; Alterovitz, Gil

    2017-01-19

    Visualizing data by dimensionality reduction is an important strategy in Bioinformatics, which could help to discover hidden data properties and detect data quality issues, e.g. data noise, inappropriately labeled data, etc. As crowdsourcing-based synthetic biology databases face similar data quality issues, we propose to visualize biobricks to tackle them. However, existing dimensionality reduction methods could not be directly applied on biobricks datasets. Hereby, we use normalized edit distance to enhance dimensionality reduction methods, including Isomap and Laplacian Eigenmaps. By extracting biobricks from synthetic biology database Registry of Standard Biological Parts, six combinations of various types of biobricks are tested. The visualization graphs illustrate discriminated biobricks and inappropriately labeled biobricks. Clustering algorithm K-means is adopted to quantify the reduction results. The average clustering accuracy for Isomap and Laplacian Eigenmaps are 0.857 and 0.844, respectively. Besides, Laplacian Eigenmaps is 5 times faster than Isomap, and its visualization graph is more concentrated to discriminate biobricks. By combining normalized edit distance with Isomap and Laplacian Eigenmaps, synthetic biology biobircks are successfully visualized in two dimensional space. Various types of biobricks could be discriminated and inappropriately labeled biobricks could be determined, which could help to assess crowdsourcing-based synthetic biology databases' quality, and make biobricks selection.

  15. Proposal for a High Energy Nuclear Database

    International Nuclear Information System (INIS)

    Brown, David A.; Vogt, Ramona

    2005-01-01

    We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac and AGS to RHIC to CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews

  16. BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data

    Science.gov (United States)

    2014-01-01

    Background Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real world biological data differs from the simple synthetic data much. It is difficult to determine whether the synthetic e-commerce data is efficient enough to represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases. Results We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response. Conclusions Our paper shows that with appropriate configuration Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data less than 8 billion or so on a single node, for the simultaneous access of 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples; For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best. They do not show overwhelming advantage over each other; For data over 4 billion Virtuoso works best. 4store performs well on small data sets with limited features when the number of triples is less than 100 million, and our test shows its

  17. BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data.

    Science.gov (United States)

    Wu, Hongyan; Fujiwara, Toyofumi; Yamamoto, Yasunori; Bolleman, Jerven; Yamaguchi, Atsuko

    2014-01-01

    Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real world biological data differs from the simple synthetic data much. It is difficult to determine whether the synthetic e-commerce data is efficient enough to represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases. We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response. Our paper shows that with appropriate configuration Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data less than 8 billion or so on a single node, for the simultaneous access of 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples; For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best. They do not show overwhelming advantage over each other; For data over 4 billion Virtuoso works best. 4store performs well on small data sets with limited features when the number of triples is less than 100 million, and our test shows its scalability is poor; Bigdata

  18. UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences.

    Science.gov (United States)

    Du, Pu-Feng; Zhao, Wei; Miao, Yang-Yang; Wei, Le-Yi; Wang, Likun

    2017-11-14

    With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.

  19. Therapeutic preferences and outcomes in newly diagnosed patients with Crohn's diseases in the biological era in Hungary: a nationwide study based on the National Health Insurance Fund database.

    Science.gov (United States)

    Kurti, Zsuzsanna; Ilias, Akos; Gonczi, Lorant; Vegh, Zsuzsanna; Fadgyas-Freyler, Petra; Korponay, Gyula; Golovics, Petra A; Lovasz, Barbara D; Lakatos, Peter L

    2018-01-30

    Accelerated treatment strategy, including tight disease control and early aggressive therapy with immunosuppressives (IS) and biological agents have become increasingly common in inflammatory bowel disease (IBD). The aim of the present study was to estimate the early treatment strategy and outcomes in newly diagnosed patients with Crohn's disease (CD) between 2004 and 2008 and 2009-2015 in the whole IBD population in Hungary based on the administrative database of the National Health Insurance Fund (OEP). We used the administrative database of the OEP, the only nationwide state-owned health insurance provider in Hungary. Patients were identified through previously reported algorithms using the ICD-10 codes for CD in the out-, inpatient (medical, surgical) non-primary care records and drug prescription databases between 2004 and 2015. Patients were stratified according to the year of diagnosis and maximum treatment steps during the first 3 years after diagnosis. A total of 6173 (male/female: 46.12%/53.87%) newly diagnosed CD patients with physician-diagnosed IBD were found in the period of 2004-2015. The use of 5-ASA and steroids remained common in the biological era, while immunosuppressives and biologicals were started earlier and became more frequent among patients diagnosed after 2009. The probability of biological therapy was 2.9%/6.4% and 8.4%/13.7% after 1 and 3 years in patients diagnosed in 2004-2008/2009-2015. The probability of hospitalization in the first 3 years after diagnosis was different before and after 2009, according to the maximal treatment step (overall 55.7%vs. 47.4% (p = 0.001), anti-TNF: 73%vs. 66.7% (p = 0.103), IS: 64.6% vs. 56.1% (p = 0.001), steroid: 44.2%vs. 36.8% (p < 0.007), 5-ASA: 32.6% vs. 26.7% p = 0.157)). In contrast, surgery rates were not significantly different in patients diagnosed before and after 2009 according to the maximum treatment step (overall 16.0%vs.15.3%(p = 0.672) anti-TNF 26.7%vs.27

  20. Risoe Publication Activities in 1997

    International Nuclear Information System (INIS)

    Alvi, Hanne; Bennov, Solvejg

    1998-08-01

    Risoe's publication and lecture activities in the last decades are presented through data of total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, and institutions with which Risoe has published the largest number of articles. The data are derived from Risoe's in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. The largest part of the report contains a list of references to the scientific and technical journal articles, books, reports, lectures, and to publications for a broader readership authored by researchers at Risoe National Laboratory during the year 1997. The references are organised according to the programme areas of Risoe. (au)

  1. Nutritional Systems Biology

    DEFF Research Database (Denmark)

    Jensen, Kasper

    and network biology has the potential to increase our understanding of how small molecules affect metabolic pathways and homeostasis, how this perturbation changes at the disease state, and to what extent individual genotypes contribute to this. A fruitful strategy in approaching and exploring the field...... biology research. The paper also shows as a proof-of-concept that a systems biology approach to diet is meaningful and demonstrates some basic principles on how to work with diet systematic. The second chapter of this thesis we developed the resource NutriChem v1.0. A foodchemical database linking...... sites of diet on the disease pathway. We propose a framework for interrogating the critical targets in colon cancer process and identifying plant-based dietary interventions as important modifiers using a systems chemical biology approach. The fifth chapter of the thesis is on discovering of novel anti...

  2. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

    Science.gov (United States)

    Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

    2015-01-01

    Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis.

  3. A novel database of bio-effects from non-ionizing radiation.

    Science.gov (United States)

    Leach, Victor; Weller, Steven; Redmayne, Mary

    2018-06-06

    A significant amount of electromagnetic field/electromagnetic radiation (EMF/EMR) research is available that examines biological and disease associated endpoints. The quantity, variety and changing parameters in the available research can be challenging when undertaking a literature review, meta-analysis, preparing a study design, building reference lists or comparing findings between relevant scientific papers. The Oceania Radiofrequency Scientific Advisory Association (ORSAA) has created a comprehensive, non-biased, multi-categorized, searchable database of papers on non-ionizing EMF/EMR to help address these challenges. It is regularly added to, freely accessible online and designed to allow data to be easily retrieved, sorted and analyzed. This paper demonstrates the content and search flexibility of the ORSAA database. Demonstration searches are presented by Effect/No Effect; frequency-band/s; in vitro; in vivo; biological effects; study type; and funding source. As of the 15th September 2017, the clear majority of 2653 papers captured in the database examine outcomes in the 300 MHz-3 GHz range. There are 3 times more biological "Effect" than "No Effect" papers; nearly a third of papers provide no funding statement; industry-funded studies more often than not find "No Effect", while institutional funding commonly reveal "Effects". Country of origin where the study is conducted/funded also appears to have a dramatic influence on the likely result outcome.

  4. MAGIC Database and Interfaces: An Integrated Package for Gene Discovery and Expression

    Directory of Open Access Journals (Sweden)

    Lee H. Pratt

    2006-03-01

    Full Text Available The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs, and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  5. CHID: a unique health information and education database.

    OpenAIRE

    Lunin, L F; Stein, R S

    1987-01-01

    The public's growing interest in health information and the health professions' increasing need to locate health education materials can be answered in part by the new Combined Health Information Database (CHID). This unique database focuses on materials and programs in professional and patient education, general health education, and community risk reduction. Accessible through BRS, CHID suggests sources for procuring brochures, pamphlets, articles, and films on community services, programs ...

  6. National Database for Clinical Trials Related to Mental Illness (NDCT)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The National Database for Clinical Trials Related to Mental Illness (NDCT) is an extensible informatics platform for relevant data at all levels of biological and...

  7. The Brainomics/Localizer database.

    Science.gov (United States)

    Papadopoulos Orfanos, Dimitri; Michel, Vincent; Schwartz, Yannick; Pinel, Philippe; Moreno, Antonio; Le Bihan, Denis; Frouin, Vincent

    2017-01-01

    The Brainomics/Localizer database exposes part of the data collected by the in-house Localizer project, which planned to acquire four types of data from volunteer research subjects: anatomical MRI scans, functional MRI data, behavioral and demographic data, and DNA sampling. Over the years, this local project has been collecting such data from hundreds of subjects. We had selected 94 of these subjects for their complete datasets, including all four types of data, as the basis for a prior publication; the Brainomics/Localizer database publishes the data associated with these 94 subjects. Since regulatory rules prevent us from making genetic data available for download, the database serves only anatomical MRI scans, functional MRI data, behavioral and demographic data. To publish this set of heterogeneous data, we use dedicated software based on the open-source CubicWeb semantic web framework. Through genericity in the data model and flexibility in the display of data (web pages, CSV, JSON, XML), CubicWeb helps us expose these complex datasets in original and efficient ways. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. A user-friendly phytoremediation database: creating the searchable database, the users, and the broader implications.

    Science.gov (United States)

    Famulari, Stevie; Witz, Kyla

    2015-01-01

    Designers, students, teachers, gardeners, farmers, landscape architects, architects, engineers, homeowners, and others have uses for the practice of phytoremediation. This research looks at the creation of a phytoremediation database which is designed for ease of use for a non-scientific user, as well as for students in an educational setting ( http://www.steviefamulari.net/phytoremediation ). During 2012, Environmental Artist & Professor of Landscape Architecture Stevie Famulari, with assistance from Kyla Witz, a landscape architecture student, created an online searchable database designed for high public accessibility. The database is a record of research of plant species that aid in the uptake of contaminants, including metals, organic materials, biodiesels & oils, and radionuclides. The database consists of multiple interconnected indexes categorized into common and scientific plant name, contaminant name, and contaminant type. It includes photographs, hardiness zones, specific plant qualities, full citations to the original research, and other relevant information intended to aid those designing with phytoremediation search for potential plants which may be used to address their site's need. The objective of the terminology section is to remove uncertainty for more inexperienced users, and to clarify terms for a more user-friendly experience. Implications of the work, including education and ease of browsing, as well as use of the database in teaching, are discussed.

  9. OGDD (Olive Genetic Diversity Database): a microsatellite markers' genotypes database of worldwide olive trees for cultivar identification and virgin olive oil traceability.

    Science.gov (United States)

    Ben Ayed, Rayda; Ben Hassen, Hanen; Ennouri, Karim; Ben Marzoug, Riadh; Rebai, Ahmed

    2016-01-01

    Olive (Olea europaea), whose importance is mainly due to nutritional and health features, is one of the most economically significant oil-producing trees in the Mediterranean region. Unfortunately, the increasing market demand towards virgin olive oil could often result in its adulteration with less expensive oils, which is a serious problem for the public and quality control evaluators of virgin olive oil. Therefore, to avoid frauds, olive cultivar identification and virgin olive oil authentication have become a major issue for the producers and consumers of quality control in the olive chain. Presently, genetic traceability using SSR is the cost effective and powerful marker technique that can be employed to resolve such problems. However, to identify an unknown monovarietal virgin olive oil cultivar, a reference system has become necessary. Thus, an Olive Genetic Diversity Database (OGDD) (http://www.bioinfo-cbs.org/ogdd/) is presented in this work. It is a genetic, morphologic and chemical database of worldwide olive tree and oil having a double function. In fact, besides being a reference system generated for the identification of unkown olive or virgin olive oil cultivars based on their microsatellite allele size(s), it provides users additional morphological and chemical information for each identified cultivar. Currently, OGDD is designed to enable users to easily retrieve and visualize biologically important information (SSR markers, and olive tree and oil characteristics of about 200 cultivars worldwide) using a set of efficient query interfaces and analysis tools. It can be accessed through a web service from any modern programming language using a simple hypertext transfer protocol call. The web site is implemented in java, JavaScript, PHP, HTML and Apache with all major browsers supported. Database URL: http://www.bioinfo-cbs.org/ogdd/. © The Author(s) 2016. Published by Oxford University Press.

  10. Building a genome database using an object-oriented approach.

    Science.gov (United States)

    Barbasiewicz, Anna; Liu, Lin; Lang, B Franz; Burger, Gertraud

    2002-01-01

    GOBASE is a relational database that integrates data associated with mitochondria and chloroplasts. The most important data in GOBASE, i. e., molecular sequences and taxonomic information, are obtained from the public sequence data repository at the National Center for Biotechnology Information (NCBI), and are validated by our experts. Maintaining a curated genomic database comes with a towering labor cost, due to the shear volume of available genomic sequences and the plethora of annotation errors and omissions in records retrieved from public repositories. Here we describe our approach to increase automation of the database population process, thereby reducing manual intervention. As a first step, we used Unified Modeling Language (UML) to construct a list of potential errors. Each case was evaluated independently, and an expert solution was devised, and represented as a diagram. Subsequently, the UML diagrams were used as templates for writing object-oriented automation programs in the Java programming language.

  11. Research Outputs of England's Hospital Episode Statistics (HES) Database: Bibliometric Analysis.

    Science.gov (United States)

    Chaudhry, Zain; Mannan, Fahmida; Gibson-White, Angela; Syed, Usama; Ahmed, Shirin; Majeed, Azeem

    2017-12-06

    Hospital administrative data, such as those provided by the Hospital Episode Statistics (HES) database in England, are increasingly being used for research and quality improvement. To date, no study has tried to quantify and examine trends in the use of HES for research purposes. To examine trends in the use of HES data for research. Publications generated from the use of HES data were extracted from PubMed and analysed. Publications from 1996 to 2014 were then examined further in the Science Citation Index (SCI) of the Thompson Scientific Institute for Science Information (Web of Science) for details of research specialty area. 520 studies, categorised into 44 specialty areas, were extracted from PubMed. The review showed an increase in publications over the 18-year period with an average of 27 publications per year, however with the majority of output observed in the latter part of the study period. The highest number of publications was in the Health Statistics specialty area. The use of HES data for research is becoming more common. Increase in publications over time shows that researchers are beginning to take advantage of the potential of HES data. Although HES is a valuable database, concerns exist over the accuracy and completeness of the data entered. Clinicians need to be more engaged with HES for the full potential of this database to be harnessed.

  12. SBR-Blood: systems biology repository for hematopoietic cells.

    Science.gov (United States)

    Lichtenberg, Jens; Heuston, Elisabeth F; Mishra, Tejaswini; Keller, Cheryl A; Hardison, Ross C; Bodine, David M

    2016-01-04

    Extensive research into hematopoiesis (the development of blood cells) over several decades has generated large sets of expression and epigenetic profiles in multiple human and mouse blood cell types. However, there is no single location to analyze how gene regulatory processes lead to different mature blood cells. We have developed a new database framework called hematopoietic Systems Biology Repository (SBR-Blood), available online at http://sbrblood.nhgri.nih.gov, which allows user-initiated analyses for cell type correlations or gene-specific behavior during differentiation using publicly available datasets for array- and sequencing-based platforms from mouse hematopoietic cells. SBR-Blood organizes information by both cell identity and by hematopoietic lineage. The validity and usability of SBR-Blood has been established through the reproduction of workflows relevant to expression data, DNA methylation, histone modifications and transcription factor occupancy profiles. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  13. Research reactor records in the INIS database

    International Nuclear Information System (INIS)

    Marinkovic, N.

    2001-01-01

    This report presents a statistical analysis of more than 13,000 records of publications concerned with research and technology in the field of research and experimental reactors which are included in the INIS Bibliographic Database for the period from 1970 to 2001. The main objectives of this bibliometric study were: to make an inventory of research reactor related records in the INIS Database; to provide statistics and scientific indicators for the INIS users, namely science managers, researchers, engineers, operators, scientific editors and publishers, decision-makers in the field of research reactors related subjects; to extract other useful information from the INIS Bibliographic Database about articles published in research reactors research and technology. (author)

  14. An online database of nuclear electromagnetic moments

    International Nuclear Information System (INIS)

    Mertzimekis, T.J.; Stamou, K.; Psaltis, A.

    2016-01-01

    Measurements of nuclear magnetic dipole and electric quadrupole moments are considered quite important for the understanding of nuclear structure both near and far from the valley of stability. The recent advent of radioactive beams has resulted in a plethora of new, continuously flowing, experimental data on nuclear structure – including nuclear moments – which hinders the information management. A new, dedicated, public and user friendly online database ( (http://magneticmoments.info)) has been created comprising experimental data of nuclear electromagnetic moments. The present database supersedes existing printed compilations, including also non-evaluated series of data and relevant meta-data, while putting strong emphasis on bimonthly updates. The scope, features and extensions of the database are reported.

  15. Databases for neurogenetics: introduction, overview, and challenges.

    Science.gov (United States)

    Sobrido, María-Jesús; Cacheiro, Pilar; Carracedo, Angel; Bertram, Lars

    2012-09-01

    The importance for research and clinical utility of mutation databases, as well as the issues and difficulties entailed in their construction, is discussed within the Human Variome Project. While general principles and standards can apply to most human diseases, some specific questions arise when dealing with the nature of genetic neurological disorders. So far, publically accessible mutation databases exist for only about half of the genes causing neurogenetic disorders; and a considerable work is clearly still needed to optimize their content. The current landscape, main challenges, some potential solutions, and future perspectives on genetic databases for disorders of the nervous system are reviewed in this special issue of Human Mutation on neurogenetics. © 2012 Wiley Periodicals, Inc.

  16. JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO

    Science.gov (United States)

    Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.

    2017-12-01

    Introduction: Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As a common property of the human society, JAMSTEC archive is open for public users with scientific/educational purposes [1]. For publicizing its data and samples online, JAMSTEC is operating NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples and cruise/dive observation were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. Maintenance costs of multiple hardware and software were also problematic for performing sustainable services and continuous improvements. Database Integration: In 2017, GANSEKI, COEDO and DARWIN were integrated into DARWIN+ [3]. The update also included implementation of map-search function as a substitute of closed portal site. Major functions of previous systems were incorporated into the new system; users can perform the complex search, by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of variety of additional data types. Data Management: After the DARWIN major update, JAMSTEC data & sample team has been dealing with minor issues of individual sample data/metadata which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are getting available online. Geochemical data of sediment core samples will supposedly be added in the near future. Reference: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/

  17. Databases and Associated Bioinformatic Tools in Studies of Food Allergens, Epitopes and Haptens – a Review

    Directory of Open Access Journals (Sweden)

    Bucholska Justyna

    2018-06-01

    Full Text Available Allergies and/or food intolerances are a growing problem of the modern world. Diffi culties associated with the correct diagnosis of food allergies result in the need to classify the factors causing allergies and allergens themselves. Therefore, internet databases and other bioinformatic tools play a special role in deepening knowledge of biologically-important compounds. Internet repositories, as a source of information on different chemical compounds, including those related to allergy and intolerance, are increasingly being used by scientists. Bioinformatic methods play a signifi cant role in biological and medical sciences, and their importance in food science is increasing. This study aimed at presenting selected databases and tools of bioinformatic analysis useful in research on food allergies, allergens (11 databases, epitopes (7 databases, and haptens (2 databases. It also presents examples of the application of computer methods in studies related to allergies.

  18. Plant databases and data analysis tools

    Science.gov (United States)

    It is anticipated that the coming years will see the generation of large datasets including diagnostic markers in several plant species with emphasis on crop plants. To use these datasets effectively in any plant breeding program, it is essential to have the information available via public database...

  19. The Danish Inguinal Hernia Database

    Directory of Open Access Journals (Sweden)

    Friis-Andersen H

    2016-10-01

    Full Text Available Hans Friis-Andersen1,2, Thue Bisgaard2,3 1Surgical Department, Horsens Regional Hospital, Horsens, Denmark; 2Steering Committee, Danish Hernia Database, 3Surgical Gastroenterological Department 235, Copenhagen University Hospital, Hvidovre, Denmark Aim of database: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Study population: Patients ≥18 years operated for groin hernia. Main variables: Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. Descriptive data: According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3 minute registration time. All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. Results: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015. A total of 49 peer-reviewed national and international publications have been published from the database (June 2015. Conclusion: The Danish Inguinal Hernia Database is fully active monitoring surgical quality and contributes to the national and international surgical society to improve outcome after groin hernia repair. Keywords: nation-wide, recurrence, chronic pain, femoral hernia, surgery, quality improvement

  20. 42 CFR 409.13 - Drugs and biologicals.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 2 2010-10-01 2010-10-01 false Drugs and biologicals. 409.13 Section 409.13 Public... § 409.13 Drugs and biologicals. (a) Except as specified in paragraph (b) of this section, Medicare pays for drugs and biologicals as inpatient hospital or inpatient CAH services only if— (1) They represent...

  1. Hmrbase: a database of hormones and their receptors

    Science.gov (United States)

    Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

    2009-01-01

    Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, Drug

  2. Parallel Worlds of Public and Commercial Bioactive Chemistry Data

    Science.gov (United States)

    2014-01-01

    The availability of structures and linked bioactivity data in databases is powerfully enabling for drug discovery and chemical biology. However, we now review some confounding issues with the divergent expansions of public and commercial sources of chemical structures. These are associated with not only expanding patent extraction but also increasingly large vendor collections amassed via different selection criteria between SciFinder from Chemical Abstracts Service (CAS) and major public sources such as PubChem, ChemSpider, UniChem, and others. These increasingly massive collections may include both real and virtual compounds, as well as so-called prophetic compounds from patents. We address a range of issues raised by the challenges faced resolving the NIH probe compounds. In addition we highlight the confounding of prior-art searching by virtual compounds that could impact the composition of matter patentability of a new medicinal chemistry lead. Finally, we propose some potential solutions. PMID:25415348

  3. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.

  4. Open Geoscience Database

    Science.gov (United States)

    Bashev, A.

    2012-04-01

    Currently there is an enormous amount of various geoscience databases. Unfortunately the only users of the majority of the databases are their elaborators. There are several reasons for that: incompaitability, specificity of tasks and objects and so on. However the main obstacles for wide usage of geoscience databases are complexity for elaborators and complication for users. The complexity of architecture leads to high costs that block the public access. The complication prevents users from understanding when and how to use the database. Only databases, associated with GoogleMaps don't have these drawbacks, but they could be hardly named "geoscience" Nevertheless, open and simple geoscience database is necessary at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and web interface to work with them and now it is accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) in a station with a certain position, associated with metadata: the date when the result was obtained; the type of a station (lake, soil etc); the contributor that sent the result. Each contributor has its own profile, that allows to estimate the reliability of the data. The results can be represented on GoogleMaps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales and each registered user can create the own scale. The results can be also extracted in *.csv file. For both types of representation one could select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Lattitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data

  5. Simple re-instantiation of small databases using cloud computing.

    Science.gov (United States)

    Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

    2013-01-01

    Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

  6. 45 CFR 1356.80 - Scope of the National Youth in Transition Database.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 4 2010-10-01 2010-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...

  7. ARTI refrigerant database

    Energy Technology Data Exchange (ETDEWEB)

    Calm, J.M. [Calm (James M.), Great Falls, VA (United States)

    1998-08-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufactures and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  8. LmSmdB: an integrated database for metabolic and gene regulatory network in Leishmania major and Schistosoma mansoni

    Directory of Open Access Journals (Sweden)

    Priyanka Patel

    2016-03-01

    Full Text Available A database that integrates all the information required for biological processing is essential to be stored in one platform. We have attempted to create one such integrated database that can be a one stop shop for the essential features required to fetch valuable result. LmSmdB (L. major and S. mansoni database is an integrated database that accounts for the biological networks and regulatory pathways computationally determined by integrating the knowledge of the genome sequences of the mentioned organisms. It is the first database of its kind that has together with the network designing showed the simulation pattern of the product. This database intends to create a comprehensive canopy for the regulation of lipid metabolism reaction in the parasite by integrating the transcription factors, regulatory genes and the protein products controlled by the transcription factors and hence operating the metabolism at genetic level. Keywords: L.major, S.mansoni, Regulatory networks, Transcription factors, Database

  9. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.

  10. IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis.

    Science.gov (United States)

    Zhang, Fan; Drabier, Renee

    2012-01-01

    Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics). We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD, http://bioinfo.hsc.unt.edu/ipad), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6) cross-linking of

  11. The Danish Inguinal Hernia database.

    Science.gov (United States)

    Friis-Andersen, Hans; Bisgaard, Thue

    2016-01-01

    To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Patients ≥18 years operated for groin hernia. Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3 minute registration time). All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). The Danish Inguinal Hernia Database is fully active monitoring surgical quality and contributes to the national and international surgical society to improve outcome after groin hernia repair.

  12. Prototyping visual interface for maintenance and supply databases

    OpenAIRE

    Fore, Henry Ray

    1989-01-01

    Approved for public release; distribution is unlimited This research examined the feasibility of providing a visual interface to standard Army Management Information Systems at the unit level. The potential of improving the Human-Machine Interface of unit level maintenance and supply software, such as ULLS (Unit Level Logistics System), is very attractive. A prototype was implemented in GLAD (Graphics Language for Database). GLAD is a graphics object-oriented environment for databases t...

  13. The Danish ventral hernia database

    DEFF Research Database (Denmark)

    Helgstrand, Frederik; Jorgensen, Lars Nannestad

    2016-01-01

    Aim: The Danish Ventral Hernia Database (DVHD) provides national surveillance of current surgical practice and clinical postoperative outcomes. The intention is to reduce postoperative morbidity and hernia recurrence, evaluate new treatment strategies, and facilitate nationwide implementation of ...... of operations and is an excellent tool for observing changes over time, including adjustment of several confounders. This national database registry has impacted on clinical practice in Denmark and led to a high number of scientific publications in recent years.......Aim: The Danish Ventral Hernia Database (DVHD) provides national surveillance of current surgical practice and clinical postoperative outcomes. The intention is to reduce postoperative morbidity and hernia recurrence, evaluate new treatment strategies, and facilitate nationwide implementation...... to the surgical repair are recorded. Data registration is mandatory. Data may be merged with other Danish health registries and information from patient questionnaires or clinical examinations. Descriptive data: More than 37,000 operations have been registered. Data have demonstrated high agreement with patient...

  14. Atlantic Canada's energy research and development website and database

    International Nuclear Information System (INIS)

    2005-01-01

    Petroleum Research Atlantic Canada maintains a website devoted to energy research and development in Atlantic Canada. The site can be viewed on the world wide web at www.energyresearch.ca. It includes a searchable database with information about researchers in Nova Scotia, their projects and published materials on issues related to hydrocarbons, alternative energy technologies, energy efficiency, climate change, environmental impacts and policy. The website also includes links to research funding agencies, external related databases and related energy organizations around the world. Nova Scotia-based users are invited to submit their academic, private or public research to the site. Before being uploaded into the database, a site administrator reviews and processes all new information. Users are asked to identify their areas of interest according to the following research categories: alternative or renewable energy technologies; climate change; coal; computer applications; economics; energy efficiency; environmental impacts; geology; geomatics; geophysics; health and safety; human factors; hydrocarbons; meteorology and oceanology (metocean) activities; petroleum operations in deep and shallow waters; policy; and power generation and supply. The database can be searched 5 ways according to topic, researchers, publication, projects or funding agency. refs., tabs., figs

  15. Ranked retrieval of Computational Biology models.

    Science.gov (United States)

    Henkel, Ron; Endler, Lukas; Peters, Andre; Le Novère, Nicolas; Waltemath, Dagmar

    2010-08-11

    The study of biological systems demands computational support. If targeting a biological problem, the reuse of existing computational models can save time and effort. Deciding for potentially suitable models, however, becomes more challenging with the increasing number of computational models available, and even more when considering the models' growing complexity. Firstly, among a set of potential model candidates it is difficult to decide for the model that best suits ones needs. Secondly, it is hard to grasp the nature of an unknown model listed in a search result set, and to judge how well it fits for the particular problem one has in mind. Here we present an improved search approach for computational models of biological processes. It is based on existing retrieval and ranking methods from Information Retrieval. The approach incorporates annotations suggested by MIRIAM, and additional meta-information. It is now part of the search engine of BioModels Database, a standard repository for computational models. The introduced concept and implementation are, to our knowledge, the first application of Information Retrieval techniques on model search in Computational Systems Biology. Using the example of BioModels Database, it was shown that the approach is feasible and extends the current possibilities to search for relevant models. The advantages of our system over existing solutions are that we incorporate a rich set of meta-information, and that we provide the user with a relevance ranking of the models found for a query. Better search capabilities in model databases are expected to have a positive effect on the reuse of existing models.

  16. Using Online Databases in Corporate Issues Management.

    Science.gov (United States)

    Thomsen, Steven R.

    1995-01-01

    Finds that corporate public relations practitioners felt they were able, using online database and information services, to intercept issues earlier in the "issue cycle" and thus enable their organizations to develop more "proactionary" or "catalytic" issues management repose strategies. (SR)

  17. Large-scale Health Information Database and Privacy Protection*1

    Science.gov (United States)

    YAMAMOTO, Ryuichi

    2016-01-01

    Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA*2 projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients’ medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy

  18. TcruziDB, an Integrated Database, and the WWW Information Server for the Trypanosoma cruzi Genome Project

    Directory of Open Access Journals (Sweden)

    Degrave Wim

    1997-01-01

    Full Text Available Data analysis, presentation and distribution is of utmost importance to a genome project. A public domain software, ACeDB, has been chosen as the common basis for parasite genome databases, and a first release of TcruziDB, the Trypanosoma cruzi genome database, is available by ftp from ftp://iris.dbbm.fiocruz.br/pub/genomedb/TcruziDB as well as versions of the software for different operating systems (ftp://iris.dbbm.fiocruz.br/pub/unixsoft/. Moreover, data originated from the project are available from the WWW server at http://www.dbbm.fiocruz.br. It contains biological and parasitological data on CL Brener, its karyotype, all available T. cruzi sequences from Genbank, data on the EST-sequencing project and on available libraries, a T. cruzi codon table and a listing of activities and participating groups in the genome project, as well as meeting reports. T. cruzi discussion lists (tcruzi-l@iris.dbbm.fiocruz.br and tcgenics@iris.dbbm.fiocruz.br are being maintained for communication and to promote collaboration in the genome project

  19. Proposal for a high-energy nuclear database

    International Nuclear Information System (INIS)

    Brown, D.A.; Vogt, R.

    2006-01-01

    We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac, AGS and SPS to RHIC and LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews. (author)

  20. Proposal for a High Energy Nuclear Database

    International Nuclear Information System (INIS)

    Brown, D A; Vogt, R

    2005-01-01

    The authors propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac, AGS and SPS to RHIC and CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, they propose periodically performing evaluations of the data and summarizing the results in topical reviews

  1. A plant resource and experiment management system based on the Golm Plant Database as a basic tool for omics research

    Directory of Open Access Journals (Sweden)

    Selbig Joachim

    2008-05-01

    Full Text Available Abstract Background For omics experiments, detailed characterisation of experimental material with respect to its genetic features, its cultivation history and its treatment history is a requirement for analyses by bioinformatics tools and for publication needs. Furthermore, meta-analysis of several experiments in systems biology based approaches make it necessary to store this information in a standardised manner, preferentially in relational databases. In the Golm Plant Database System, we devised a data management system based on a classical Laboratory Information Management System combined with web-based user interfaces for data entry and retrieval to collect this information in an academic environment. Results The database system contains modules representing the genetic features of the germplasm, the experimental conditions and the sampling details. In the germplasm module, genetically identical lines of biological material are generated by defined workflows, starting with the import workflow, followed by further workflows like genetic modification (transformation, vegetative or sexual reproduction. The latter workflows link lines and thus create pedigrees. For experiments, plant objects are generated from plant lines and united in so-called cultures, to which the cultivation conditions are linked. Materials and methods for each cultivation step are stored in a separate ACCESS database of the plant cultivation unit. For all cultures and thus every plant object, each cultivation site and the culture's arrival time at a site are logged by a barcode-scanner based system. Thus, for each plant object, all site-related parameters, e.g. automatically logged climate data, are available. These life history data and genetic information for the plant objects are linked to analytical results by the sampling module, which links sample components to plant object identifiers. This workflow uses controlled vocabulary for organs and treatments. Unique

  2. Third workshop on heavy charged particles in biology and medicine

    International Nuclear Information System (INIS)

    Kraft, G.; Grundinger, U.

    1987-07-01

    The book of abstracts contains 67 papers presented at the workshop. Main topics are: Physics, chemistry, DNA, cell biology, cellular and molecular repair, space biology, tumor and tissue biology, predictive assays, cancer therapy, and new projects. Separate entries in the database are prepared for all of these papers. (MG)

  3. Integrated olfactory receptor and microarray gene expression databases

    Directory of Open Access Journals (Sweden)

    Crasto Chiquito J

    2007-06-01

    Full Text Available Abstract Background Gene expression patterns of olfactory receptors (ORs are an important component of the signal encoding mechanism in the olfactory system since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD to house OR gene expression data. ORMD is integrated with the Olfactory Receptor Database (ORDB, which is a key repository of OR gene information. Both databases aim to aid experimental research related to olfaction. Description ORMD is a Web-accessible database that provides a secure data repository for OR microarray experiments. It contains both publicly available and private data; accessing the latter requires authenticated login. The ORMD is designed to allow users to not only deposit gene expression data but also manage their projects/experiments. For example, contributors can choose whether to make their datasets public. For each experiment, users can download the raw data files and view and export the gene expression data. For each OR gene being probed in a microarray experiment, a hyperlink to that gene in ORDB provides access to genomic and proteomic information related to the corresponding olfactory receptor. Individual ORs archived in ORDB are also linked to ORMD, allowing users access to the related microarray gene expression data. Conclusion ORMD serves as a data repository and project management system. It facilitates the study of microarray experiments of gene expression in the olfactory system. In conjunction with ORDB, ORMD integrates gene expression data with the genomic and functional data of ORs, and is thus a useful resource for both olfactory researchers and the public.

  4. The Danish Fetal Medicine Database

    Directory of Open Access Journals (Sweden)

    Ekelund CK

    2016-10-01

    Full Text Available Charlotte Kvist Ekelund,1 Tine Iskov Kopp,2 Ann Tabor,1 Olav Bjørn Petersen3 1Department of Obstetrics, Center of Fetal Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; 2Registry Support Centre (East – Epidemiology and Biostatistics, Research Centre for Prevention and Health, Glostrup, Denmark; 3Fetal Medicine Unit, Aarhus University Hospital, Aarhus Nord, Denmark Aim: The aim of this study is to set up a database in order to monitor the detection rates and false-positive rates of first-trimester screening for chromosomal abnormalities and prenatal detection rates of fetal malformations in Denmark. Study population: Pregnant women with a first or second trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units' Astraia databases to the central database via web service. Information about outcome of pregnancy (miscarriage, termination, live birth, or stillbirth is received from the National Patient Register and National Birth Register and linked via the Danish unique personal registration number. Furthermore, results of all pre- and postnatal chromosome analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database is valuable to assess the performance at a regional level and to compare Danish performance with international results at a national level. Keywords: prenatal screening, nuchal translucency, fetal malformations, chromosomal abnormalities

  5. The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease.

    Science.gov (United States)

    Eppig, Janan T; Blake, Judith A; Bult, Carol J; Kadin, James A; Richardson, Joel E

    2015-01-01

    The Mouse Genome Database (MGD, http://www.informatics.jax.org) serves the international biomedical research community as the central resource for integrated genomic, genetic and biological data on the laboratory mouse. To facilitate use of mouse as a model in translational studies, MGD maintains a core of high-quality curated data and integrates experimentally and computationally generated data sets. MGD maintains a unified catalog of genes and genome features, including functional RNAs, QTL and phenotypic loci. MGD curates and provides functional and phenotype annotations for mouse genes using the Gene Ontology and Mammalian Phenotype Ontology. MGD integrates phenotype data and associates mouse genotypes to human diseases, providing critical mouse-human relationships and access to repositories holding mouse models. MGD is the authoritative source of nomenclature for genes, genome features, alleles and strains following guidelines of the International Committee on Standardized Genetic Nomenclature for Mice. A new addition to MGD, the Human-Mouse: Disease Connection, allows users to explore gene-phenotype-disease relationships between human and mouse. MGD has also updated search paradigms for phenotypic allele attributes, incorporated incidental mutation data, added a module for display and exploration of genes and microRNA interactions and adopted the JBrowse genome browser. MGD resources are freely available to the scientific community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Tropical Freshwater Biology

    African Journals Online (AJOL)

    Tropical Freshwater Biology promotes the publication of scientific contributions in the field of freshwater biology in the tropical and subtropical regions of the world. One issue is published annually but this number may be increased. Original research papers and short communications on any aspect of tropical freshwater ...

  7. Neutron cross-sections database for amino acids and proteins analysis

    Energy Technology Data Exchange (ETDEWEB)

    Voi, Dante L.; Ferreira, Francisco de O.; Nunes, Rogerio Chaffin, E-mail: dante@ien.gov.br, E-mail: fferreira@ien.gov.br, E-mail: Chaffin@ien.gov.br [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Rocha, Helio F. da, E-mail: hrocha@gbl.com.br [Universidade Federal do Rio de Janeiro (IPPMG/UFRJ), Rio de Janeiro, RJ (Brazil). Instituto de Pediatria

    2015-07-01

    Biological materials may be studied using neutrons as an unconventional tool of analysis. Dynamics and structures data can be obtained for amino acids, protein and others cellular components by neutron cross sections determinations especially for applications in nuclear purity and conformation analysis. The instrument used for this is the crystal spectrometer of the Instituto de Engenharia Nuclear (IEN-CNEN-RJ), the only one in Latin America that uses neutrons for this type of analyzes and it is installed in one of the reactor Argonauta irradiation channels. The experimentally values obtained are compared with calculated values using literature data with a rigorous analysis of the chemical composition, conformation and molecular structure analysis of the materials. A neutron cross-section database was constructed to assist in determining molecular dynamic, structure and formulae of biological materials. The database contains neutron cross-sections values of all amino acids, chemical elements, molecular groups, auxiliary radicals, as well as values of constants and parameters necessary for the analysis. An unprecedented analytical procedure was developed using the neutron cross section parceling and grouping method for data manipulation. This database is a result of measurements obtained from twenty amino acids that were provided by different manufactories and are used in oral administration in hospital individuals for nutritional applications. It was also constructed a small data file of compounds with different molecular groups including carbon, nitrogen, sulfur and oxygen, all linked to hydrogen atoms. A review of global and national scene in the acquisition of neutron cross sections data, the formation of libraries and the application of neutrons for analyzing biological materials is presented. This database has further application in protein analysis and the neutron cross-section from the insulin was estimated. (author)

  8. Neutron cross-sections database for amino acids and proteins analysis

    International Nuclear Information System (INIS)

    Voi, Dante L.; Ferreira, Francisco de O.; Nunes, Rogerio Chaffin; Rocha, Helio F. da

    2015-01-01

    Biological materials may be studied using neutrons as an unconventional tool of analysis. Dynamics and structures data can be obtained for amino acids, protein and others cellular components by neutron cross sections determinations especially for applications in nuclear purity and conformation analysis. The instrument used for this is the crystal spectrometer of the Instituto de Engenharia Nuclear (IEN-CNEN-RJ), the only one in Latin America that uses neutrons for this type of analyzes and it is installed in one of the reactor Argonauta irradiation channels. The experimentally values obtained are compared with calculated values using literature data with a rigorous analysis of the chemical composition, conformation and molecular structure analysis of the materials. A neutron cross-section database was constructed to assist in determining molecular dynamic, structure and formulae of biological materials. The database contains neutron cross-sections values of all amino acids, chemical elements, molecular groups, auxiliary radicals, as well as values of constants and parameters necessary for the analysis. An unprecedented analytical procedure was developed using the neutron cross section parceling and grouping method for data manipulation. This database is a result of measurements obtained from twenty amino acids that were provided by different manufactories and are used in oral administration in hospital individuals for nutritional applications. It was also constructed a small data file of compounds with different molecular groups including carbon, nitrogen, sulfur and oxygen, all linked to hydrogen atoms. A review of global and national scene in the acquisition of neutron cross sections data, the formation of libraries and the application of neutrons for analyzing biological materials is presented. This database has further application in protein analysis and the neutron cross-section from the insulin was estimated. (author)

  9. Databases on safety issues for WWER and RBMK reactors. Users' manual. A publication of the extrabudgetary programme on the safety of WWER and RBMK nuclear power plants

    International Nuclear Information System (INIS)

    1996-04-01

    At the beginning of the IAEA Extrabudgetary Programme on the safety of WWER reactors a great number of findings and recommendations (safety items) were collected as a result of design review and safety review missions of the WWER-440/230 type reactors. On the basis of these findings a technical database containing more than 1300 records was established to support the consolidation of the information obtained and to help in identification of safety issues. After the scope of the WWER extrabudgetary programme was extended similar data sets were prepared for the WWER-440/213, WWER-1000 and RBMK nuclear power plants. This publication describes the structure of the databases on safety issues of WWER and RBMK NPPs, the information sources used in the databases and interrogation capabilities for users to obtain the necessary information. 14 refs, 9 figs, 5 tabs

  10. Using Bibliographic Knowledge for Ranking in Scientific Publication Databases

    CERN Document Server

    Vesely, Martin; Le Meur, Jean-Yves

    2008-01-01

    Document ranking for scientific publications involves a variety of specialized resources (e.g. author or citation indexes) that are usually difficult to use within standard general purpose search engines that usually operate on large-scale heterogeneous document collections for which the required specialized resources are not always available for all the documents present in the collections. Integrating such resources into specialized information retrieval engines is therefore important to cope with community-specific user expectations that strongly influence the perception of relevance within the considered community. In this perspective, this paper extends the notion of ranking with various methods exploiting different types of bibliographic knowledge that represent a crucial resource for measuring the relevance of scientific publications. In our work, we experimentally evaluated the adequacy of two such ranking methods (one based on freshness, i.e. the publication date, and the other on a novel index, the ...

  11. The UMIST database for astrochemistry 2006

    Science.gov (United States)

    Woodall, J.; Agúndez, M.; Markwick-Kemper, A. J.; Millar, T. J.

    2007-05-01

    Aims:We present a new version of the UMIST Database for Astrochemistry, the fourth such version to be released to the public. The current version contains some 4573 binary gas-phase reactions, an increase of 10% from the previous (1999) version, among 420 species, of which 23 are new to the database. Methods: Major updates have been made to ion-neutral reactions, neutral-neutral reactions, particularly at low temperature, and dissociative recombination reactions. We have included for the first time the interstellar chemistry of fluorine. In addition to the usual database, we have also released a reaction set in which the effects of dipole-enhanced ion-neutral rate coefficients are included. Results: These two reactions sets have been used in a dark cloud model and the results of these models are presented and discussed briefly. The database and associated software are available on the World Wide Web at www.udfa.net. Tables 1, 2, 4 and 9 are only available in electronic form at http://www.aanda.org

  12. Benchmarking database performance for genomic data.

    Science.gov (United States)

    Khushi, Matloob

    2015-06-01

    Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing various genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management, however, there is no comprehensive database built-in algorithm at present to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm pair-wise, overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1).Inc. © 2015 Wiley Periodicals, Inc.

  13. IEEE Conference Publications in Libraries.

    Science.gov (United States)

    Johnson, Karl E.

    1984-01-01

    Conclusions of surveys (63 libraries, OCLC database, University of Rhode Island users) assessing handling of Institute of Electrical and Electronics Engineers (IEEE) conference publications indicate that most libraries fully catalog these publications using LC cataloging, and library patrons frequently require series access to publications. Eight…

  14. Word selection affects perceptions of synthetic biology

    Directory of Open Access Journals (Sweden)

    Tonidandel Scott

    2011-07-01

    Full Text Available Abstract Members of the synthetic biology community have discussed the significance of word selection when describing synthetic biology to the general public. In particular, many leaders proposed the word "create" was laden with negative connotations. We found that word choice and framing does affect public perception of synthetic biology. In a controlled experiment, participants perceived synthetic biology more negatively when "create" was used to describe the field compared to "construct" (p = 0.008. Contrary to popular opinion among synthetic biologists, however, low religiosity individuals were more influenced negatively by the framing manipulation than high religiosity people. Our results suggest that synthetic biologists directly influence public perception of their field through avoidance of the word "create".

  15. Risoe publication activities in 1998

    International Nuclear Information System (INIS)

    Bennov, Solvejg

    1999-04-01

    The report contains a list of references to the scientific and technical journal articles, books, reports, lectures published in full text, and to publications for a broader readership authored by researchers at Risoe National Laboratory and published in 1998. If the publication mentioned in the reference is electronically available the link to the web-address is added. The references are organised according to the programme areas of Risoe. The text is introduced by total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, institutions with which Risoe has published the largest number of articles, and journals in which Risoe has published most articles. The data are derived from Risoe's in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. (au)

  16. Risoe publication activities in 1998

    Energy Technology Data Exchange (ETDEWEB)

    Bennov, Solvejg [ed.

    1999-04-01

    The report contains a list of references to the scientific and technical journal articles, books, reports, lectures published in full text, and to publications for a broader readership authored by researchers at Risoe National Laboratory and published in 1998. If the publication mentioned in the reference is electronically available the link to the web-address is added. The references are organised according to the programme areas of Risoe. The text is introduced by total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, institutions with which Risoe has published the largest number of articles, and journals in which Risoe has published most articles. The data are derived from Risoe`s in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. (au)

  17. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  18. Integrating Biological Perspectives:. a Quantum Leap for Microarray Expression Analysis

    Science.gov (United States)

    Wanke, Dierk; Kilian, Joachim; Bloss, Ulrich; Mangelsen, Elke; Supper, Jochen; Harter, Klaus; Berendzen, Kenneth W.

    2009-02-01

    Biologists and bioinformatic scientists cope with the analysis of transcript abundance and the extraction of meaningful information from microarray expression data. By exploiting biological information accessible in public databases, we try to extend our current knowledge over the plant model organism Arabidopsis thaliana. Here, we give two examples of increasing the quality of information gained from large scale expression experiments by the integration of microarray-unrelated biological information: First, we utilize Arabidopsis microarray data to demonstrate that expression profiles are usually conserved between orthologous genes of different organisms. In an initial step of the analysis, orthology has to be inferred unambiguously, which then allows comparison of expression profiles between orthologs. We make use of the publicly available microarray expression data of Arabidopsis and barley, Hordeum vulgare. We found a generally positive correlation in expression trajectories between true orthologs although both organisms are only distantly related in evolutionary time scale. Second, extracting clusters of co-regulated genes implies similarities in transcriptional regulation via similar cis-regulatory elements (CREs). Vice versa approaches, where co-regulated gene clusters are found by investigating on CREs were not successful in general. Nonetheless, in some cases the presence of CREs in a defined position, orientation or CRE-combinations is positively correlated with co-regulated gene clusters. Here, we make use of genes involved in the phenylpropanoid biosynthetic pathway, to give one positive example for this approach.

  19. YMDB: the Yeast Metabolome Database

    Science.gov (United States)

    Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.

    2012-01-01

    The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cervesiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855

  20. The scientific production in health and biological sciences of the top 20 Brazilian universities

    Directory of Open Access Journals (Sweden)

    R. Zorzetto

    2006-12-01

    Full Text Available Brazilian scientific output exhibited a 4-fold increase in the last two decades because of the stability of the investment in research and development activities and of changes in the policies of the main funding agencies. Most of this production is concentrated in public universities and research institutes located in the richest part of the country. Among all areas of knowledge, the most productive are Health and Biological Sciences. During the 1998-2002 period these areas presented heterogeneous growth ranging from 4.5% (Pharmacology to 191% (Psychiatry, with a median growth rate of 47.2%. In order to identify and rank the 20 most prolific institutions in these areas, searches were made in three databases (DataCAPES, ISI and MEDLINE which permitted the identification of 109,507 original articles produced by the 592 Graduate Programs in Health and Biological Sciences offered by 118 public universities and research institutes. The 20 most productive centers, ranked according to the total number of ISI-indexed articles published during the 1998-2003 period, produced 78.7% of the papers in these areas and are strongly concentrated in the Southern part of the country, mainly in São Paulo State.

  1. PlantNATsDB: a comprehensive database of plant natural antisense transcripts.

    Science.gov (United States)

    Chen, Dijun; Yuan, Chunhui; Zhang, Jian; Zhang, Zhao; Bai, Lin; Meng, Yijun; Chen, Ling-Ling; Chen, Ming

    2012-01-01

    Natural antisense transcripts (NATs), as one type of regulatory RNAs, occur prevalently in plant genomes and play significant roles in physiological and pathological processes. Although their important biological functions have been reported widely, a comprehensive database is lacking up to now. Consequently, we constructed a plant NAT database (PlantNATsDB) involving approximately 2 million NAT pairs in 69 plant species. GO annotation and high-throughput small RNA sequencing data currently available were integrated to investigate the biological function of NATs. PlantNATsDB provides various user-friendly web interfaces to facilitate the presentation of NATs and an integrated, graphical network browser to display the complex networks formed by different NATs. Moreover, a 'Gene Set Analysis' module based on GO annotation was designed to dig out the statistical significantly overrepresented GO categories from the specific NAT network. PlantNATsDB is currently the most comprehensive resource of NATs in the plant kingdom, which can serve as a reference database to investigate the regulatory function of NATs. The PlantNATsDB is freely available at http://bis.zju.edu.cn/pnatdb/.

  2. Factors impacting time to acceptance and publication for peer-reviewed publications.

    Science.gov (United States)

    Toroser, Dikran; Carlson, Janice; Robinson, Micah; Gegner, Julie; Girard, Victoria; Smette, Lori; Nilsen, Jon; O'Kelly, James

    2017-07-01

    Timely publication of data is important for the medical community and provides a valuable contribution to data disclosure. The objective of this study was to identify and evaluate times to acceptance and publication for peer-reviewed manuscripts, reviews, and letters to the editor. Key publication metrics for published manuscripts, reviews, and letters to the editor were identified by eight Amgen publications professionals. Data for publications submitted between 1 January 2013 and 1 November 2015 were extracted from a proprietary internal publication-tracking database. Variables included department initiating the study, publication type, number of submissions per publication, and the total number of weeks from first submission to acceptance, online publication, and final publication. A total of 337 publications were identified, of which 300 (89%) were manuscripts. Time from submission to acceptance and publication was generally similar between clinical and real-world evidence (e.g. observational and health economics studies) publications. Median (range) time from first submission to acceptance was 23.4 (0.2-226.2) weeks. Median (range) time from first submission to online (early-release) publication was 29.7 (2.4-162.6) weeks. Median (range) time from first submission to final (print) publication was 36.2 (2.8-230.8) weeks. Time from first submission to acceptance, online publication, and final publication increased accordingly with number of submissions required for acceptance, with similar times noted between each subsequent submission. Analysis of a single-company publication database showed that the median time for manuscripts to be fully published after initial submission was 36.2 weeks, and time to publication increased accordingly with the number of submissions. Causes for multiple submissions and time from clinical trial completion to first submission were not assessed; these were limitations of the study. Nonetheless, publication planners should consider

  3. An online model composition tool for system biology models.

    Science.gov (United States)

    Coskun, Sarp A; Cicek, A Ercument; Lai, Nicola; Dash, Ranjan K; Ozsoyoglu, Z Meral; Ozsoyoglu, Gultekin

    2013-09-05

    There are multiple representation formats for Systems Biology computational models, and the Systems Biology Markup Language (SBML) is one of the most widely used. SBML is used to capture, store, and distribute computational models by Systems Biology data sources (e.g., the BioModels Database) and researchers. Therefore, there is a need for all-in-one web-based solutions that support advance SBML functionalities such as uploading, editing, composing, visualizing, simulating, querying, and browsing computational models. We present the design and implementation of the Model Composition Tool (Interface) within the PathCase-SB (PathCase Systems Biology) web portal. The tool helps users compose systems biology models to facilitate the complex process of merging systems biology models. We also present three tools that support the model composition tool, namely, (1) Model Simulation Interface that generates a visual plot of the simulation according to user's input, (2) iModel Tool as a platform for users to upload their own models to compose, and (3) SimCom Tool that provides a side by side comparison of models being composed in the same pathway. Finally, we provide a web site that hosts BioModels Database models and a separate web site that hosts SBML Test Suite models. Model composition tool (and the other three tools) can be used with little or no knowledge of the SBML document structure. For this reason, students or anyone who wants to learn about systems biology will benefit from the described functionalities. SBML Test Suite models will be a nice starting point for beginners. And, for more advanced purposes, users will able to access and employ models of the BioModels Database as well.

  4. AllergenOnline: A peer-reviewed, curated allergen database to assess novel food proteins for potential cross-reactivity.

    Science.gov (United States)

    Goodman, Richard E; Ebisawa, Motohiro; Ferreira, Fatima; Sampson, Hugh A; van Ree, Ronald; Vieths, Stefan; Baumert, Joseph L; Bohle, Barbara; Lalithambika, Sreedevi; Wise, John; Taylor, Steve L

    2016-05-01

    Increasingly regulators are demanding evaluation of potential allergenicity of foods prior to marketing. Primary risks are the transfer of allergens or potentially cross-reactive proteins into new foods. AllergenOnline was developed in 2005 as a peer-reviewed bioinformatics platform to evaluate risks of new dietary proteins in genetically modified organisms (GMO) and novel foods. The process used to identify suspected allergens and evaluate the evidence of allergenicity was refined between 2010 and 2015. Candidate proteins are identified from the NCBI database using keyword searches, the WHO/IUIS nomenclature database and peer reviewed publications. Criteria to classify proteins as allergens are described. Characteristics of the protein, the source and human subjects, test methods and results are evaluated by our expert panel and archived. Food, inhalant, salivary, venom, and contact allergens are included. Users access allergen sequences through links to the NCBI database and relevant references are listed online. Version 16 includes 1956 sequences from 778 taxonomic-protein groups that are accepted with evidence of allergic serum IgE-binding and/or biological activity. AllergenOnline provides a useful peer-reviewed tool for identifying the primary potential risks of allergy for GMOs and novel foods based on criteria described by the Codex Alimentarius Commission (2003). © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Database Description - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database Database Description General information of database Database... name Yeast Interacting Proteins Database Alternative name - DOI 10.18908/lsdba.nbdc00742-000 Creator C...-ken 277-8561 Tel: +81-4-7136-3989 FAX: +81-4-7136-3979 E-mail : Database classif...s cerevisiae Taxonomy ID: 4932 Database description Information on interactions and related information obta...l Acad Sci U S A. 2001 Apr 10;98(8):4569-74. Epub 2001 Mar 13. External Links: Original website information Database

  6. Development of a national, dynamic reservoir-sedimentation database

    Science.gov (United States)

    Gray, J.R.; Bernard, J.M.; Stewart, D.W.; McFaul, E.J.; Laurent, K.W.; Schwarz, G.E.; Stinson, J.T.; Jonas, M.M.; Randle, T.J.; Webb, J.W.

    2010-01-01

    The importance of dependable, long-term water supplies, coupled with the need to quantify rates of capacity loss of the Nation’s re servoirs due to sediment deposition, were the most compelling reasons for developing the REServoir- SEDimentation survey information (RESSED) database and website. Created under the auspices of the Advisory Committee on Water Information’s Subcommittee on Sedimenta ion by the U.S. Geological Survey and the Natural Resources Conservation Service, the RESSED database is the most comprehensive compilation of data from reservoir bathymetric and dry-basin surveys in the United States. As of March 2010, the database, which contains data compiled on the 1950s vintage Soil Conservation Service’s Form SCS-34 data sheets, contained results from 6,616 surveys on 1,823 reservoirs in the United States and two surveys on one reservoir in Puerto Rico. The data span the period 1755–1997, with 95 percent of the surveys performed from 1930–1990. The reservoir surface areas range from sub-hectare-scale farm ponds to 658 km2 Lake Powell. The data in the RESSED database can be useful for a number of purposes, including calculating changes in reservoir-storage characteristics, quantifying sediment budgets, and estimating erosion rates in a reservoir’s watershed. The March 2010 version of the RESSED database has a number of deficiencies, including a cryptic and out-of-date database architecture; some geospatial inaccuracies (although most have been corrected); other data errors; an inability to store all data in a readily retrievable manner; and an inability to store all data types that currently exist. Perhaps most importantly, the March 2010 version of RESSED database provides no publically available means to submit new data and corrections to existing data. To address these and other deficiencies, the Subcommittee on Sedimentation, through the U.S. Geological Survey and the U.S. Army Corps of Engineers, began a collaborative project in

  7. GigaDB: announcing the GigaScience database

    Directory of Open Access Journals (Sweden)

    Sneddon Tam P

    2012-07-01

    Full Text Available Abstract With the launch of GigaScience journal, here we provide insight into the accompanying database GigaDB, which allows the integration of manuscript publication with supporting data and tools. Reinforcing and upholding GigaScience’s goals to promote open-data and reproducibility of research, GigaDB also aims to provide a home, when a suitable public repository does not exist, for the supporting data or tools featured in the journal and beyond.

  8. Open TG-GATEs: a large-scale toxicogenomics database

    Science.gov (United States)

    Igarashi, Yoshinobu; Nakatsu, Noriyuki; Yamashita, Tomoya; Ono, Atsushi; Ohno, Yasuo; Urushidani, Tetsuro; Yamada, Hiroshi

    2015-01-01

    Toxicogenomics focuses on assessing the safety of compounds using gene expression profiles. Gene expression signatures from large toxicogenomics databases are expected to perform better than small databases in identifying biomarkers for the prediction and evaluation of drug safety based on a compound's toxicological mechanisms in animal target organs. Over the past 10 years, the Japanese Toxicogenomics Project consortium (TGP) has been developing a large-scale toxicogenomics database consisting of data from 170 compounds (mostly drugs) with the aim of improving and enhancing drug safety assessment. Most of the data generated by the project (e.g. gene expression, pathology, lot number) are freely available to the public via Open TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System). Here, we provide a comprehensive overview of the database, including both gene expression data and metadata, with a description of experimental conditions and procedures used to generate the database. Open TG-GATEs is available from http://toxico.nibio.go.jp/english/index.html. PMID:25313160

  9. RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database

    Directory of Open Access Journals (Sweden)

    Andronescu Mirela

    2008-08-01

    Full Text Available Abstract Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to which extent and with which probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.

  10. GeneLab: A Systems Biology Platform for Spaceflight Omics Data

    Science.gov (United States)

    Reinsch, Sigrid S.; Lai, San-Huei; Chen, Rick; Thompson, Terri; Berrios, Daniel; Fogle, Homer; Marcu, Oana; Timucin, Linda; Chakravarty, Kaushik; Coughlan, Joseph

    2015-01-01

    NASA's mission includes expanding our understanding of biological systems to improve life on Earth and to enable long-duration human exploration of space. Resources to support large numbers of spaceflight investigations are limited. NASA's GeneLab project is maximizing the science output from these experiments by: (1) developing a unique public bioinformatics database that includes space bioscience relevant "omics" data (genomics, transcriptomics, proteomics, and metabolomics) and experimental metadata; (2) partnering with NASA-funded flight experiments through bio-sample sharing or sample augmentation to expedite omics data input to the GeneLab database; and (3) developing community-driven reference flight experiments. The first database, GeneLab Data System Version 1.0, went online in April 2015. V1.0 contains numerous flight datasets and has search and download capabilities. Version 2.0 will be released in 2016 and will link to analytic tools. In 2015 Genelab partnered with two Biological Research in Canisters experiments (BBRIC-19 and BRIC-20) which examine responses of Arabidopsis thaliana to spaceflight. GeneLab also partnered with Rodent Research-1 (RR1), the maiden flight to test the newly developed rodent habitat. GeneLab developed protocols for maxiumum yield of RNA, DNA and protein from precious RR-1 tissues harvested and preserved during the SpaceX-4 mission, as well as from tissues from mice that were frozen intact during spaceflight and later dissected. GeneLab is establishing partnerships with at least three planned flights for 2016. Organism-specific nationwide Science Definition Teams (SDTs) will define future GeneLab dedicated missions and ensure the broader scientific impact of the GeneLab missions. GeneLab ensures prompt release and open access to all high-throughput omics data from spaceflight and ground-based simulations of microgravity and radiation. Overall, GeneLab will facilitate the generation and query of parallel multi-omics data, and

  11. Cameroon Journal of Experimental Biology

    African Journals Online (AJOL)

    The Cameroon Journal of Experimental Biology is the official journal of the Cameroon Forum for Biological Sciences (CAFOBIOS). It is an interdisciplinary journal for the publication of original research papers, short communications and review articles in all fields of experimental biology including biochemistry, physiology, ...

  12. Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data

    Science.gov (United States)

    Westbrook, John D; Feng, Zukang; Persikova, Irina; Sala, Raul; Sen, Sanchayita; Berrisford, John M; Swaminathan, G Jawahar; Oldfield, Thomas J; Gutmanas, Aleksandras; Igarashi, Reiko; Armstrong, David R; Baskaran, Kumaran; Chen, Li; Chen, Minyu; Clark, Alice R; Di Costanzo, Luigi; Dimitropoulos, Dimitris; Gao, Guanghua; Ghosh, Sutapa; Gore, Swanand; Guranovic, Vladimir; Hendrickx, Pieter M S; Hudson, Brian P; Ikegawa, Yasuyo; Kengaku, Yumiko; Lawson, Catherine L; Liang, Yuhe; Mak, Lora; Mukhopadhyay, Abhik; Narayanan, Buvaneswari; Nishiyama, Kayoko; Patwardhan, Ardan; Sahni, Gaurav; Sanz-García, Eduardo; Sato, Junko; Sekharan, Monica R; Shao, Chenghua; Smart, Oliver S; Tan, Lihua; van Ginkel, Glen; Yang, Huanwang; Zhuravleva, Marina A; Markley, John L; Nakamura, Haruki; Kurisu, Genji; Kleywegt, Gerard J; Velankar, Sameer; Berman, Helen M; Burley, Stephen K

    2018-01-01

    Abstract The Protein Data Bank (PDB) is the single global repository for experimentally determined 3D structures of biological macromolecules and their complexes with ligands. The worldwide PDB (wwPDB) is the international collaboration that manages the PDB archive according to the FAIR principles: Findability, Accessibility, Interoperability and Reusability. The wwPDB recently developed OneDep, a unified tool for deposition, validation and biocuration of structures of biological macromolecules. All data deposited to the PDB undergo critical review by wwPDB Biocurators. This article outlines the importance of biocuration for structural biology data deposited to the PDB and describes wwPDB biocuration processes and the role of expert Biocurators in sustaining a high-quality archive. Structural data submitted to the PDB are examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources and validated for scientific/technical accuracy. We illustrate how biocuration is integral to PDB data archiving, as it facilitates accurate, consistent and comprehensive representation of biological structure data, allowing efficient and effective usage by research scientists, educators, students and the curious public worldwide. Database URL: https://www.wwpdb.org/ PMID:29688351

  13. Update History of This Database - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Update History of This Database Date Update contents 2014/05/07 The co...ntact information is corrected. The features and manner of utilization of the database are corrected. 2014/02/04 Trypanosomes Databas...e English archive site is opened. 2011/04/04 Trypanosomes Database ( http://www.tan...paku.org/tdb/ ) is opened. About This Database Database Description Download Lice...nse Update History of This Database Site Policy | Contact Us Update History of This Database - Trypanosomes Database | LSDB Archive ...

  14. CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

    Directory of Open Access Journals (Sweden)

    Schmidt Bertil

    2010-04-01

    Full Text Available Abstract Background Due to its high sensitivity, the Smith-Waterman algorithm is widely used for biological database searches. Unfortunately, the quadratic time complexity of this algorithm makes it highly time-consuming. The exponential growth of biological databases further deteriorates the situation. To accelerate this algorithm, many efforts have been made to develop techniques in high performance architectures, especially the recently emerging many-core architectures and their associated programming models. Findings This paper describes the latest release of the CUDASW++ software, CUDASW++ 2.0, which makes new contributions to Smith-Waterman protein database searches using compute unified device architecture (CUDA. A parallel Smith-Waterman algorithm is proposed to further optimize the performance of CUDASW++ 1.0 based on the single instruction, multiple thread (SIMT abstraction. For the first time, we have investigated a partitioned vectorized Smith-Waterman algorithm using CUDA based on the virtualized single instruction, multiple data (SIMD abstraction. The optimized SIMT and the partitioned vectorized algorithms were benchmarked, and remarkably, have similar performance characteristics. CUDASW++ 2.0 achieves performance improvement over CUDASW++ 1.0 as much as 1.74 (1.72 times using the optimized SIMT algorithm and up to 1.77 (1.66 times using the partitioned vectorized algorithm, with a performance of up to 17 (30 billion cells update per second (GCUPS on a single-GPU GeForce GTX 280 (dual-GPU GeForce GTX 295 graphics card. Conclusions CUDASW++ 2.0 is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant performance improvement over CUDASW++ 1.0 using either the optimized SIMT algorithm or the partitioned vectorized algorithm for Smith-Waterman protein database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.

  15. The PREDICTS database: a global database of how local terrestrial biodiversity responds to human impacts

    Science.gov (United States)

    Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Senior, Rebecca A; Bennett, Dominic J; Booth, Hollie; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; White, Hannah J; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Ancrenaz, Marc; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Báldi, András; Banks, John E; Barlow, Jos; Batáry, Péter; Bates, Adam J; Bayne, Erin M; Beja, Pedro; Berg, Åke; Berry, Nicholas J; Bicknell, Jake E; Bihn, Jochen H; Böhning-Gaese, Katrin; Boekhout, Teun; Boutin, Céline; Bouyer, Jérémy; Brearley, Francis Q; Brito, Isabel; Brunet, Jörg; Buczkowski, Grzegorz; Buscardo, Erika; Cabra-García, Jimmy; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Carrijo, Tiago F; Carvalho, Anelena L; Castro, Helena; Castro-Luna, Alejandro A; Cerda, Rolando; Cerezo, Alexis; Chauvat, Matthieu; Clarke, Frank M; Cleary, Daniel F R; Connop, Stuart P; D'Aniello, Biagio; da Silva, Pedro Giovâni; Darvill, Ben; Dauber, Jens; Dejean, Alain; Diekötter, Tim; Dominguez-Haydar, Yamileth; Dormann, Carsten F; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Elek, Zoltán; Entling, Martin H; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Ficetola, Gentile F; Filgueiras, Bruno K C; Fonte, Steven J; Fraser, Lauchlan H; Fukuda, Daisuke; Furlani, Dario; Ganzhorn, Jörg U; Garden, Jenni G; Gheler-Costa, Carla; Giordani, Paolo; Giordano, Simonetta; Gottschalk, Marco S; Goulson, Dave; Gove, Aaron D; Grogan, James; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hawes, Joseph E; Hébert, Christian; Helden, Alvin J; Henden, John-André; Hernández, Lionel; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Horgan, Finbarr G; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Jonsell, Mats; Jung, Thomas S; Kapoor, Vena; Kati, Vassiliki; Katovai, Eric; Kessler, Michael; Knop, Eva; Kolb, Annette; Kőrösi, Ádám; Lachat, Thibault; Lantschner, Victoria; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Letcher, Susan G; Littlewood, Nick A; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Marin-Spiotta, Erika; Marshall, E J P; Martínez, Eliana; Mayfield, Margaret M; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Naidoo, Robin; Nakamura, Akihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Neuschulz, Eike L; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Nöske, Nicole M; O'Dea, Niall; Oduro, William; Ofori-Boateng, Caleb; Oke, Chris O; Osgathorpe, Lynne M; Paritsis, Juan; Parra-H, Alejandro; Pelegrin, Nicolás; Peres, Carlos A; Persson, Anna S; Petanidou, Theodora; Phalan, Ben; Philips, T Keith; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; Ribeiro, Danilo B; Richardson, Barbara A; Richardson, Michael J; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rosselli, Loreta; Rossiter, Stephen J; Roulston, T'ai H; Rousseau, Laurent; Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Samnegård, Ulrika; Schüepp, Christof; Schweiger, Oliver; Sedlock, Jodi L; Shahabuddin, Ghazala; Sheil, Douglas; Silva, Fernando A B; Slade, Eleanor M; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Stout, Jane C; Struebig, Matthew J; Sung, Yik-Hei; Threlfall, Caragh G; Tonietto, Rebecca; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; Tylianakis, Jason M; Vanbergen, Adam J; Vassilev, Kiril; Verboven, Hans A F; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Walker, Tony R; Wang, Yanping; Watling, James I; Wells, Konstans; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Woodcock, Ben A; Yu, Douglas W; Zaitsev, Andrey S; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy

    2014-01-01

    Biodiversity continues to decline in the face of increasing anthropogenic pressures such as habitat destruction, exploitation, pollution and introduction of alien species. Existing global databases of species’ threat status or population time series are dominated by charismatic species. The collation of datasets with broad taxonomic and biogeographic extents, and that support computation of a range of biodiversity indicators, is necessary to enable better understanding of historical declines and to project – and avert – future declines. We describe and assess a new database of more than 1.6 million samples from 78 countries representing over 28,000 species, collated from existing spatial comparisons of local-scale biodiversity exposed to different intensities and types of anthropogenic pressures, from terrestrial sites around the world. The database contains measurements taken in 208 (of 814) ecoregions, 13 (of 14) biomes, 25 (of 35) biodiversity hotspots and 16 (of 17) megadiverse countries. The database contains more than 1% of the total number of all species described, and more than 1% of the described species within many taxonomic groups – including flowering plants, gymnosperms, birds, mammals, reptiles, amphibians, beetles, lepidopterans and hymenopterans. The dataset, which is still being added to, is therefore already considerably larger and more representative than those used by previous quantitative models of biodiversity trends and responses. The database is being assembled as part of the PREDICTS project (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems – http://www.predicts.org.uk). We make site-level summary data available alongside this article. The full database will be publicly available in 2015. PMID:25558364

  16. CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources.

    Science.gov (United States)

    Bleda, Marta; Tarraga, Joaquin; de Maria, Alejandro; Salavert, Francisco; Garcia-Alonso, Luz; Celma, Matilde; Martin, Ainoha; Dopazo, Joaquin; Medina, Ignacio

    2012-07-01

    During the past years, the advances in high-throughput technologies have produced an unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from being optimal. Some of the most common problems are that the information is spread out in many small databases; frequently there are different standards among repositories and some databases are no longer supported or they contain too specific and unconnected information. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make very difficult to extract and integrate information from different sources, to analyze experiments or to access and query this information in a programmatic way. CellBase provides a solution to the growing necessity of integration by easing the access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted in our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase.

  17. Pathbase: A new reference resource and database for laboratory mouse pathology

    International Nuclear Information System (INIS)

    Schofield, P. N.; Bard, J. B. L.; Boniver, J.; Covelli, V.; Delvenne, P.; Ellender, M.; Engstrom, W.; Goessner, W.; Gruenberger, M.; Hoefler, H.; Hopewell, J. W.; Mancuso, M.; Mothersill, C.; Quintanilla-Martinez, L.; Rozell, B.; Sariola, H.; Sundberg, J. P.; Ward, A.

    2004-01-01

    Pathbase (http:/www.pathbase.net) is a web accessible database of histopathological images of laboratory mice, developed as a resource for the coding and archiving of data derived from the analysis of mutant or genetically engineered mice and their background strains. The metadata for the images, which allows retrieval and inter-operability with other databases, is derived from a series of orthogonal ontologies, and controlled vocabularies. One of these controlled vocabularies, MPATH, was developed by the Pathbase Consortium as a formal description of the content of mouse histopathological images. The database currently has over 1000 images on-line with 2000 more under curation and presents a paradigm for the development of future databases dedicated to aspects of experimental biology. (authors)

  18. Native Pig and Chicken Breed Database: NPCDB

    Directory of Open Access Journals (Sweden)

    Hyeon-Soo Jeong

    2014-10-01

    Full Text Available Indigenous (native breeds of livestock have higher disease resistance and adaptation to the environment due to high genetic diversity. Even though their extinction rate is accelerated due to the increase of commercial breeds, natural disaster, and civil war, there is a lack of well-established databases for the native breeds. Thus, we constructed the native pig and chicken breed database (NPCDB which integrates available information on the breeds from around the world. It is a nonprofit public database aimed to provide information on the genetic resources of indigenous pig and chicken breeds for their conservation. The NPCDB (http://npcdb.snu.ac.kr/ provides the phenotypic information and population size of each breed as well as its specific habitat. In addition, it provides information on the distribution of genetic resources across the country. The database will contribute to understanding of the breed’s characteristics such as disease resistance and adaptation to environmental changes as well as the conservation of indigenous genetic resources.

  19. A New Approach to Data Publication in Ocean Sciences

    Science.gov (United States)

    Lowry, Roy; Urban, Ed; Pissierssens, Peter

    2009-12-01

    Data are collected from ocean sciences activities that range from a single investigator working in a laboratory to large teams of scientists cooperating on big, multinational, global ocean research projects. What these activities have in common is that all result in data, some of which are used as the basis for publications in peer-reviewed journals. However, two major problems regarding data remain. First, many data valuable for understanding ocean physics, chemistry, geology, biology, and how the oceans operate in the Earth system are never archived or made accessible to other scientists. Data underlying traditional journal articles are often difficult to obtain. Second, when scientists do contribute data to databases, their data become freely available, with little acknowledgment and no contribution to their career advancement. To address these problems, stronger ties must be made between data repositories and academic journals, and a “digital backbone” needs to be created for data related to journal publications.

  20. Database Perspectives on Blockchains

    OpenAIRE

    Cohen, Sara; Zohar, Aviv

    2018-01-01

    Modern blockchain systems are a fresh look at the paradigm of distributed computing, applied under assumptions of large-scale public networks. They can be used to store and share information without a trusted central party. There has been much effort to develop blockchain systems for a myriad of uses, ranging from cryptocurrencies to identity control, supply chain management, etc. None of this work has directly studied the fundamental database issues that arise when using blockchains as the u...