WorldWideScience

Sample records for dna database searches

  1. MICA: desktop software for comprehensive searching of DNA databases

    Directory of Open Access Journals (Sweden)

    Glick Benjamin S

    2006-10-01

    Full Text Available Abstract Background Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion MICA is suitable as a search engine for desktop DNA analysis software.
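
    The general idea of K-mer indexing with partially degenerate queries can be illustrated with a minimal sketch. This is not MICA's compact-array index: it uses an ordinary in-memory dictionary, assumes the database sequence contains only A, C, G and T, and the function names (`build_index`, `search`) and the choice of k are hypothetical. Degenerate query symbols are interpreted with the standard IUPAC nucleotide codes.

```python
from collections import defaultdict
from itertools import product

# IUPAC nucleotide codes: each symbol maps to the bases it can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def build_index(seq, k):
    """Map every k-mer of seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def matches(seq, query, start):
    """Check the whole (possibly degenerate) query against seq at start."""
    if start + len(query) > len(seq):
        return False
    return all(seq[start + j] in IUPAC[q] for j, q in enumerate(query))


def search(seq, index, k, query):
    """Return sorted start positions of all exact matches of query in seq."""
    query = query.upper()
    if len(query) < k:
        raise ValueError("query must be at least k symbols long")

    def expansions(window):
        # Number of concrete k-mers a degenerate window expands to.
        count = 1
        for symbol in window:
            count *= len(IUPAC[symbol])
        return count

    # Use the least degenerate window of the query as the index seed.
    offset = min(range(len(query) - k + 1),
                 key=lambda o: expansions(query[o:o + k]))
    seed = query[offset:offset + k]

    hits = set()
    for bases in product(*(IUPAC[s] for s in seed)):
        for pos in index.get("".join(bases), ()):
            start = pos - offset
            if start >= 0 and matches(seq, query, start):
                hits.add(start)
    return sorted(hits)
```

    In this sketch only the index entries for the expanded seed are consulted, which loosely mirrors the abstract's point that a search touches a small fraction of the index.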

  2. Forensic utilization of familial searches in DNA databases.

    Science.gov (United States)

    Gershaw, Cassandra J; Schweighardt, Andrew J; Rourke, Linda C; Wallace, Margaret M

    2011-01-01

    DNA evidence is widely recognized as an invaluable tool in the process of investigation and identification, as well as one of the most sought after types of evidence for presentation to a jury. In the United States, the development of state and federal DNA databases has greatly impacted the forensic community by creating an efficient, searchable system that can be used to eliminate or include suspects in an investigation based on matching DNA profiles - the profile already in the database to the profile of the unknown sample in evidence. Recent changes in legislation have begun to allow for the possibility to expand the parameters of DNA database searches, taking into account the possibility of familial searches. This article discusses prospective positive outcomes of utilizing familial DNA searches and acknowledges potential negative outcomes, thereby presenting both sides of this very complicated, rapidly evolving situation. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  3. Searching mixed DNA profiles directly against profile databases.

    Science.gov (United States)

    Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John

    2014-03-01

    DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  4. Validation of SmartRank: A likelihood ratio software for searching national DNA databases with complex DNA profiles.

    Science.gov (United States)

    Benschop, Corina C G; van de Merwe, Linda; de Jong, Jeroen; Vanvooren, Vanessa; Kempenaers, Morgane; Kees van der Beek, C P; Barni, Filippo; Reyes, Eusebio López; Moulin, Léa; Pene, Laurent; Haned, Hinda; Sijen, Titia

    2017-07-01

    Searching a national DNA database with complex and incomplete profiles usually yields very large numbers of possible matches that can present many candidate suspects to be further investigated by the forensic scientist and/or police. Current practice in most forensic laboratories consists of ordering these 'hits' based on the number of matching alleles with the searched profile. Thus, candidate profiles that share the same number of matching alleles are not differentiated and, due to the lack of other ranking criteria for the candidate list, it may be difficult to discern a true match from the false positives or notice that all candidates are in fact false positives. SmartRank was developed to put forward only relevant candidates and rank them accordingly. The SmartRank software computes a likelihood ratio (LR) for the searched profile and each profile in the DNA database and ranks database entries above a defined LR threshold according to the calculated LR. In this study, we examined for mixed DNA profiles of variable complexity whether the true donors are retrieved, what the number of false positives above an LR threshold is and the ranking position of the true donors. Using 343 mixed DNA profiles, over 750 SmartRank searches were performed. In addition, the performance of SmartRank and CODIS was compared regarding DNA database searches and SmartRank was found complementary to CODIS. We also describe the applicable domain of SmartRank and provide guidelines. The SmartRank software is open-source and freely available. Using the best practice guidelines, SmartRank enables obtaining investigative leads in criminal cases lacking a suspect. Copyright © 2017 Elsevier B.V. All rights reserved.
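
    The threshold-and-rank step described in this abstract can be sketched as follows. This is only an illustration of filtering and ordering by LR, not SmartRank's code: the LR values are assumed to have been computed elsewhere, and the function name, profile identifiers and threshold are hypothetical.

```python
def rank_candidates(lr_by_profile, lr_threshold):
    """Keep profiles whose LR exceeds the threshold, highest LR first.

    lr_by_profile: dict mapping a profile identifier to its likelihood
    ratio (assumed to be precomputed by some probabilistic model).
    """
    kept = [(pid, lr) for pid, lr in lr_by_profile.items()
            if lr > lr_threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical LRs for three database profiles.
candidates = rank_candidates({"P1": 50.0, "P2": 1e6, "P3": 2e3}, 1000.0)
```

    Unlike ordering by the number of matching alleles, two candidates here are tied only if their LRs are equal, and an empty result indicates that no profile cleared the threshold.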

  5. The effect of wild card designations and rare alleles in forensic DNA database searches

    DEFF Research Database (Denmark)

    Tvedebrink, Torben; Bright, Jo-Anne; Buckleton, John S

    2015-01-01

    Forensic DNA databases are powerful tools used for the identification of persons of interest in criminal investigations. Typically, they consist of two parts: (1) a database containing DNA profiles of known individuals and (2) a database of DNA profiles associated with crime scenes. The risk...... of adventitious or chance matches between crimes and innocent people increases as the number of profiles within a database grows and more data is shared between various forensic DNA databases, e.g. from different jurisdictions. The DNA profiles obtained from crime scenes are often partial because crime samples...

  6. The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.

    Science.gov (United States)

    Storvik, Geir; Egeland, Thore

    2007-09-01

    Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been derived through frequentist arguments and has been suggested by the American National Research Council and Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
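
    A small numeric sketch may help show how far apart the two quantities can be. It assumes a random match probability p and a database of n unrelated profiles, takes the np rule as the product n·p, and uses the simple single-comparison likelihood ratio 1/p as a stand-in for the Bayesian quantity; the article's conditional-testing argument is not reproduced, and all names and numbers are illustrative.

```python
def np_rule(p, n):
    """np rule: match probability multiplied by the database size."""
    return n * p


def prob_at_least_one_match(p, n):
    """Chance that at least one of n unrelated profiles matches."""
    return 1.0 - (1.0 - p) ** n


def simple_likelihood_ratio(p):
    """Single-comparison LR for a matching profile (simplifying assumption)."""
    return 1.0 / p


p, n = 1e-6, 100_000          # illustrative values only
np_value = np_rule(p, n)      # about 0.1
exact = prob_at_least_one_match(p, n)
lr = simple_likelihood_ratio(p)   # about one million
```

    For small n·p the np value closely approximates the probability of at least one chance match in the database, whereas the likelihood ratio depends on p alone in this simplified form.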

  7. Keyword Search in Databases

    CERN Document Server

    Yu, Jeffrey Xu; Chang, Lijun

    2009-01-01

    It has become highly desirable to provide users with flexible ways to query/search information over databases as simple as keyword search like Google search. This book surveys the recent developments on keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. Such structural information to be returned can be either trees or subgraphs representing how the objects, that contain the required keywords, are interconnected in a relational database or in an XML database. The structural keyword search is completely different from

  8. Search Databases and Statistics

    DEFF Research Database (Denmark)

    Refsgaard, Jan C; Munk, Stephanie; Jensen, Lars J

    2016-01-01

    having strengths and weaknesses that must be considered for the individual needs. These are reviewed in this chapter. Equally critical for generating highly confident output datasets is the application of sound statistical criteria to limit the inclusion of incorrect peptide identifications from database...... searches. Additionally, careful filtering and use of appropriate statistical tests on the output datasets affects the quality of all downstream analyses and interpretation of the data. Our considerations and general practices on these aspects of phosphoproteomics data processing are presented here....

  9. NBIC: Search Ballast Report Database

    Science.gov (United States)

    ...developed an online database that can be queried through our website. Data are accessible for all coastal Lakes, have been incorporated into the NBIC database as of August 2004. Information on data availability

  10. Familial searching on DNA mixtures with dropout

    NARCIS (Netherlands)

    Slooten, K.

    2016-01-01

    Familial searching, the act of searching a database for a relative of an unknown individual whose DNA profile has been obtained, is usually restricted to cases where the DNA profile of that person has been unambiguously determined. Therefore, it is normally applied only with a good quality single

  11. Update History of This Database - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available KAIKOcDNA: Update History of This Database. 2014/10/20: The URL of the database maintenance site is changed. 2014/10/08: KAIKOcDNA English archive site is opened. 2004/04/12: KAIKOcDNA database ( http://sgp.dna.affrc.go.jp/EST/ ) is opened.

  12. Database searches for qualitative research

    OpenAIRE

    Evans, David

    2002-01-01

    Interest in the role of qualitative research in evidence-based health care is growing. However, the methods currently used to identify quantitative research do not translate easily to qualitative research. This paper highlights some of the difficulties during searches of electronic databases for qualitative research. These difficulties relate to the descriptive nature of the titles used in some qualitative studies, the variable information provided in abstracts, and the differences in the ind...

  13. Database Search Engines: Paradigms, Challenges and Solutions.

    Science.gov (United States)

    Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

  14. Quantum search of a real unstructured database

    Science.gov (United States)

    Broda, Bogusław

    2016-02-01

    A simple circuit implementation of the oracle for Grover's quantum search of a real unstructured classical database is proposed. The oracle contains a kind of quantumly accessible classical memory, which stores the database.
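
    To make the search procedure concrete, here is a minimal classical simulation of Grover's amplitude amplification over a stored list. It is not the circuit or oracle construction proposed in the paper: the oracle is modelled simply as a sign flip on entries equal to the target, the number of matching entries is assumed to be known (`num_marked`), and the function name is hypothetical.

```python
import math


def grover_search(database, target, num_marked=1):
    """Simulate Grover iterations; return (most likely index, probability)."""
    n = len(database)
    amps = [1.0 / math.sqrt(n)] * n   # uniform superposition over indices
    iterations = max(1, int(math.pi / 4 * math.sqrt(n / num_marked)))
    for _ in range(iterations):
        # Oracle: flip the sign of amplitudes whose stored item matches.
        amps = [-a if item == target else a
                for a, item in zip(amps, database)]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / n
        amps = [2.0 * mean - a for a in amps]
    probs = [a * a for a in amps]
    best = max(range(n), key=probs.__getitem__)
    return best, probs[best]


index, probability = grover_search(list("abcdefgh"), "f")
```

    With eight entries and one match, two iterations are used and the matching index carries most of the measurement probability.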

  15. An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database)

    Directory of Open Access Journals (Sweden)

    But Paul

    2010-06-01

    Full Text Available Abstract Background Thousands of plants and animals possess pharmacological properties and there is an increased interest in using these materials for therapy and health maintenance. The efficacy of such applications is critically dependent on the use of genuine materials. From time to time, life-threatening poisoning occurs because a toxic adulterant or substitute is administered. DNA barcoding provides a definitive means of authentication and for conducting molecular systematics studies. Owing to the reduced cost of DNA authentication, the volume of DNA barcodes produced for medicinal materials is on the rise and necessitates the development of an integrated DNA database. Description We have developed an integrated DNA barcode multimedia information platform, the Medicinal Materials DNA Barcode Database (MMDBD), for data retrieval and similarity search. MMDBD contains over 1000 species of medicinal materials listed in the Chinese Pharmacopoeia and American Herbal Pharmacopoeia. MMDBD also contains useful information on the medicinal materials, including resources, adulterant information, medical parts, photographs, primers used for obtaining the barcodes and key references. MMDBD can be accessed at http://www.cuhk.edu.hk/icm/mmdbd.htm. Conclusions This work provides a centralized medicinal materials DNA barcode database and bioinformatics tools for data storage, analysis and exchange for promoting the identification of medicinal materials. MMDBD has the largest collection of DNA barcodes of medicinal materials and is a useful resource for researchers in conservation, systematic study, forensic and herbal industry.

  16. Interactive searching of facial image databases

    Science.gov (United States)

    Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

    1995-09-01

    A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires a manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm which can predict human descriptive values. Alternatives to human derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.

  17. Fast Structural Search in Phylogenetic Databases

    Directory of Open Access Journals (Sweden)

    William H. Piel

    2005-01-01

    Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree D in D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.

  18. Phonetic search methods for large speech databases

    CERN Document Server

    Moyal, Ami; Tetariy, Ella; Gishri, Michal

    2013-01-01

    “Phonetic Search Methods for Large Databases” focuses on Keyword Spotting (KWS) within large speech databases. The brief will begin by outlining the challenges associated with Keyword Spotting within large speech databases using dynamic keyword vocabularies. It will then continue by highlighting the various market segments in need of KWS solutions, as well as, the specific requirements of each market segment. The work also includes a detailed description of the complexity of the task and the different methods that are used, including the advantages and disadvantages of each method and an in-depth comparison. The main focus will be on the Phonetic Search method and its efficient implementation. This will include a literature review of the various methods used for the efficient implementation of Phonetic Search Keyword Spotting, with an emphasis on the authors’ own research which entails a comparative analysis of the Phonetic Search method which includes algorithmic details. This brief is useful for resea...

  19. WGDB: Wood Gene Database with search interface.

    Science.gov (United States)

    Goyal, Neha; Ginwal, H S

    2014-01-01

    Wood quality can be defined in terms of a particular end use and involves several traits. Over the last fifteen years researchers have assessed wood quality traits in forest trees. Wood quality traits were categorized as cell wall biochemical traits and fibre properties, including the microfibril angle, density and stiffness in loblolly pine [1]. A user-friendly, open-access database named Wood Gene Database (WGDB) has been developed to describe wood genes along with protein information and published research articles. It contains 720 wood genes from species including Pinus and Deodar and from fast-growing trees such as Poplar and Eucalyptus. WGDB is designed to encompass the majority of publicly accessible genes coding for cellulose, hemicellulose and lignin in tree species which are responsive to wood formation and quality. It is an interactive platform for collecting, managing and searching specific wood genes; it also enables data mining related to genomic information, specifically in Arabidopsis thaliana, Populus trichocarpa, Eucalyptus grandis, Pinus taeda, Pinus radiata, Cedrus deodara and Cedrus atlantica. For user convenience, the database is cross-linked with the public databases NCBI, EMBL and Dendrome and with the search engine Google to make it more informative, and it provides the bioinformatics tools BLAST and COBALT. The database is freely available at www.wgdb.in.

  20. Protein structure database search and evolutionary classification.

    Science.gov (United States)

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  1. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for the non-expert human user to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have previously been manually labeled by domain experts into the three classes defined above.

  2. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression, an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression; the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  3. Search pattern of databases by the undergraduate students of ...

    African Journals Online (AJOL)

    The main objective of this study is to assess the awareness and search pattern of databases in order to determine the extent to which users are aware of and search for databases by examining the relationship between their awareness and search patterns of databases, and their information literacy skills. The methodology ...

  4. Winnowing sequences from a database search.

    Science.gov (United States)

    Berman, P; Zhang, Z; Wolf, Y I; Koonin, E V; Miller, W

    2000-01-01

    In database searches for sequence similarity, matches to a distinct sequence region (e.g., protein domain) are frequently obscured by numerous matches to another region of the same sequence. In order to cope with this problem, algorithms are developed to discard redundant matches. One model for this problem begins with a list of intervals, each with an associated score; each interval gives the range of positions in the query sequence that align to a database sequence, and the score is that of the alignment. If interval I is contained in interval J, and I's score is less than J's, then I is said to be dominated by J. The problem is then to identify each interval that is dominated by at least K other intervals, where K is a given level of "tolerable redundancy." An algorithm is developed to solve the problem in O(N log N) time and O(N*) space, where N is the number of intervals and N* is a precisely defined value that never exceeds N and is frequently much smaller. This criterion for discarding database hits has been implemented in the Blast program, as illustrated herein with examples. Several variations and extensions of this approach are also described.
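
    The domination criterion defined in this abstract can be stated directly in code. The sketch below is a brute-force O(N²) check intended only to make the definition concrete; it is not the O(N log N) algorithm of the paper, and the function name and the tuple layout (start, end, score) are assumptions.

```python
def dominated_intervals(intervals, k):
    """Return indices of intervals dominated by at least k other intervals.

    intervals: list of (start, end, score) tuples, where start..end is the
    range of query positions covered by an alignment and score is the
    alignment score. Interval I is dominated by J when I is contained in J
    and I's score is less than J's.
    """
    result = []
    for i, (start_i, end_i, score_i) in enumerate(intervals):
        count = 0
        for j, (start_j, end_j, score_j) in enumerate(intervals):
            contained = start_j <= start_i and end_i <= end_j
            if i != j and contained and score_i < score_j:
                count += 1
        if count >= k:
            result.append(i)
    return result


# Hypothetical hits against one query sequence.
hits = [(0, 100, 90), (10, 50, 40), (5, 60, 70), (20, 30, 95), (200, 250, 10)]
```

    Here the short but high-scoring hit at positions 20 to 30 is never dominated, which is the behaviour that keeps matches to a distinct region from being discarded.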

  5. Database Description - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available KAIKOcDNA Database Description. General information of database. Database name: KAIKOcDNA. Alter...National Institute of Agrobiological Sciences, Akiya Jouraku. E-mail: . Database cla...ssification: Nucleotide Sequence Databases. Organism: Taxonomy Name: Bombyx mori; Taxonomy ID: 7091. Database des...rnal: G3 (Bethesda) / 2013, Sep / vol.9. External Links: Original website information. Database maintenance si...available. URL of Web services: -. Need for user registration: Not available.

  6. WAIS Searching of the Current Contents Database

    Science.gov (United States)

    Banholzer, P.; Grabenstein, M. E.

    The Homer E. Newell Memorial Library of NASA's Goddard Space Flight Center is developing capabilities to permit Goddard personnel to access electronic resources of the Library via the Internet. The Library's support services contractor, Maxima Corporation, and their subcontractor, SANAD Support Technologies have recently developed a World Wide Web Home Page (http://www-library.gsfc.nasa.gov) to provide the primary means of access. The first searchable database to be made available through the HomePage to Goddard employees is Current Contents, from the Institute for Scientific Information (ISI). The initial implementation includes coverage of articles from the last few months of 1992 to present. These records are augmented with abstracts and references, and often are more robust than equivalent records in bibliographic databases that currently serve the astronomical community. Maxima/SANAD selected Wais Incorporated's WAIS product with which to build the interface to Current Contents. This system allows access from Macintosh, IBM PC, and Unix hosts, which is an important feature for Goddard's multiplatform environment. The forms interface is structured to allow both fielded (author, article title, journal name, id number, keyword, subject term, and citation) and unfielded WAIS searches. The system allows a user to: Retrieve individual journal article records. Retrieve Table of Contents of specific issues of journals. Connect to articles with similar subject terms or keywords. Connect to other issues of the same journal in the same year. Browse journal issues from an alphabetical list of indexed journal names.

  7. Searching the ASRS Database Using QUORUM Keyword Search, Phrase Search, Phrase Generation, and Phrase Discovery

    Science.gov (United States)

    McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)

    2001-01-01

    To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.

  8. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  9. Searching the PASCAL database - A user's perspective

    Science.gov (United States)

    Jack, Robert F.

    1989-01-01

    The operation of PASCAL, a bibliographic data base covering broad subject areas in science and technology, is discussed. The data base includes information from about 1973 to the present, including topics in engineering, chemistry, physics, earth science, environmental science, biology, psychology, and medicine. Data from 1986 to the present may be searched using DIALOG. The procedures and classification codes for searching PASCAL are presented. Examples of citations retrieved from the data base are given and suggestions are made concerning when to use PASCAL.

  10. Two Search Techniques within a Human Pedigree Database

    OpenAIRE

    Gersting, J. M.; Conneally, P. M.; Rogers, K.

    1982-01-01

    This paper presents the basic features of two search techniques from MEGADATS-2 (MEdical Genetics Acquisition and DAta Transfer System), a system for collecting, storing, retrieving and plotting human family pedigrees. The individual search provides a quick method for locating an individual in the pedigree database. This search uses a modified soundex coding and an inverted file structure based on a composite key. The navigational search uses a set of pedigree traversal operations (individual...
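
    As a rough illustration of the individual search described above, the sketch below combines a phonetic code with an inverted file keyed on a composite key. It uses the standard American Soundex, not the modified coding of MEGADATS-2, whose details are not given in this abstract. The composite key (Soundex code plus birth year) and the function names `soundex`, `build_index`, and `find` are assumptions made for the example.

```python
from collections import defaultdict

# Letter-to-digit table of standard American Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Standard Soundex code (assumption: not the modified MEGADATS-2 variant)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def build_index(people):
    """Inverted file: (soundex(surname), birth_year) -> list of person ids.

    The composite key is hypothetical; the paper's actual key is not stated here.
    """
    index = defaultdict(list)
    for person_id, surname, birth_year in people:
        index[(soundex(surname), birth_year)].append(person_id)
    return index


def find(index, surname, birth_year):
    """Look up candidate individuals whose surname sounds like `surname`."""
    return index.get((soundex(surname), birth_year), [])
```

    Because the lookup goes through the phonetic code, a query for "Smyth" would also return records stored under "Smith" for the same birth year.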

  11. Protein search for multiple targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Lange, Martin [Johannes Gutenberg University, Mainz 55122 (Germany); Department of Chemistry, Rice University, Houston, Texas 77005 (United States); Kochugaeva, Maria [Department of Chemistry, Rice University, Houston, Texas 77005 (United States); Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry, Rice University, Houston, Texas 77005 (United States); Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-09-14

    Protein-DNA interactions are crucial for all biological processes. One of the most important fundamental aspects of these interactions is the process of protein searching and recognizing specific binding sites on DNA. A large number of experimental and theoretical investigations have been devoted to uncovering the molecular description of these phenomena, but many aspects of the mechanisms of protein search for the targets on DNA remain not well understood. One of the most intriguing problems is the role of multiple targets in protein search dynamics. Using a recently developed theoretical framework we analyze this question in detail. Our method is based on a discrete-state stochastic approach that takes into account most relevant physical-chemical processes and leads to fully analytical description of all dynamic properties. Specifically, systems with two and three targets have been explicitly investigated. It is found that multiple targets in most cases accelerate the search in comparison with a single target situation. However, the acceleration is not always proportional to the number of targets. Surprisingly, there are even situations when it takes longer to find one of the multiple targets in comparison with the single target. It depends on the spatial position of the targets, distances between them, average scanning lengths of protein molecules on DNA, and the total DNA lengths. Physical-chemical explanations of observed results are presented. Our predictions are compared with experimental observations as well as with results from a continuum theory for the protein search. Extensive Monte Carlo computer simulations fully support our theoretical calculations.

  12. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. Copyright © 2017 John Wiley & Sons, Inc.
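
    The idea of loading similarity search results into a relational database and summarizing them by taxon can be sketched with Python's built-in sqlite3 module. The two-table schema below (`protein`, `hit`) and the function name `summarize_hits` are hypothetical and are not the schema of seqdb_demo or search_demo; the E-value cutoff is likewise an arbitrary assumption.

```python
import sqlite3


def summarize_hits(proteins, hits, max_evalue=1e-6):
    """Count, per foreign taxon, the query proteins with a significant hit there.

    proteins: iterable of (id, name, taxon); hits: iterable of
    (query_id, subject_id, evalue). Schema and names are illustrative only.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT, taxon TEXT);
        CREATE TABLE hit (query_id INTEGER, subject_id INTEGER, evalue REAL);
    """)
    con.executemany("INSERT INTO protein VALUES (?, ?, ?)", proteins)
    con.executemany("INSERT INTO hit VALUES (?, ?, ?)", hits)
    rows = con.execute("""
        SELECT s.taxon, COUNT(DISTINCT h.query_id)
        FROM hit AS h
        JOIN protein AS q ON q.id = h.query_id
        JOIN protein AS s ON s.id = h.subject_id
        WHERE h.evalue <= ? AND s.taxon <> q.taxon
        GROUP BY s.taxon
        ORDER BY s.taxon
    """, (max_evalue,)).fetchall()
    con.close()
    return rows
```

    Keeping the hits in a table means the same data can be re-summarized with a different cutoff or grouping without rerunning the similarity search.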

  13. RICD: A rice indica cDNA database resource for rice functional genomics

    Directory of Open Access Journals (Sweden)

    Zhang Qifa

    2008-11-01

    Full Text Available Abstract Background The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Results Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. Conclusion The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.

  14. Method and electronic database search engine for exposing the content of an electronic database

    NARCIS (Netherlands)

    Stappers, P.J.

    2000-01-01

    The invention relates to an electronic database search engine comprising an electronic memory device suitable for storing and releasing elements from the database, a display unit, a user interface for selecting and displaying at least one element from the database on the display unit, and control

  15. Effective Image Database Search via Dimensionality Reduction

    DEFF Research Database (Denmark)

    Dahl, Anders Bjorholm; Aanæs, Henrik

    2008-01-01

    Image search using the bag-of-words image representation is investigated further in this paper. This approach has shown promising results for large scale image collections making it relevant for Internet applications. The steps involved in the bag-of-words approach are feature extraction, vocabulary building, and searching with a query image. It is important to keep the computational cost low through all steps. In this paper we focus on the efficiency of the technique. To do that we substantially reduce the dimensionality of the features by the use of PCA and addition of color. Building of the visual vocabulary is typically done using k-means. We investigate a clustering algorithm based on the leader follower principle (LF-clustering), in which the number of clusters is not fixed. The adaptive nature of LF-clustering is shown to improve the quality of the visual vocabulary using this...
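
    To make the leader-follower principle concrete, here is a minimal sketch of the textbook form of the algorithm, in which the number of clusters grows with the data instead of being fixed in advance as in k-means. It is not the authors' implementation: the distance threshold, the learning rate, and the function name `leader_follower` are assumptions chosen for illustration.

```python
import math


def leader_follower(points, threshold, rate=0.1):
    """Textbook leader-follower clustering over a stream of feature vectors.

    Each point either pulls its nearest center toward itself (if that center
    lies within `threshold`) or founds a new cluster. Returns the centers.
    """
    centers = []
    for p in points:
        if centers:
            dists = [math.dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] <= threshold:
                # Follower step: move the winning center a fraction toward p.
                centers[k] = [c + rate * (x - c) for c, x in zip(centers[k], p)]
                continue
        # Leader step: no center is close enough, so p starts a new cluster.
        centers.append(list(p))
    return centers
```

    In a bag-of-words setting the returned centers would play the role of visual words; the threshold then controls vocabulary size indirectly.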

  16. Simplified validation of borderline hits of database searches

    OpenAIRE

    Thomas, Henrik; Shevchenko, Andrej

    2008-01-01

    Along with unequivocal hits produced by matching multiple MS/MS spectra to database sequences, LC-MS/MS analysis often yields a large number of hits of borderline statistical confidence. To simplify their validation, we propose to use rapid de novo interpretation of all acquired MS/MS spectra and, with the help of a simple software tool, display the candidate sequences together with each database search hit. We demonstrate that comparing hit database sequences and independent de novo interpre...

  17. The LAILAPS Search Engine: Relevance Ranking in Life Science Databases

    Directory of Open Access Journals (Sweden)

    Lange Matthias

    2010-06-01

    Full Text Available Search engines and retrieval systems are popular tools at a life science desktop. The manual inspection of hundreds of database entries, that reflect a life science concept or fact, is a time intensive daily work. Hereby, not the number of query results matters, but the relevance does. In this paper, we present the LAILAPS search engine for life science databases. The concept is to combine a novel feature model for relevance ranking, a machine learning approach to model user relevance profiles, ranking improvement by user feedback tracking and an intuitive and slim web user interface, that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and will be expanded by synonyms. Supporting a flexible text index and a simple data import format, LAILAPS can easily be used both as search engine for comprehensive integrated life science databases and for small in-house project databases.

  18. BioCarian: search engine for exploratory searches in heterogeneous biological databases.

    Science.gov (United States)

    Zaki, Nazar; Tennakoon, Chandana

    2017-10-02

    There are a large number of biological databases publicly available for scientists on the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and has additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search
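
    The step of converting a tabular database to RDF can be pictured with a small sketch that emits N-Triples lines, one triple per non-key cell. This is not BioCarian's converter: the base URI `http://example.org/`, the `record/` and `prop/` path segments, and the function name `table_to_ntriples` are placeholders, and the sketch assumes that key values and column names are already safe to embed in an IRI.

```python
def table_to_ntriples(rows, key, base="http://example.org/"):
    """Turn table rows (dicts) into N-Triples lines with plain string literals.

    The row's `key` column names the subject; every other column becomes a
    predicate. All URIs are hypothetical placeholders.
    """
    lines = []
    for row in rows:
        subject = f"<{base}record/{row[key]}>"
        for column, value in row.items():
            if column == key:
                continue
            # Escape backslashes and double quotes inside the literal.
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{base}prop/{column}> "{literal}" .')
    return lines
```

    Once loaded into a triple store, such data could be queried with SPARQL, for example by selecting every subject that has a given `prop/chromosome` value.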

  19. Searching Harvard Business Review Online. . . Lessons in Searching a Full Text Database.

    Science.gov (United States)

    Tenopir, Carol

    1985-01-01

    This article examines the Harvard Business Review Online (HBRO) database (bibliographic description fields, abstracts, extracted information, full text, subject descriptors) and reports on 31 sample HBRO searches conducted in Bibliographic Retrieval Services to test differences between searching full text and searching bibliographic record. Sample…

  20. TALE proteins search DNA using a rotationally decoupled mechanism.

    Science.gov (United States)

    Cuculis, Luke; Abil, Zhanar; Zhao, Huimin; Schroeder, Charles M

    2016-10-01

    Transcription activator-like effector (TALE) proteins are a class of programmable DNA-binding proteins used extensively for gene editing. Despite recent progress, however, little is known about their sequence search mechanism. Here, we use single-molecule experiments to study TALE search along DNA. Our results show that TALEs utilize a rotationally decoupled mechanism for nonspecific search, despite remaining associated with DNA templates during the search process. Our results suggest that the protein helical structure enables TALEs to adopt a loosely wrapped conformation around DNA templates during nonspecific search, facilitating rapid one-dimensional (1D) diffusion under a range of solution conditions. Furthermore, this model is consistent with a previously reported two-state mechanism for TALE search that allows these proteins to overcome the search speed-stability paradox. Taken together, our results suggest that TALE search is unique among the broad class of sequence-specific DNA-binding proteins and supports efficient 1D search along DNA.

  1. Molecule database framework: a framework for creating database applications with chemical structure search capability.

    Science.gov (United States)

    Kiener, Joos

    2013-12-11

    Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However, for specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions have the risk of vendor lock-in and may require an expensive license of a proprietary relational database management system. To speed up and simplify the development for applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls. Therefore software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and I created it by integrating existing free and open-source tools and frameworks. The core functionality includes: support for multi-component compounds (mixtures); import and export of SD-files; and optional security (authorization). For chemical structure searching Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method level security. Molecule Database Framework supports multi-component chemical compounds (mixtures). Furthermore the design of entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used. I then benchmarked this example application to create some basic performance expectations for chemical structure searches and import and export of SD-files. By using a simple web application it was shown that Molecule Database Framework

  2. DNA algorithms of implementing biomolecular databases on a biological computer.

    Science.gov (United States)

    Chang, Weng-Long; Vasilakos, Athanasios V

    2015-01-01

    In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases.
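
    For readers unfamiliar with the eight operations listed, the sketch below shows what each one computes on ordinary in-memory relations. It says nothing about how the paper realizes them with DNA; relations are modeled here as Python sets of tuples with positional columns, and all function names are choices made for this example.

```python
from itertools import product as pairs


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def intersection(r, s):
    return r & s


def cartesian_product(r, s):
    return {a + b for a, b in pairs(r, s)}


def selection(r, predicate):
    return {t for t in r if predicate(t)}


def projection(r, columns):
    return {tuple(t[i] for i in columns) for t in r}


def join(r, s, r_col, s_col):
    """Equi-join: pair rows whose chosen columns hold equal values."""
    return {a + b for a, b in pairs(r, s) if a[r_col] == b[s_col]}


def division(r, s):
    """Rows x such that x + y is in r for every y in s.

    Assumes s is non-empty and its columns match the trailing columns of r.
    """
    if not s:
        raise ValueError("division by an empty relation is not defined here")
    k = len(next(iter(s)))
    heads = {t[:-k] for t in r}
    return {h for h in heads if all(h + y in r for y in s)}
```

    Division is the least familiar of the eight: with a relation of (student, course) pairs and a relation of required courses, it returns the students who took every required course.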

  3. PFTijah: text search in an XML database system

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Rode, H.; van Os, R.; Flokstra, Jan

    2006-01-01

    This paper introduces the PFTijah system, a text search system that is integrated with an XML/XQuery database management system. We present examples of its use, we explain some of the system internals, and discuss plans for future work. PFTijah is part of the open source release of MonetDB/XQuery.

  4. A practical approach for inexpensive searches of radiology report databases.

    Science.gov (United States)

    Desjardins, Benoit; Hamilton, R Curtis

    2007-06-01

    We present a method to perform full text searches of radiology reports for the large number of departments that do not have this ability as part of their radiology or hospital information system. A tool written in Microsoft Access (front-end) has been designed to search a server (back-end) containing the indexed backup weekly copy of the full relational database extracted from a radiology information system (RIS). This front-end/back-end approach has been implemented in a large academic radiology department, and is used for teaching, research and administrative purposes. The weekly second backup of the 80 GB, 4 million record RIS database takes 2 hours. Further indexing of the exported radiology reports takes 6 hours. Individual searches typically take less than 1 minute on the indexed database and 30-60 minutes on the nonindexed database. Guidelines to properly address privacy and institutional review board issues are closely followed by all users. This method has potential to improve teaching, research, and administrative programs within radiology departments that cannot afford more expensive technology.

  5. Combined semantic and similarity search in medical image databases

    Science.gov (United States)

    Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin

    2011-03-01

    The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist deeply relies on reference literature or second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases combining ontology-based semantic and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.

  6. A Taxonomic Search Engine: Federating taxonomic databases using web services

    Directory of Open Access Journals (Sweden)

    Page Roderic DM

    2005-03-01

    Full Text Available Abstract Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  7. A Taxonomic Search Engine: federating taxonomic databases using web services.

    Science.gov (United States)

    Page, Roderic D M

    2005-03-09

    The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  8. The database search problem: a question of rational decision making.

    Science.gov (United States)

    Gittelson, S; Biedermann, A; Bozza, S; Taroni, F

    2012-10-10

    This paper applies probability and decision theory in the graphical interface of an influence diagram to study the formal requirements of rationality which justify the individualization of a person found through a database search. The decision-theoretic part of the analysis studies the parameters that a rational decision maker would use to individualize the selected person. The modeling part (in the form of an influence diagram) clarifies the relationships between this decision and the ingredients that make up the database search problem, i.e., the results of the database search and the different pairs of propositions describing whether an individual is at the source of the crime stain. These analyses evaluate the desirability associated with the decision of 'individualizing' (and 'not individualizing'). They point out that this decision is a function of (i) the probability that the individual in question is, in fact, at the source of the crime stain (i.e., the state of nature), and (ii) the decision maker's preferences among the possible consequences of the decision (i.e., the decision maker's loss function). We discuss the relevance and argumentative implications of these insights with respect to recent comments in specialized literature, which suggest points of view that are opposed to the results of our study. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
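
    The decision rule summarized in points (i) and (ii) can be illustrated with a short expected-loss calculation: each decision is scored by its loss under each state of nature, weighted by the probability of that state, and the decision with the smaller expected loss is chosen. The loss values in the usage note are invented for illustration and do not come from the paper; the function and key names are likewise assumptions.

```python
def expected_loss(p_source, loss_if_source, loss_if_not_source):
    """Expected loss of one decision, given P(individual is the source)."""
    return p_source * loss_if_source + (1.0 - p_source) * loss_if_not_source


def decide(p_source, loss):
    """Pick the decision with minimum expected loss.

    `loss` maps each decision name to {"source": ..., "not_source": ...},
    i.e. the decision maker's loss function over the two states of nature.
    Returns (best_decision, expected_losses_by_decision).
    """
    scores = {
        decision: expected_loss(p_source, l["source"], l["not_source"])
        for decision, l in loss.items()
    }
    return min(scores, key=scores.get), scores
```

    With a hypothetical loss of 100 for a false individualization, 1 for a missed one, and 0 for correct decisions, individualizing is preferred only when the probability of being the source exceeds 100/101, which shows how strongly the conclusion depends on the loss function as well as on the probability.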

  9. Optimal database combinations for literature searches in systematic reviews : a prospective exploratory study

    NARCIS (Netherlands)

    Bramer, W. M.; Rethlefsen, Melissa L.; Kleijnen, Jos; Franco, Oscar H.

    2017-01-01

    Background: Within systematic reviews, when searching for relevant references, it is advisable to use multiple databases. However, searching databases is laborious and time-consuming, as the syntax of search strategies is database specific. We aimed to determine the optimal combination of databases

  10. Enriching Great Britain's National Landslide Database by searching newspaper archives

    Science.gov (United States)

    Taylor, Faith E.; Malamud, Bruce D.; Freeborough, Katy; Demeritt, David

    2015-11-01

    Our understanding of where landslide hazard and impact will be greatest is largely based on our knowledge of past events. Here, we present a method to supplement existing records of landslides in Great Britain by searching an electronic archive of regional newspapers. In Great Britain, the British Geological Survey (BGS) is responsible for updating and maintaining records of landslide events and their impacts in the National Landslide Database (NLD). The NLD contains records of more than 16,500 landslide events in Great Britain. Data sources for the NLD include field surveys, academic articles, grey literature, news, public reports and, since 2012, social media. We aim to supplement the richness of the NLD by (i) identifying additional landslide events, (ii) acting as an additional source of confirmation of events existing in the NLD and (iii) adding more detail to existing database entries. This is done by systematically searching the Nexis UK digital archive of 568 regional newspapers published in the UK. In this paper, we construct a robust Boolean search criterion by experimenting with landslide terminology for four training periods. We then apply this search to all articles published in 2006 and 2012. This resulted in the addition of 111 records of landslide events to the NLD over the 2 years investigated (2006 and 2012). We also find that we were able to obtain information about landslide impact for 60-90% of landslide events identified from newspaper articles. Spatial and temporal patterns of additional landslides identified from newspaper articles are broadly in line with those existing in the NLD, confirming that the NLD is a representative sample of landsliding in Great Britain. This method could now be applied to more time periods and/or other hazards to add richness to databases and thus improve our ability to forecast future events based on records of past events.

  11. Supporting ontology-based keyword search over medical databases.

    Science.gov (United States)

    Kementsietsidis, Anastasios; Lim, Lipyeow; Wang, Min

    2008-11-06

    The proliferation of medical terms poses a number of challenges in the sharing of medical information among different stakeholders. Ontologies are commonly used to establish relationships between different terms, yet their role in querying has not been investigated in detail. In this paper, we study the problem of supporting ontology-based keyword search queries on a database of electronic medical records. We present several approaches to support this type of queries, study the advantages and limitations of each approach, and summarize the lessons learned as best practices.

  12. Policy required for entry of DNA profiles onto the National Forensic DNA Database of South Africa

    Directory of Open Access Journals (Sweden)

    Laura J. Heathfield

    2014-07-01

    Full Text Available The recent Criminal Law (Forensic Procedures) Amendment Act (2013) provides a definition for forensic DNA profiles and, in so doing, states that medical information about an individual may not be revealed through a forensic DNA profile. Yet chromosomal abnormalities can exhibit as tri-allelic patterns on DNA profiles and such information can expose medical conditions such as Down syndrome. This short report highlights this concern and suggests a policy be created for the entering of such DNA profiles onto the National Forensic DNA database of South Africa.

  13. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-01-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  14. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  15. Archiving, ordering and searching: search engines, algorithms, databases and deep mediatization

    DEFF Research Database (Denmark)

    Andersen, Jack

    2018-01-01

    This article argues that search engines, algorithms, and databases can be considered as a way of understanding deep mediatization (Couldry & Hepp, 2016). They are embedded in a variety of social and cultural practices and as such they change our communicative actions to be shaped by their logic o... reviewed recent trends in mediatization research, the argument is discussed and unfolded in-between the material and social constructivist-phenomenological interpretations of mediatization. In conclusion, it is discussed how deep this form of mediatization can be taken to be.

  16. The Development of a Combined Search for a Heterogeneous Chemistry Database

    Directory of Open Access Journals (Sweden)

    Lulu Jiang

    2015-05-01

    Full Text Available A combined search, which joins a slow molecule structure search with a fast compound property search, results in more accurate search results and has been applied in several chemistry databases. However, the problems of search speed differences and combining the two separate search results are two major challenges. In this paper, two kinds of search strategies, synchronous search and asynchronous search, are proposed to solve these problems in the heterogeneous structure database and the property database found in ChemDB, a chemistry database owned by the Institute of Process Engineering, CAS. Their advantages and disadvantages under different conditions are discussed in detail. Furthermore, we applied these two searches to ChemDB and used them to screen for potential molecules that can work as CO2 absorbents. The results reveal that this combined search discovers reasonable target molecules within an acceptable time frame.

  17. CORE: a phylogenetically-curated 16S rDNA database of the core oral microbiome.

    Directory of Open Access Journals (Sweden)

    Ann L Griffen

    2011-04-01

    Full Text Available Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically curated 16S rDNA database of the core oral microbiome, CORE, was developed. The goal was to include a comprehensive and minimally redundant representation of the bacteria that regularly reside in the human oral cavity with computationally robust classification at the level of species and genus. Clades of cultivated and uncultivated taxa were formed based on sequence analyses using multiple criteria, including maximum-likelihood-based topology and bootstrap support, genetic distance, and previous naming. A number of classification inconsistencies for previously named species, especially at the level of genus, were resolved. The performance of the CORE database for identifying clinical sequences was compared to that of three publicly available databases, GenBank nr/nt, RDP and HOMD, using a set of sequencing reads that had not been used in creation of the database. CORE offered improved performance compared to other public databases for identification of human oral bacterial 16S sequences by a number of criteria. In addition, the CORE database and phylogenetic tree provide a framework for measures of community divergence, and the focused size of the database offers advantages of efficiency for BLAST searching of large datasets. The CORE database is available as a searchable interface and for download at http://microbiome.osu.edu.

  18. An approach in building a chemical compound search engine in oracle database.

    Science.gov (United States)

    Wang, H; Volarath, P; Harrison, R

    2005-01-01

    Searching for and identifying chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is designed to eliminate data type constraints for potential search algorithms, which is a crucial step toward building a domain-specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.

  19. PIR search result - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available e filtered with Expect values lower than 1e-10. Number of data entries 1,549,409 ...he searches. Data analysis method Performed blastx searches against the PIR protein database. The results ar

  20. Scientific criticism of “DNA criminal investigation – DNA database, mandatory DNA collection and time limit for data retention” - Notes on the unconstitutionality of Law 12.654/2012

    Directory of Open Access Journals (Sweden)

    Rodrigo Grazinoli Garrido

    2018-06-01

    Full Text Available This scientific criticism responds to the article “DNA criminal investigation – DNA database, mandatory DNA collection and time limit for data retention” - Notes on the unconstitutionality of Law 12.654/2012, and seeks to offer doctrinal and empirical evidence to expand the academic dialogue on the implementation of the National Database of Genetic Profiles (BNPG). To that end, we conducted exploratory and qualitative research based on documentation of doctrine, empirical work, judgments and rules related to the Brazilian and foreign databases. References other than those presented by the article in question can be used, and they support contrary conclusions, particularly regarding the alleged offense to the principle nemo tenetur se detegere in the application of Law 12.654/2012. Furthermore, important limitations in relation to the reduction of crime rates and the increase in DNA databases need to be emphasized.

  1. cDNA library Table - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available c00951-005 Description of data contents List of Bombyx mori cDNA libraries. Data file File name: kaiko_cdna_...library.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kaiko-cdna/LATEST/kaiko_cdna_library.zip File size:... 4.8 KB Simple search URL http://togodb.biosciencedbc.jp/togodb/view/kaiko_cdna_l

  2. pSort search result - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...name: kome_psort_search_result.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kome/LATEST/kome_psort_searc...abase Description Download License Update History of This Database Site Policy | Contact Us pSort search result - KOME | LSDB Archive ...

  3. Inspecting close maternal relatedness: Towards better mtDNA population samples in forensic databases.

    Science.gov (United States)

    Bodner, Martin; Irwin, Jodi A; Coble, Michael D; Parson, Walther

    2011-03-01

    Reliable data are crucial for all research fields applying mitochondrial DNA (mtDNA) as a genetic marker. Quality control measures have been introduced to ensure the highest standards in sequence data generation, validation and a posteriori inspection. A phylogenetic alignment strategy has been widely accepted as a prerequisite for data comparability and database searches, for forensic applications, for reconstructions of human migrations and for correct interpretation of mtDNA mutations in medical genetics. There is continuing effort to enhance the number of worldwide population samples in order to contribute to a better understanding of human mtDNA variation. This has often led to the analysis of convenience samples collected for other purposes, which might not meet the quality requirement of random sampling for mtDNA data sets. Here, we introduce an additional quality control means that deals with one aspect of this limitation: by combining autosomal short tandem repeat (STR) markers with mtDNA information, it helps to avoid the bias introduced by related individuals included in the same (small) sample. By STR analysis of individuals sharing their mitochondrial haplotype, pedigree construction and subsequent software-assisted calculation of likelihood ratios based on the allele frequencies found in the population, closely maternally related individuals can be identified and excluded. We also discuss scenarios that allow related individuals in the same set. An ideal population sample would be representative for its population: this new approach represents another contribution towards this goal. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  4. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation

    Directory of Open Access Journals (Sweden)

    Rognes Torbjørn

    2011-06-01

    Full Text Available Abstract Background The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. Results A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Conclusions Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
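
    For orientation, the quantity that Smith-Waterman tools compute can be sketched in a few lines of Python. This is not SWIPE's SIMD implementation: it is a scalar sketch that assumes a flat match/mismatch score and a linear gap penalty (the function name and default scores are illustrative), whereas production tools use substitution matrices such as BLOSUM62 and affine gap penalties. Each inner-loop iteration is one "cell update" in the sense used by the GCUPS figure.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences.

    Assumed scoring scheme (illustrative only): flat match/mismatch
    scores and a linear gap penalty.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

    Only two rows are kept in memory because the score alone is needed; recovering the alignment itself would require storing the full matrix or traceback pointers.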

  5. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.

    Science.gov (United States)

    Rognes, Torbjørn

    2011-06-01

    The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.

  6. An effective suggestion method for keyword search of databases

    KAUST Repository

    Huang, Hai; Chen, Zonghai; Liu, Chengfei; Huang, He; Zhang, Xiangliang

    2016-01-01

    This paper solves the problem of providing high-quality suggestions for user keyword queries over databases. With the assumption that the returned suggestions are independent, existing query suggestion methods over databases score candidate

  7. MetaboSearch: tool for mass-based metabolite identification using multiple databases.

    Directory of Open Access Journals (Sweden)

    Bin Zhou

    Full Text Available Searching metabolites against databases according to their masses is often the first step in metabolite identification for a mass spectrometry-based untargeted metabolomics study. Major metabolite databases include Human Metabolome DataBase (HMDB), Madison Metabolomics Consortium Database (MMCD), Metlin, and LIPID MAPS. Since each one of these databases covers only a fraction of the metabolome, integration of the search results from these databases is expected to yield a more comprehensive coverage. However, the manual combination of multiple search results is generally difficult when identification of hundreds of metabolites is desired. We have implemented a web-based software tool that enables simultaneous mass-based search against the four major databases, and the integration of the results. In addition, more complete chemical identifier information for the metabolites is retrieved by cross-referencing multiple databases. The search results are merged based on IUPAC International Chemical Identifier (InChI) keys. Besides a simple list of m/z values, the software can accept the ion annotation information as input for enhanced metabolite identification. The performance of the software is demonstrated on mass spectrometry data acquired in both positive and negative ionization modes. Compared with search results from individual databases, MetaboSearch provides better coverage of the metabolome and more complete chemical identifier information. The software tool is available at http://omics.georgetown.edu/MetaboSearch.html.
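
    The two steps described above, a mass search within a tolerance and a merge of hits on InChI keys, can be sketched as follows. This is a minimal illustration, not MetaboSearch's code: the function names, the record layout, the ppm tolerance and the placeholder keys in the usage example are all assumptions.

```python
def mass_search(database, mass, ppm=10.0):
    """Return entries whose mass lies within +/- ppm of the query mass."""
    tolerance = mass * ppm / 1e6
    return [entry for entry in database if abs(entry["mass"] - mass) <= tolerance]


def merge_hits(hits_by_db):
    """Merge hits from several databases, keyed on the InChI key.

    hits_by_db maps a database name to a list of hit records; each
    record is a dict with hypothetical fields "inchikey" and "name".
    """
    merged = {}
    for db_name, hits in hits_by_db.items():
        for hit in hits:
            record = merged.setdefault(
                hit["inchikey"], {"names": set(), "sources": set()}
            )
            record["names"].add(hit["name"])
            record["sources"].add(db_name)
    return merged
```

    With two toy databases that share one compound under different names, the merge yields a single record listing both sources:

```python
db_a = [{"inchikey": "KEY-1", "name": "compound one", "mass": 180.0634}]
db_b = [{"inchikey": "KEY-1", "name": "cpd 1", "mass": 180.0634},
        {"inchikey": "KEY-2", "name": "compound two", "mass": 180.0640}]
merged = merge_hits({"A": mass_search(db_a, 180.0634, ppm=1.0),
                     "B": mass_search(db_b, 180.0634, ppm=1.0)})
```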

  8. MitBASE : a comprehensive and integrated mitochondrial DNA database. The present status

    NARCIS (Netherlands)

    Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D'Elia, D.; Montalvo, A.; Pinto, B.; de Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H.; Sloof, P.; Saccone, C.

    2000-01-01

    MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces

  9. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.

  10. Plant rDNA database: ribosomal DNA loci information goes online

    Czech Academy of Sciences Publication Activity Database

    Garcia, S.; Garnatje, T.; Kovařík, Aleš

    2012-01-01

    Vol. 121, No. 4 (2012), pp. 389-394 ISSN 0009-5915 R&D Projects: GA ČR(CZ) GAP501/10/0208; GA ČR GBP501/12/G090 Institutional research plan: CEZ:AV0Z50040702 Keywords : rDNA loci * FISH * database Subject RIV: BO - Biophysics Impact factor: 3.340, year: 2012

  11. Efficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry

    OpenAIRE

    Pevzner, Pavel A.; Mulyukov, Zufar; Dancik, Vlado; Tang, Chris L

    2001-01-01

    Although protein identification by matching tandem mass spectra (MS/MS) against protein databases is a widespread tool in mass spectrometry, the question about reliability of such searches remains open. Absence of rigorous significance scores in MS/MS database search makes it difficult to discard random database hits and may lead to erroneous protein identification, particularly in the case of mutated or post-translationally modified peptides. This problem is especially important for high-thr...

  12. Searching Databases without Query-Building Aids: Implications for Dyslexic Users

    Science.gov (United States)

    Berget, Gerd; Sandnes, Frode Eika

    2015-01-01

    Introduction: Few studies document the information searching behaviour of users with cognitive impairments. This paper therefore addresses the effect of dyslexia on information searching in a database with no tolerance for spelling errors and no query-building aids. The purpose was to identify effective search interface design guidelines that…

  13. Term Relevance Feedback and Mediated Database Searching: Implications for Information Retrieval Practice and Systems Design.

    Science.gov (United States)

    Spink, Amanda

    1995-01-01

    This study uses the human approach to examine the sources and effectiveness of search terms selected during 40 mediated interactive database searches and focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. (Author/JKP)

  14. A student's guide to searching the literature using online databases

    Science.gov (United States)

    Miller, Casey W.; Belyea, Dustin; Chabot, Michelle; Messina, Troy

    2012-02-01

    A method is described to empower students to efficiently perform general and specific literature searches using online resources [Miller et al., Am. J. Phys. 77, 1112 (2009)]. The method was tested on multiple groups, including undergraduate and graduate students with varying backgrounds in scientific literature searches. Students involved in this study showed marked improvement in their awareness of how and where to find scientific information. Repeated exposure to literature searching methods appears worthwhile, starting early in the undergraduate career, and even in graduate school orientation.

  15. License - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us KAIKOcDNA License License to Use This Database Last updated : 2014/10/08 You may use this database... license terms regarding the use of this database and the requirements you must follow in using this database...-Share Alike 2.1 Japan . If you use data from this database, please be sure attribute this database as follo...e Alike 2.1 Japan is found here . With regard to this database, you are licensed to: freely access part or whole of this database..., and acquire data; freely redistribute part or whole of the data from this database; a

  16. Searching for religion and mental health studies required health, social science, and grey literature databases.

    Science.gov (United States)

    Wright, Judy M; Cottrell, David J; Mir, Ghazala

    2014-07-01

    To determine the optimal databases to search for studies of faith-sensitive interventions for treating depression. We examined 23 health, social science, religious, and grey literature databases searched for an evidence synthesis. Databases were prioritized by yield of (1) search results, (2) potentially relevant references identified during screening, (3) included references contained in the synthesis, and (4) included references that were available in the database. We assessed the impact of databases beyond MEDLINE, EMBASE, and PsycINFO by their ability to supply studies identifying new themes and issues. We identified pragmatic workload factors that influence database selection. PsycINFO was the best performing database within all priority lists. ArabPsyNet, CINAHL, Dissertations and Theses, EMBASE, Global Health, Health Management Information Consortium, MEDLINE, PsycINFO, and Sociological Abstracts were essential for our searches to retrieve the included references. Citation tracking activities and the personal library of one of the research teams made significant contributions of unique, relevant references. Religion studies databases (Am Theo Lib Assoc, FRANCIS) did not provide unique, relevant references. Literature searches for reviews and evidence syntheses of religion and health studies should include social science, grey literature, non-Western databases, personal libraries, and citation tracking activities. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Chapter 51: How to Build a Simple Cone Search Service Using a Local Database

    Science.gov (United States)

    Kent, B. R.; Greene, G. R.

    The cone search service protocol will be examined from the server side in this chapter. A simple cone search service will be set up and configured locally using MySQL. Data will be read into a table, and Java JDBC will be used to connect to the database. Readers will understand the VO cone search specification and how to use it to query a database on their local systems and return an XML/VOTable file based on an input of RA/DEC coordinates and a search radius. The cone search in this example will be deployed as a Java servlet. The resulting cone search can be tested with a verification service. This basic setup can be used with other languages and relational databases.
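
    The core of a cone search, selecting rows within a given angular radius of an RA/DEC position, can be sketched in Python (the chapter itself uses Java and MySQL). This is a minimal in-memory sketch under stated assumptions: the function names and the row layout are hypothetical, all angles are in decimal degrees, and the VOTable serialisation and database access described in the chapter are left out.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(rows, ra, dec, sr):
    """Return rows whose position lies within sr degrees of (ra, dec).

    Each row is a dict with hypothetical keys "ra" and "dec".
    """
    return [row for row in rows
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= sr]
```

    Using the haversine form rather than a simple difference of coordinates keeps the test correct across the RA = 0/360 wrap and near the poles. In a database-backed service, a coarse bounding-box condition in SQL would typically narrow the candidates before this exact test is applied.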

  18. Enabling Searches on Wavelengths in a Hyperspectral Indices Database

    Science.gov (United States)

    Piñuela, F.; Cerra, D.; Müller, R.

    2017-10-01

    Spectral indices derived from hyperspectral reflectance measurements are powerful tools to estimate physical parameters in a non-destructive and precise way for several fields of applications, including vegetation health analysis, coastal and deep water constituents, geology, and atmosphere composition. In recent years, several micro-hyperspectral sensors have appeared, with both full-frame and push-broom acquisition technologies, while in the near future several hyperspectral spaceborne missions are planned to be launched. This is fostering the use of hyperspectral data in basic and applied research, causing a large number of spectral indices to be defined and used in various applications. Ad hoc search engines are therefore needed to retrieve the most appropriate indices for a given application. In traditional systems, query input parameters are limited to alphanumeric strings, while characteristics such as spectral range/bandwidth are not used in any existing search engine. Such information would be relevant, as it enables an inverse type of search: given the spectral capabilities of a given sensor or a specific spectral band, find all indices which can be derived from it. This paper describes a tool which enables a search as described above, by using the central wavelength or spectral range used by a given index as a search parameter. This offers the ability to manage numeric wavelength ranges in order to select indices which work best in a given set of wavelengths or wavelength ranges.
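
    The inverse search described above can be sketched with a small in-memory catalogue. This is an illustration only, not the tool from the paper: the index names and wavelengths in the usage example are made up, and each index is assumed to be described simply by the central wavelengths (in nm) of the bands it uses.

```python
def indices_for_sensor(indices, sensor_min_nm, sensor_max_nm):
    """Indices whose bands ALL fall inside the sensor's spectral range."""
    return sorted(
        name for name, bands in indices.items()
        if all(sensor_min_nm <= w <= sensor_max_nm for w in bands)
    )


def indices_using_band(indices, low_nm, high_nm):
    """Indices that use AT LEAST ONE band inside the given range."""
    return sorted(
        name for name, bands in indices.items()
        if any(low_nm <= w <= high_nm for w in bands)
    )
```

    For example, with a hypothetical catalogue of three indices:

```python
catalogue = {
    "index_a": [670.0, 800.0],   # hypothetical band centres in nm
    "index_b": [550.0, 860.0],
    "index_c": [1600.0, 2200.0],
}
indices_for_sensor(catalogue, 400.0, 1000.0)   # index_a and index_b
indices_using_band(catalogue, 780.0, 900.0)    # index_a and index_b
```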

  19. Social Work Literature Searching: Current Issues with Databases and Online Search Engines

    Science.gov (United States)

    McGinn, Tony; Taylor, Brian; McColgan, Mary; McQuilkan, Janice

    2016-01-01

    Objectives: To compare the performance of a range of search facilities; and to illustrate the execution of a comprehensive literature search for qualitative evidence in social work. Context: Developments in literature search methods and comparisons of search facilities help facilitate access to the best available evidence for social workers.…

  20. Usability Testing of a Large, Multidisciplinary Library Database: Basic Search and Visual Search

    Directory of Open Access Journals (Sweden)

    Jody Condit Fagan

    2006-09-01

    Full Text Available Visual search interfaces have been shown by researchers to assist users with information search and retrieval. Recently, several major library vendors have added visual search interfaces or functions to their products. For public service librarians, perhaps the most critical area of interest is the extent to which visual search interfaces and text-based search interfaces support research. This study presents the results of eight full-scale usability tests of both the EBSCOhost Basic Search and Visual Search in the context of a large liberal arts university.

  1. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project cDNA sequence quality data Data detail Data name cDNA sequence quality... data DOI 10.18908/lsdba.nbdc00838-003 Description of data contents Phred's quality score. P...tion Download License Update History of This Database Site Policy | Contact Us cDNA sequence quality

  2. Modelling antibody side chain conformations using heuristic database search.

    Science.gov (United States)

    Ritchie, D W; Kemp, G J

    1997-01-01

    We have developed a knowledge-based system which models the side chain conformations of residues in the variable domains of antibody Fv fragments. The system is written in Prolog and uses an object-oriented database of aligned antibody structures in conjunction with a side chain rotamer library. The antibody database provides 3-dimensional clusters of side chain conformations which can be copied en masse into the model structure. The object-oriented database architecture facilitates a navigational style of database access, necessary to assemble side chain clusters. Around 60% of the model is built using side chain clusters and this eliminates much of the combinatorial complexity associated with many other side chain placement algorithms. Construction and placement of side chain clusters is guided by a heuristic cost function based on a simple model of side chain packing interactions. Even with a simple model, we find that a large proportion of side chain conformations are modelled accurately. We expect our approach could be used with other homologous protein families, in addition to antibodies, both to improve the quality of model structures and to give a "smart start" to the side chain placement problem.

  3. Searching for evidence or approval? A commentary on database search in systematic reviews and alternative information retrieval methodologies.

    Science.gov (United States)

    Delaney, Aogán; Tamás, Peter A

    2018-03-01

    Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.

  4. SierraDNA – Demonstrating the Usefulness of Direct ILS Database Access

    Directory of Open Access Journals (Sweden)

    James Padgett

    2015-10-01

    Full Text Available Innovative Interfaces’ Sierra™ Integrated Library System (ILS) brings with it a Database Navigator Application (SierraDNA) - in layman's terms SierraDNA gives Sierra sites read access to their ILS database. Unlike the closed use cases produced by vendor-supplied APIs, which restrict libraries to limited development opportunities, SierraDNA enables sites to publish their own APIs and scripts based upon custom SQL code to meet their own needs and those of their users and processes. In this article we give examples showing how SierraDNA can be utilized to improve library services. We highlight three example use cases which have benefited our users, enhanced online security and improved our back office processes. In the first use case we employ user access data from our electronic resources proxy server (WAM) to detect hacked user accounts. Three scripts are used in conjunction to flag user accounts which are being hijacked to systematically steal content from our electronic resource providers’ websites. In the second we utilize the reading histories of our users to augment our search experience with an Amazon-style “People who borrowed this book also borrowed…these books” feature. Two scripts are used together to determine which other items were borrowed by borrowers of the item currently of interest. And lastly, we use item holds data to improve our acquisitions workflow through an automated demand-based ordering process. Our explanation and SQL code should be of direct use for adoption or as examples for other Sierra customers willing to exploit their ILS data in similar ways, but the principles may also be useful to non-Sierra sites that also wish to enhance security, user services or improve back office processes.
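
    The "also borrowed" use case comes down to a self-join on loan history. The sketch below illustrates the idea with Python's built-in sqlite3 module as a stand-in database. It does not use the SierraDNA schema: the table `loans(patron_id, item_id)` and the function name are hypothetical, and the article's own scripts are not reproduced here.

```python
import sqlite3


def also_borrowed(conn, item_id, limit=5):
    """Items borrowed by patrons who also borrowed item_id.

    Assumes a hypothetical table loans(patron_id, item_id). Results are
    ordered by the number of distinct shared borrowers, most first.
    """
    sql = """
        SELECT other.item_id, COUNT(DISTINCT other.patron_id) AS borrowers
        FROM loans AS this
        JOIN loans AS other ON other.patron_id = this.patron_id
        WHERE this.item_id = ? AND other.item_id <> ?
        GROUP BY other.item_id
        ORDER BY borrowers DESC, other.item_id
        LIMIT ?
    """
    return conn.execute(sql, (item_id, item_id, limit)).fetchall()
```

    A quick check against an in-memory database:

```python
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (patron_id INTEGER, item_id TEXT)")
conn.executemany("INSERT INTO loans VALUES (?, ?)",
                 [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B"),
                  (3, "A"), (3, "D"), (4, "C")])
also_borrowed(conn, "A")  # "B" ranks first: two borrowers of "A" also took it
```

    Counting distinct patrons rather than rows keeps repeat loans by one reader from inflating an item's rank.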

  5. Federated or cached searches: providing expected performance from multiple invasive species databases

    Science.gov (United States)

    Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.

    2011-01-01

    Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required by users and that a central cache of the data is required to improve performance.

  6. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    National Research Council Canada - National Science Library

    Ortega-Binderberger, Michael

    2002-01-01

    ... as a critical area of research. This thesis explores how to enhance database systems with content based search over arbitrary abstract data types in a similarity based framework with query refinement...

  7. STEPS: a grid search methodology for optimized peptide identification filtering of MS/MS database search results.

    Science.gov (United States)

    Piehowski, Paul D; Petyuk, Vladislav A; Sandoval, John D; Burnum, Kristin E; Kiebel, Gary R; Monroe, Matthew E; Anderson, Gordon A; Camp, David G; Smith, Richard D

    2013-03-01

    For bottom-up proteomics, there is a wide variety of database-searching algorithms in use for matching peptide sequences to tandem MS spectra. Likewise, there are numerous strategies being employed to produce a confident list of peptide identifications from the different search algorithm outputs. Here we introduce a grid-search approach for determining optimal database filtering criteria in shotgun proteomics data analyses that is easily adaptable to any search. Systematic Trial and Error Parameter Selection--referred to as STEPS--utilizes user-defined parameter ranges to test a wide array of parameter combinations to arrive at an optimal "parameter set" for data filtering, thus maximizing confident identifications. The benefits of this approach in terms of numbers of true-positive identifications are demonstrated using datasets derived from immunoaffinity-depleted blood serum and a bacterial cell lysate, two common proteomics sample types. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
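
    The grid-search idea can be sketched as an exhaustive loop over user-defined cutoff values. This is not the STEPS implementation: the two filter parameters (a minimum score and a maximum mass error in ppm), the record layout, and the use of a decoy-based false discovery rate as the confidence criterion are all assumptions made for illustration.

```python
from itertools import product


def grid_search_filters(psms, score_cutoffs, ppm_cutoffs, max_fdr=0.01):
    """Try every cutoff combination and keep the one with most target hits.

    psms: list of dicts with hypothetical keys "score", "ppm", "decoy".
    FDR is estimated here as decoys / targets among passing matches
    (an assumed criterion, not taken from the abstract).
    Returns (targets, min_score, max_ppm), or None if nothing qualifies.
    """
    best = None
    for min_score, max_ppm in product(score_cutoffs, ppm_cutoffs):
        passed = [p for p in psms
                  if p["score"] >= min_score and abs(p["ppm"]) <= max_ppm]
        targets = sum(1 for p in passed if not p["decoy"])
        decoys = len(passed) - targets
        if targets == 0 or decoys / targets > max_fdr:
            continue  # no identifications, or too many false matches
        if best is None or targets > best[0]:
            best = (targets, min_score, max_ppm)
    return best
```

    The cost grows with the product of the range sizes, so coarse ranges followed by a finer pass around the best cell is a common way to keep the number of combinations manageable.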

  8. The new ENSDF search system NESSY: IBM/PC nuclear spectroscopy database

    International Nuclear Information System (INIS)

    Boboshin, I.N.; Varlamov, V.V.

    1996-01-01

    The universal relational nuclear structure and decay database NESSY (New ENSDF Search SYstem) developed for the IBM/PC and compatible PCs, and based on the international file ENSDF (Evaluated Nuclear Structure Data File), is described. The NESSY provides the possibility of high efficiency processing (the search and retrieval of any kind of physical data) of the information from ENSDF. The principles of the database development are described and examples of applications are presented. (orig.)

  9. MethBank 3.0: a database of DNA methylomes across a variety of species.

    Science.gov (United States)

    Li, Rujiao; Liang, Fang; Li, Mengwei; Zou, Dong; Sun, Shixiang; Zhao, Yongbing; Zhao, Wenming; Bao, Yiming; Xiao, Jingfa; Zhang, Zhang

    2018-01-04

    MethBank (http://bigd.big.ac.cn/methbank) is a database that integrates high-quality DNA methylomes across a variety of species and provides an interactive browser for visualization of methylation data. Here, we present an updated implementation of MethBank (version 3.0) by incorporating more DNA methylomes from multiple species and equipping with more enhanced functionalities for data annotation and more friendly web interfaces for data presentation, search and visualization. MethBank 3.0 features large-scale integration of high-quality methylomes, involving 34 consensus reference methylomes derived from a large number of human samples, 336 single-base resolution methylomes from different developmental stages and/or tissues of five plants, and 18 single-base resolution methylomes from gametes and early embryos at multiple stages of two animals. Additionally, it is enhanced by improving the functionalities for data annotation, which accordingly enables systematic identification of methylation sites closely associated with age, sites with constant methylation levels across different ages, differentially methylated promoters, age-specific differentially methylated cytosines/regions, and methylated CpG islands. Moreover, MethBank provides tools to estimate human methylation age online and to identify differentially methylated promoters, respectively. Taken together, MethBank is upgraded with significant improvements and advances over the previous version, which is of great help for deciphering DNA methylation regulatory mechanisms for epigenetic studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. When is a search not a search? A comparison of searching the AMED complementary health database via EBSCOhost, OVID and DIALOG.

    Science.gov (United States)

    Younger, Paula; Boddy, Kate

    2009-06-01

    The researchers involved in this study work at Exeter Health library and at the Complementary Medicine Unit, Peninsula School of Medicine and Dentistry (PCMD). Within this collaborative environment it is possible to access the electronic resources of three institutions. This includes access to AMED and other databases using different interfaces. The aim of this study was to investigate whether searching different interfaces to the AMED allied health and complementary medicine database produced the same results when using identical search terms. The following Internet-based AMED interfaces were searched: DIALOG DataStar; EBSCOhost and OVID SP_UI01.00.02. Search results from all three databases were saved in an endnote database to facilitate analysis. A checklist was also compiled comparing interface features. In our initial search, DIALOG returned 29 hits, OVID 14 and Ebsco 8. If we assume that DIALOG returned 100% of potential hits, OVID initially returned only 48% of hits and EBSCOhost only 28%. In our search, a researcher using the Ebsco interface to carry out a simple search on AMED would miss over 70% of possible search hits. Subsequent EBSCOhost searches on different subjects failed to find between 21 and 86% of the hits retrieved using the same keywords via DIALOG DataStar. In two cases, the simple EBSCOhost search failed to find any of the results found via DIALOG DataStar. Depending on the interface, the number of hits retrieved from the same database with the same simple search can vary dramatically. Some simple searches fail to retrieve a substantial percentage of citations. This may result in an uninformed literature review, research funding application or treatment intervention. In addition to ensuring that keywords, spelling and medical subject headings (MeSH) accurately reflect the nature of the search, database users should include wildcards and truncation and adapt their search strategy substantially to retrieve the maximum number of appropriate
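
    The percentages quoted in this abstract follow directly from the hit counts, taking the DIALOG result as the 100% reference. A minimal sketch of that arithmetic (the function name is an illustrative choice):

```python
def relative_recall(hits, reference):
    """Express each interface's hit count as a rounded percentage of a reference count."""
    return {name: round(100 * count / reference) for name, count in hits.items()}

# Hit counts reported in the abstract for the initial search.
hits = {"DIALOG": 29, "OVID": 14, "EBSCOhost": 8}
recall = relative_recall(hits, reference=hits["DIALOG"])
missed_by_ebsco = 100 - recall["EBSCOhost"]
```

    With these counts, OVID retrieves 48% and EBSCOhost 28% of the DIALOG hits, so an EBSCOhost-only search misses 72%, consistent with the "over 70%" stated above.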

  11. An effective suggestion method for keyword search of databases

    KAUST Repository

    Huang, Hai

    2016-09-09

    This paper solves the problem of providing high-quality suggestions for user keyword queries over databases. With the assumption that the returned suggestions are independent, existing query suggestion methods over databases score candidate suggestions individually and return the top-k best of them. However, the top-k suggestions have high redundancy with respect to the topics. To provide informative suggestions, the returned k suggestions are expected to be diverse, i.e., maximizing the relevance to the user query and the diversity with respect to topics that the user might be interested in simultaneously. In this paper, an objective function considering both factors is defined for evaluating a suggestion set. We show that maximizing the objective function is a submodular function maximization problem subject to n matroid constraints, which is an NP-hard problem. A greedy approximation algorithm with an approximation ratio O((Formula presented.)) is also proposed. Experimental results show that our suggestion method outperforms other methods on providing relevant and diverse suggestions. © 2016 Springer Science+Business Media New York
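
    The general shape of a greedy selection that trades relevance against topic diversity can be sketched as follows. This is not the paper's objective function or its matroid constraints: the score used here (relevance plus a weight times the number of newly covered topics) and the simple limit of k suggestions are assumptions made for illustration.

```python
def greedy_diverse(candidates, k, topic_weight=1.0):
    """Pick up to k suggestions, each time taking the largest marginal gain.

    candidates: dict mapping suggestion -> (relevance, set_of_topics).
    Marginal gain (assumed) = relevance + topic_weight * newly covered topics.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    while remaining and len(chosen) < k:
        def gain(name):
            relevance, topics = remaining[name]
            return relevance + topic_weight * len(topics - covered)
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        covered |= remaining.pop(best)[1]
    return chosen
```

    A suggestion that repeats an already covered topic gets no diversity bonus, so a less relevant suggestion on a new topic can be picked ahead of it.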

  12. PLAST: parallel local alignment search tool for database comparison

    Directory of Open Access Journals (Sweden)

    Lavenier Dominique

    2009-10-01

    Full Text Available Abstract Background Sequence similarity searching is an important and challenging task in molecular biology and next-generation sequencing should further strengthen the need for faster algorithms to process such vast amounts of data. At the same time, the internal architecture of current microprocessors is tending towards more parallelism, leading to the use of chips with two, four and more cores integrated on the same die. The main purpose of this work was to design an effective algorithm to fit with the parallel capabilities of modern microprocessors. Results A parallel algorithm for comparing large genomic banks and targeting middle-range computers has been developed and implemented in PLAST software. The algorithm exploits two key parallel features of existing and future microprocessors: the SIMD programming model (SSE instruction set) and the multithreading concept (multicore). Compared to multithreaded BLAST software, tests performed on an 8-processor server have shown speedup ranging from 3 to 6 with a similar level of accuracy. Conclusion A parallel algorithmic approach driven by the knowledge of the internal microprocessor architecture allows significant speedup to be obtained while preserving standard sensitivity for similarity search problems.

  13. Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

    Science.gov (United States)

    Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

    2009-04-01

    We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).

  14. muBLASTP: database-indexed protein sequence search on multicore CPUs.

    Science.gov (United States)

    Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

    2016-11-04

    The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to the different challenges and characteristics of query indexing and database indexing, the existing techniques for query-indexed search cannot be applied to database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations in the BLASTP algorithm, we refactored the BLASTP algorithm for modern multicore processors, achieving much higher throughput with an acceptable memory footprint for the database index.
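
    To make the distinction concrete, the sketch below builds an index over the database once and then looks up each word of a query in it. It is a toy illustration of database indexing in general, not muBLASTP's index structure: it uses exact k-letter words only, whereas BLASTP also considers high-scoring neighboring words, and the function names and the word length k=3 are assumptions.

```python
from collections import defaultdict

def build_index(database, k=3):
    """Map every k-letter word to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index

def seed_hits(index, query, k=3):
    """Return (sequence id, query position, database position) for each shared word."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for seq_id, dpos in index.get(query[qpos:qpos + k], []):
            hits.append((seq_id, qpos, dpos))
    return hits
```

    The index is built once and reused for every query, which is where the throughput gain of database indexing comes from. A query-indexed search would instead index the query and scan the whole database for each search.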

  15. Statistics of DNA Markers - RGP gmap | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us RGP gmap Statistics of DNA Markers Data detail Data name Statistics of DNA Markers DOI 10.18...908/lsdba.nbdc00318-01-001 Description of data contents Statistics of DNA markers that were used to create t...iption Download License Update History of This Database Site Policy | Contact Us Statistics of DNA Markers - RGP gmap | LSDB Archive ...

  16. Database search for safety information on cosmetic ingredients.

    Science.gov (United States)

    Pauwels, Marleen; Rogiers, Vera

    2007-12-01

    Ethical considerations with respect to experimental animal use and regulatory testing are under heavy discussion worldwide and are, in certain cases, taken up in legislative measures. The most explicit example is the European cosmetic legislation, establishing a testing ban on finished cosmetic products since 11 September 2004 and enforcing that the safety of a cosmetic product is assessed by taking into consideration "the general toxicological profile of the ingredients, their chemical structure and their level of exposure" (OJ L151, 32-37, 23 June 1993; OJ L066, 26-35, 11 March 2003). Therefore the availability of referenced and reliable information on cosmetic ingredients becomes a dire necessity. Given the high-speed progress of the World Wide Web services and the concurrent drastic increase in free access to information, identification of relevant data sources and evaluation of the scientific value and quality of the retrieved data are crucial. Based upon our own practical experience, a survey is put together of freely and commercially available data sources with their individual description, field of application, benefits and drawbacks. It should be mentioned that the search strategies described are equally useful as a starting point for any quest for safety data on chemicals or chemical-related substances in general.

  17. Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives

    Science.gov (United States)

    Marjanović, Damir; Konjhodžić, Rijad; Butorac, Sara Sanela; Drobnič, Katja; Merkaš, Siniša; Lauc, Gordan; Primorac, Damir; Anđelinović, Šimun; Milosavljević, Mladen; Karan, Željko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vučetić Dragović, Anđelka; Kovačević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

    2011-01-01

    The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensic DNA use. The state of forensic DNA analysis was also determined in neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations. PMID:21674821

  18. A searching and reporting system for relational databases using a graph-based metadata representation.

    Science.gov (United States)

    Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling

    2005-01-01

    Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
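
    One way to picture SQL construction from graph-based metadata is to treat tables as nodes and join conditions as edges, then search the graph for a join path. The sketch below does this with a breadth-first search. The table names, join conditions and function names are hypothetical, and the flat graph used here is a simplification. It is not the hierarchical representation described in the paper.

```python
from collections import deque

# Hypothetical metadata graph: each edge carries the join condition between two tables.
JOINS = {
    ("compound", "assay_result"): "compound.id = assay_result.compound_id",
    ("assay_result", "assay"): "assay_result.assay_id = assay.id",
}

def neighbors(table):
    """Yield (adjacent table, join condition) pairs for a table."""
    for (a, b), cond in JOINS.items():
        if a == table:
            yield b, cond
        elif b == table:
            yield a, cond

def build_sql(columns, start, target):
    """Find the shortest join path from start to target and emit a SELECT statement."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == target:
            joins = "".join(f" JOIN {t} ON {c}" for t, c in path)
            return f"SELECT {', '.join(columns)} FROM {start}{joins}"
        for nxt, cond in neighbors(table):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(nxt, cond)]))
    raise ValueError(f"no join path from {start} to {target}")
```

    The user only names the columns of interest. The join path, which is the part that requires schema knowledge, is derived from the metadata.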

  19. Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

    Science.gov (United States)

    Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

    2017-06-26

    Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in the detection of novel mutation profiles and the assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and in reconciling existing annotation of HV1 582 bp sequences.

  20. MIDAS: a database-searching algorithm for metabolite identification in metabolomics.

    Science.gov (United States)

    Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle

    2014-10-07

    A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometers against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.
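
    The final scoring step, asking how well an observed spectrum is explained by predicted fragments, can be illustrated with a deliberately simple measure: the fraction of observed peak intensity that lies within a mass tolerance of some predicted fragment m/z. This is not the MIDAS scoring function, which also weighs the plausibility of each fragmentation pathway. The function name and the default tolerance are assumptions.

```python
def explained_intensity(observed, predicted_mz, tol=0.01):
    """Fraction of observed intensity matched by a predicted fragment.

    observed: list of (mz, intensity) peaks; predicted_mz: iterable of fragment m/z values.
    A peak counts as matched if any predicted m/z lies within +/- tol of it.
    """
    total = sum(intensity for _, intensity in observed)
    if total == 0:
        return 0.0
    matched = sum(intensity for mz, intensity in observed
                  if any(abs(mz - p) <= tol for p in predicted_mz))
    return matched / total
```

    Ranking every candidate metabolite in a database by such a score and taking the top entry mirrors the overall search loop described above, with this simplified score in place of the real one.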

  1. Global search tool for the Advanced Photon Source Integrated Relational Model of Installed Systems (IRMIS) database

    International Nuclear Information System (INIS)

    Quock, D.E.R.; Cianciarulo, M.B.

    2007-01-01

    The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, the necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.

  2. Sagace: A web-based search engine for biomedical databases in Japan

    Directory of Open Access Journals (Sweden)

    Morita Mizuki

    2012-10-01

    Full Text Available Abstract Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp the features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/.

  3. The LAILAPS search engine: a feature model for relevance ranking in life science databases.

    Science.gov (United States)

    Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

    2010-03-25

    Efficient and effective information retrieval in life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases to a vast network of interconnected information systems is to the same extent a big challenge and a great chance for life science research. The knowledge found in the Web, in particular in life-science databases, is a valuable major resource. In order to bring it to the scientist's desktop, it is essential to have well performing search engines. Here, neither the response time nor the number of results is the decisive factor. The most crucial factor for millions of query results is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by the observation of user behavior during their inspection of search engine results, we condensed a set of 9 relevance discriminating features. These features are intuitively used by scientists, who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance, and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that have been trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.

  4. Access to DNA and protein databases on the Internet.

    Science.gov (United States)

    Harper, R

    1994-02-01

    During the past year, the number of biological databases that can be queried via Internet has dramatically increased. This increase has resulted from the introduction of networking tools, such as Gopher and WAIS, that make it easy for research workers to index databases and make them available for on-line browsing. Biocomputing in the nineties will see the advent of more client/server options for the solution of problems in bioinformatics.

  5. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    Science.gov (United States)

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-04-06

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.

  6. License - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project License to Use This Database Last updated : 2010/02/15 You may use this databas...ional License described below. The Standard License specifies the license terms regarding the use of this database... and the requirements you must follow in using this database. The Additiona...n the Standard License. Standard License The Standard License for this database is the license specified in ...the Creative Commons Attribution-Share Alike 2.1 Japan . If you use data from this database

  7. A Bayesian network approach to the database search problem in criminal proceedings

    Science.gov (United States)

    2012-01-01

    Background The ‘database search problem’, that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions
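
    The role of the three key variables named in this abstract can be illustrated with the simplest closed-form version of the problem. The sketch assumes a population of N equally likely potential sources, a database containing n of them, a random match probability γ, exactly one matching profile found in the database, and no laboratory error or relatedness. Under these assumptions, everyone else in the database is excluded and each of the N - n people outside it could match by chance, so the posterior probability that the matching individual is the source is 1 / (1 + (N - n)γ). The function and parameter names are illustrative, and this is a formulaic shortcut rather than the paper's Bayesian networks.

```python
def posterior_source_probability(population, database_size, match_probability):
    """P(matching individual is the source | exactly one match in the database).

    Assumes a uniform prior over `population` individuals, of whom
    `database_size` are in the database, and error-free profiling.
    """
    untested = population - database_size
    return 1.0 / (1.0 + untested * match_probability)
```

    For N = 1,000,000 and γ = 10^-6, a database of 500,000 profiles gives a posterior of about 0.67, compared with about 0.50 when only the matching individual has been tested. In this simplified model a larger search strengthens the case against the matching individual, which is the counter-intuitive result referred to above.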

  8. Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.

    Science.gov (United States)

    Arita, Masanori; Suwa, Kazuhiro

    2008-09-17

    In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and design a half-formatted database. As proof of principle, a database of flavonoids with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. The structured part is subject to text-string searches to realize relational operations. The system was written in PHP as an extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.

  9. cDNA table - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available of data contents Results of homology search to cDNA clones in the KOME. Data file File name: rpd_cdna.zip F...ile URL: ftp://ftp.biosciencedbc.jp/archive/rpd/LATEST/rpd_cdna.zip File size: 15 KB Simple search URL http:...//togodb.biosciencedbc.jp/togodb/view/rpd_cdna#en Data acquisition method - Data

  10. Parallel database search and prime factorization with magnonic holographic memory devices

    Energy Technology Data Exchange (ETDEWEB)

    Khitun, Alexander [Electrical and Computer Engineering Department, University of California - Riverside, Riverside, California 92521 (United States)

    2015-12-28

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device that utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by a phased array of spin-wave-generating elements, which allows phase patterns of arbitrary form to be produced. This makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations of the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  11. Parallel database search and prime factorization with magnonic holographic memory devices

    Science.gov (United States)

    Khitun, Alexander

    2015-12-01

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device that utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by a phased array of spin-wave-generating elements, which allows phase patterns of arbitrary form to be produced. This makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations of the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  12. Parallel database search and prime factorization with magnonic holographic memory devices

    International Nuclear Information System (INIS)

    Khitun, Alexander

    2015-01-01

    In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device that utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by a phased array of spin-wave-generating elements, which allows phase patterns of arbitrary form to be produced. This makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations of the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special-task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.

  13. Toward a public analysis database for LHC new physics searches using MadAnalysis 5

    Science.gov (United States)

    Dumont, B.; Fuks, B.; Kraml, S.; Bein, S.; Chalons, G.; Conte, E.; Kulkarni, S.; Sengupta, D.; Wymant, C.

    2015-02-01

    We present the implementation, in the MadAnalysis 5 framework, of several ATLAS and CMS searches for supersymmetry in data recorded during the first run of the LHC. We provide extensive details on the validation of our implementations and propose to create a public analysis database within this framework.

  14. A Web-based Tool for SDSS and 2MASS Database Searches

    Science.gov (United States)

    Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.

    We have developed a web site using HTML, PHP, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by applying color cuts; however, this site could also be useful for targeted searches of other databases. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects, including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborator; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
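
    The color-cut selection described above can be sketched as a single SQL filter. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names (i_mag, z_mag, j_mag), and threshold values are hypothetical, since the abstract does not state the actual schema or cuts.

```python
import sqlite3

# Hypothetical thresholds; the abstract does not give the actual color cuts.
I_MINUS_Z_MIN = 1.7
Z_MINUS_J_MIN = 2.0


def select_candidates(rows, i_z_min=I_MINUS_Z_MIN, z_j_min=Z_MINUS_J_MIN):
    """Return ids of objects passing both color cuts, reddest (i - z) first.

    rows: iterable of (obj_id, i_mag, z_mag, j_mag) tuples (assumed layout).
    """
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a MySQL table
    conn.execute(
        "CREATE TABLE objects ("
        "obj_id TEXT PRIMARY KEY, i_mag REAL, z_mag REAL, j_mag REAL)"
    )
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT obj_id FROM objects "
        "WHERE (i_mag - z_mag) > ? AND (z_mag - j_mag) > ? "
        "ORDER BY (i_mag - z_mag) DESC",
        (i_z_min, z_j_min),
    )
    ids = [row[0] for row in cursor.fetchall()]
    conn.close()
    return ids


# Toy photometry, invented for the example.
SAMPLE = [
    ("A", 22.0, 19.8, 17.5),  # red in both colors
    ("B", 20.0, 19.5, 18.9),  # too blue in i - z
    ("C", 21.5, 19.6, 17.4),  # red in both colors
    ("D", 22.0, 20.0, 19.0),  # too blue in z - j
]

if __name__ == "__main__":
    print(select_candidates(SAMPLE))
```

    Passing the thresholds as query parameters lets an observer tighten or loosen the cuts just before an observing run without editing the SQL text.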

  15. Repair of Alkylation Damage in Eukaryotic Chromatin Depends on Searching Ability of Alkyladenine DNA Glycosylase.

    Science.gov (United States)

    Zhang, Yaru; O'Brien, Patrick J

    2015-11-20

    Human alkyladenine DNA glycosylase (AAG) initiates the base excision repair pathway by excising alkylated and deaminated purine lesions. In vitro biochemical experiments demonstrate that AAG uses facilitated diffusion to efficiently search DNA to find rare sites of damage and suggest that electrostatic interactions are critical to the searching process. However, it remains an open question whether DNA searching limits the rate of DNA repair in vivo. To address this question, we constructed AAG mutants with altered searching ability and measured their ability to protect yeast from alkylation damage. Each of the conserved arginine and lysine residues near the DNA binding interface was mutated, and the functional impacts were evaluated using kinetic and thermodynamic analysis. These mutations do not perturb catalysis of N-glycosidic bond cleavage, but they decrease the ability to capture rare lesion sites. Nonspecific and specific DNA binding properties are closely correlated, suggesting that the electrostatic interactions observed in the specific recognition complex are similarly important for DNA searching complexes. The ability of the mutant proteins to complement repair-deficient yeast cells is positively correlated with the ability of the proteins to search DNA in vitro, suggesting that cellular resistance to DNA alkylation is governed by the ability to find and efficiently capture cytotoxic lesions. It appears that chromosomal access is not restricted and toxic sites of alkylation damage are readily accessible to a searching protein.

  16. MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.

    Science.gov (United States)

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M

    2011-07-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high-scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses.
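
    Once every PSM carries a probability of being correct, a false discovery rate can be estimated for any accepted set. The sketch below shows the generic posterior-based estimate (expected number of false matches divided by the number accepted). It is not MSblender's own model, which the abstract does not detail; the function name and default threshold are assumptions made for illustration.

```python
def accept_at_fdr(probabilities, max_fdr=0.01):
    """Accept the largest top-ranked set of PSMs whose estimated FDR <= max_fdr.

    probabilities: per-PSM probability of being a correct match (0..1).
    Returns (number accepted, estimated FDR of the accepted set).
    """
    ranked = sorted(probabilities, reverse=True)
    accepted = 0
    expected_false_at_cut = 0.0
    expected_false = 0.0
    for k, p in enumerate(ranked, start=1):
        # Each accepted PSM contributes (1 - p) expected false identifications.
        expected_false += 1.0 - p
        if expected_false / k <= max_fdr:
            accepted = k
            expected_false_at_cut = expected_false
    fdr = expected_false_at_cut / accepted if accepted else 0.0
    return accepted, fdr


if __name__ == "__main__":
    # Toy probabilities, invented for the example.
    print(accept_at_fdr([0.99, 0.98, 0.95, 0.60], max_fdr=0.05))
```

    Because the list is ranked from most to least probable, the running estimate never decreases, so the last rank that satisfies the threshold is also the largest acceptable set.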

  17. High-throughput STR analysis for DNA database using direct PCR.

    Science.gov (United States)

    Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

    2013-07-01

    Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of the direct PCR procedures was compared with that of the conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of samples requiring DNA extraction because of direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement the conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.

  18. BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources

    Science.gov (United States)

    2009-01-01

    Background DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. Results We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses MySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Conclusion Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community, such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system at http://www.asianbarcode.org. PMID:19958506

  19. IMPROVED SEARCH OF PRINCIPAL COMPONENT ANALYSIS DATABASES FOR SPECTRO-POLARIMETRIC INVERSION

    International Nuclear Information System (INIS)

    Casini, R.; Lites, B. W.; Ramos, A. Asensio; Ariste, A. López

    2013-01-01

    We describe a simple technique for the acceleration of spectro-polarimetric inversions based on principal component analysis (PCA) of Stokes profiles. This technique involves the indexing of the database models based on the sign of the projections (PCA coefficients) of the first few relevant orders of principal components of the four Stokes parameters. In this way, each model in the database can be attributed a distinctive binary number of 2^{4n} bits, where n is the number of PCA orders used for the indexing. Each of these binary numbers (indices) identifies a group of "compatible" models for the inversion of a given set of observed Stokes profiles sharing the same index. The complete set of the binary numbers so constructed evidently determines a partition of the database. The search of the database for the PCA inversion of spectro-polarimetric data can profit greatly from this indexing. In practical cases it becomes possible to approach the ideal acceleration factor of 2^{4n} as compared to the systematic search of a non-indexed database for a traditional PCA inversion. This indexing method relies on the existence of a physical meaning in the sign of the PCA coefficients of a model. For this reason, the presence of model ambiguities and of spectro-polarimetric noise in the observations limits in practice the number n of relevant PCA orders that can be used for the indexing.
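
    The sign-based indexing can be illustrated with a short sketch. It packs the signs of the first n PCA coefficients of each of the four Stokes parameters into an integer key, so that at most 2^(4n) distinct keys partition the database and only the group sharing the observation's key needs to be searched. The data layout, the function names, and the choice to treat a zero coefficient as positive are assumptions of this example, not details from the abstract.

```python
from collections import defaultdict

STOKES = ("I", "Q", "U", "V")


def sign_index(coeffs, n):
    """Pack the signs of the first n PCA coefficients per Stokes parameter.

    coeffs: dict mapping 'I', 'Q', 'U', 'V' to a list of PCA coefficients
    (assumed layout). Non-negative coefficients map to bit 1, negative to 0.
    """
    index = 0
    for stokes in STOKES:
        for c in coeffs[stokes][:n]:
            index = (index << 1) | (1 if c >= 0 else 0)
    return index


def build_partition(models, n):
    """Group model ids by their sign index."""
    buckets = defaultdict(list)
    for model_id, coeffs in models.items():
        buckets[sign_index(coeffs, n)].append(model_id)
    return buckets


def compatible_models(buckets, observed, n):
    """Return only the models that share the observation's sign index."""
    return buckets.get(sign_index(observed, n), [])


# Toy coefficients, invented for the example.
MODELS = {
    "m1": {"I": [1.0, -0.5], "Q": [0.2, 0.1], "U": [-0.3, 0.4], "V": [0.7, -0.1]},
    "m2": {"I": [-1.0, 0.3], "Q": [0.5, 0.2], "U": [0.5, -0.4], "V": [-0.2, 0.6]},
}
OBSERVED = {"I": [2.0, 0.1], "Q": [0.1, -0.2], "U": [-1.0, 0.3], "V": [0.3, 0.2]}

if __name__ == "__main__":
    partition = build_partition(MODELS, n=1)
    print(compatible_models(partition, OBSERVED, n=1))
```

    With noisy observations a coefficient near zero can flip sign and select the wrong group, which is consistent with the abstract's remark that noise limits the usable number n of PCA orders.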

  20. Download - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available rch and download 1 README README_e.html - 2 EST Table kaiko_cdna_main.zip (157 MB...) Simple search and download 3 Cluster Table kaiko_cdna_cluster.zip (453 KB) Simple search and download 4 ORF Table kaiko_cdna..._orf.zip (11 MB) Simple search and download 5 InterProScan Result kaiko_cdna_interpro.zip ...(3.1 MB) Simple search and download 6 cDNA library Table kaiko_cdna_library.zip (

  1. About DNA databasing and investigative genetic analysis of externally visible characteristics: A public survey.

    Science.gov (United States)

    Zieger, Martin; Utz, Silvia

    2015-07-01

    During the last decade, DNA profiling and the use of DNA databases have become two of the most employed instruments of police investigations. This very rapid establishment of forensic genetics is still far from complete. In the last few years, novel types of analyses have been presented to describe the phenotype of a possible perpetrator. We conducted the present study among German-speaking Swiss residents for two main reasons: firstly, we aimed at getting an impression of the public awareness and acceptance of the Swiss DNA database and the perception of a hypothetical DNA database containing all Swiss residents. Secondly, we wanted to get a broader picture of how people who are not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not even know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally, our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, the acceptance of a hypothetical universal database would be only about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy, or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we find a high degree of acceptance. The acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would be in favor of the moderate use of physical trait analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study

  2. Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

    Science.gov (United States)

    Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    Protein identification via database searches has become the gold standard in mass spectrometry-based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user-friendly open-source software which can be directly applied in everyday proteomic workflows.

  3. Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases

    DEFF Research Database (Denmark)

    Assent, Ira; Wichterich, Marc; Meisen, Tobias

    2008-01-01

    Multimedia similarity search in large databases requires efficient query processing. The Earth mover's distance, introduced in computer vision, is successfully used as a similarity model in a number of small-scale applications. Its computational complexity hindered its adoption in large multimedia...... databases. We enable directly indexing the Earth mover's distance in structures such as the R-tree and the VA-file by providing the accurate 'MinDist' function to any bounding rectangle in the index. We exploit the computational structure of the new MinDist to derive a new lower bound for the EMD Min...

  4. Quantum Partial Searching Algorithm of a Database with Several Target Items

    International Nuclear Information System (INIS)

    Pu-Cha, Zhong; Wan-Su, Bao; Yun, Wei

    2009-01-01

    Choi and Korepin [Quantum Information Processing 6 (2007) 243] presented a quantum partial search algorithm for a database with several target items, which can find a target block quickly when each target block contains the same number of target items. In practice, the number of target items in each target block is arbitrary. For this case, we give a condition that guarantees the partial search algorithm can be performed and that minimizes its number of oracle queries. In addition, further numerical computation leads to the conclusion that the more uniform the distribution of target items, the smaller the number of queries.

  5. Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity.

    Science.gov (United States)

    Machado, Helena; Silva, Susana

    2015-10-01

    The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of 'solidarity', traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  6. Indexing Bibliographic Database Content Using MariaDB and Sphinx Search Server

    Directory of Open Access Journals (Sweden)

    Arie Nugraha

    2014-07-01

    Full Text Available Fast retrieval of digital content has become mandatory for library and archive information systems. Many software applications have emerged to handle the indexing of digital content, from low-level ones such as Apache Lucene, to more RESTful and web-services-ready ones such as Apache Solr and ElasticSearch. Solr’s popularity among library software developers makes it the “de-facto” standard software for indexing digital content. For content (full-text content or bibliographic description) already stored inside a relational DBMS such as MariaDB (a fork of MySQL) or PostgreSQL, Sphinx Search Server (Sphinx) is a suitable alternative. This article will cover an introduction on how to use Sphinx with MariaDB databases to index database content as well as some examples of Sphinx API usage.

  7. Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches.

    Science.gov (United States)

    Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd

    2017-07-07

    Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows easy access to splice variant-specific peptide sequences that match MS data. Furthermore, we combined this database without alternative splice variant-1-specific peptides with human Swiss-Prot. This combined database can be used as a general database for searching LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
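
    The idea of keeping only tryptic peptides that are specific to one splice form can be sketched as an in silico digest followed by a set difference. The sketch assumes the common trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and a hypothetical minimum peptide length; the abstract does not state which digestion rules or filters the authors applied, and the sequences below are invented.

```python
import re


def tryptic_digest(sequence, min_length=6):
    """Digest a protein in silico: cleave after K or R, but not before P.

    min_length is a hypothetical filter for peptides too short to be useful.
    """
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in peptides if len(p) >= min_length}


def unique_peptides(isoforms, min_length=6):
    """Map each isoform name to the peptides found in no other isoform."""
    digests = {
        name: tryptic_digest(seq, min_length) for name, seq in isoforms.items()
    }
    unique = {}
    for name, peptides in digests.items():
        others = set().union(
            *(d for other, d in digests.items() if other != name)
        )
        unique[name] = peptides - others
    return unique


# Invented toy sequences that differ only in their middle segment.
ISOFORMS = {
    "iso1": "MAGTEKLLSPDRAAVLGKPEEWTR",
    "iso2": "MAGTEKGGHSVNRAAVLGKPEEWTR",
}

if __name__ == "__main__":
    print(unique_peptides(ISOFORMS))
```

    Peptides shared by both toy isoforms drop out of the result, leaving only the segment that distinguishes each form.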

  8. Accelerating Smith-Waterman Algorithm for Biological Database Search on CUDA-Compatible GPUs

    Science.gov (United States)

    Munekawa, Yuma; Ino, Fumihiko; Hagihara, Kenichi

    This paper presents a fast method capable of accelerating the Smith-Waterman algorithm for biological database search on a cluster of graphics processing units (GPUs). Our method is implemented using compute unified device architecture (CUDA), which is available on NVIDIA GPUs. As compared with previous methods, our method makes four major contributions. (1) The method efficiently uses on-chip shared memory to reduce the amount of data transferred between off-chip video memory and processing elements in the GPU. (2) It also reduces the number of data fetches by applying a data reuse technique to query and database sequences. (3) A pipelined method is also implemented to overlap GPU execution with database access. (4) Finally, a master/worker paradigm is employed to accelerate hundreds of database searches on a cluster system. In experiments, the peak performance on a GeForce GTX 280 card reaches 8.32 giga cell updates per second (GCUPS). We also find that our method reduces the number of data fetches to 1/140, achieving approximately three times higher performance than a previous CUDA-based method. Our 32-node cluster version is approximately 28 times faster than a single GPU version. Furthermore, the effective performance reaches 75.6 giga instructions per second (GIPS) using 32 GeForce 8800 GTX cards.
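
    For orientation, the recurrence that such GPU methods accelerate can be written as a plain CPU reference. The sketch below computes only the best local alignment score with a linear gap penalty. The scoring values are assumptions chosen for the example, and nothing here reflects the paper's CUDA implementation, shared-memory layout, or substitution matrix.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences (linear gap penalty).

    Keeps only two rows of the dynamic programming matrix, so memory use is
    proportional to len(target). Each inner iteration is one "cell update".
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

    Each cell depends only on its left, upper, and upper-left neighbors, which is why cells along an anti-diagonal, and alignments against different database sequences, can be computed in parallel.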

  9. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion?

    Science.gov (United States)

    Lawrence, D W

    2008-12-01

    To assess what is lost if only one literature database is searched for articles relevant to injury prevention and safety promotion (IPSP) topics. Serial textword (keyword, free-text) searches using multiple synonym terms for five key IPSP topics (bicycle-related brain injuries, ethanol-impaired driving, house fires, road rage, and suicidal behaviors among adolescents) were conducted in four of the bibliographic databases that are most used by IPSP professionals: EMBASE, MEDLINE, PsycINFO, and Web of Science. Through a systematic procedure, an inventory of articles on each topic in each database was conducted to identify the total unduplicated count of all articles on each topic, the number of articles unique to each database, and the articles available if only one database is searched. No single database included all of the relevant articles on any topic, and the database with the broadest coverage differed by topic. A search of only one literature database will return 16.7-81.5% (median 43.4%) of the available articles on any of five key IPSP topics. Each database contributed unique articles to the total bibliography for each topic. A literature search performed in only one database will, on average, lead to a loss of more than half of the available literature on a topic.
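
    The inventory described above (total unduplicated count, share returned by each database, and articles unique to each database) reduces to a few set operations. The sketch below assumes each search result has already been deduplicated into a set of article identifiers; the function name and the toy identifiers are invented for illustration and are not data from the study.

```python
def coverage_report(results):
    """Summarize overlap between literature databases for one topic.

    results: dict mapping a database name to the set of article ids it
    returned. Returns (total unduplicated count, per-database summary).
    """
    all_articles = set().union(*results.values())
    report = {}
    for db, found in results.items():
        others = set().union(
            *(ids for name, ids in results.items() if name != db)
        )
        report[db] = {
            # Fraction of all known articles found when searching only db.
            "share": len(found) / len(all_articles) if all_articles else 0.0,
            # Articles that no other database returned.
            "unique": len(found - others),
        }
    return len(all_articles), report


# Toy identifiers, invented for the example.
TOY_RESULTS = {
    "db_a": {1, 2, 3},
    "db_b": {3, 4},
    "db_c": {4, 5, 6},
}

if __name__ == "__main__":
    print(coverage_report(TOY_RESULTS))
```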

  10. Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

    Directory of Open Access Journals (Sweden)

    B. Bennetts

    2014-09-01

    Full Text Available Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA), in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP), is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

  11. Colil: a database and search service for citation contexts in the life sciences domain.

    Science.gov (United States)

    Fujiwara, Toyofumi; Yamamoto, Yasunori

    2015-01-01

    To promote research activities in a particular research area, it is important to efficiently identify current research trends, advances, and issues in that area. Although review papers in the research area can suffice for this purpose in general, researchers are not necessarily able to obtain such papers covering the aspects they are interested in at the time they are required. Therefore, the utilization of the citation contexts of papers in a research area has been considered as another approach. However, there are few search services to retrieve citation contexts in the life sciences domain; furthermore, efficiently obtaining citation contexts is becoming difficult due to the large volume and rapid growth of life sciences papers. Here, we introduce the Colil (Comments on Literature in Literature) database to store citation contexts in the life sciences domain. By using the Resource Description Framework (RDF) and a newly compiled vocabulary, we built the Colil database and made it available through the SPARQL endpoint. In addition, we developed a web-based search service called Colil that searches for a cited paper in the Colil database and then returns a list of citation contexts for it along with papers relevant to it based on co-citations. The citation contexts in the Colil database were extracted from full-text papers of the PubMed Central Open Access Subset (PMC-OAS), which includes 545,147 papers indexed in PubMed. These papers are distributed across 3,171 journals and cite 5,136,741 unique papers that correspond to approximately 25% of total PubMed entries. By utilizing Colil, researchers can easily refer to a set of citation contexts and relevant papers based on co-citations for a target paper. Colil helps researchers to comprehend life sciences papers in a research area more efficiently and makes their biological research more efficient.

  12. CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

    Directory of Open Access Journals (Sweden)

    Maskell Douglas L

    2009-05-01

    Full Text Available Abstract Background The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, a problem that is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K. For query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
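
    The GCUPS figures quoted above count dynamic programming cell updates per second, in billions. One query scanned against a database fills query length times total database residues cells, which gives the small helper below. The function names are hypothetical, and the numbers in the usage line are round toy values, not measurements from the paper.

```python
def gcups(query_length, database_residues, seconds):
    """Giga cell updates per second for one query scanned against a database."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cells = query_length * database_residues  # one DP cell per residue pair
    return cells / seconds / 1e9


def search_seconds(query_length, database_residues, rate_gcups):
    """Time needed for the same scan at a given sustained GCUPS rate."""
    if rate_gcups <= 0:
        raise ValueError("rate_gcups must be positive")
    return query_length * database_residues / (rate_gcups * 1e9)


if __name__ == "__main__":
    # Toy numbers: a 1,000-residue query against 100 million residues in 10 s.
    print(gcups(1000, 100_000_000, 10.0))
```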

  13. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  14. Dialysis search filters for PubMed, Ovid MEDLINE, and Embase databases.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Lee, Christopher W C; Wilczynski, Nancy L; McKibbon, Ann; Shariff, Salimah Z; Blake, Peter G; Lindsay, Robert M; Garg, Amit X

    2012-10-01

    Physicians frequently search bibliographic databases, such as MEDLINE via PubMed, for best evidence for patient care. The objective of this study was to develop and test search filters to help physicians efficiently retrieve literature related to dialysis (hemodialysis or peritoneal dialysis) from all other articles indexed in PubMed, Ovid MEDLINE, and Embase. A diagnostic test assessment framework was used to develop and test robust dialysis filters. The reference standard was a manual review of the full texts of 22,992 articles from 39 journals to determine whether each article contained dialysis information. Next, 1,623,728 unique search filters were developed, and their ability to retrieve relevant articles was evaluated. The high-performance dialysis filters consisted of up to 65 search terms in combination. These terms included the words "dialy" (truncated), "uremic," "catheters," and "renal transplant wait list." These filters reached peak sensitivities of 98.6% and specificities of 98.5%. The filters' performance remained robust in an independent validation subset of articles. These empirically derived and validated high-performance search filters should enable physicians to effectively retrieve dialysis information from PubMed, Ovid MEDLINE, and Embase.
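
    In the diagnostic test framework mentioned above, a search filter is scored against the manually reviewed reference standard: sensitivity is the fraction of relevant articles the filter retrieves, and specificity is the fraction of non-relevant articles it leaves out. The sketch below computes both from sets of article identifiers. The function name and the toy sets are invented for illustration and are not data from the study.

```python
def filter_performance(retrieved, relevant, all_articles):
    """Sensitivity and specificity of a search filter.

    retrieved: ids returned by the filter.
    relevant: ids judged relevant in the manual review (reference standard).
    all_articles: every id that was manually reviewed.
    """
    true_pos = len(retrieved & relevant)
    false_neg = len(relevant - retrieved)
    false_pos = len(retrieved - relevant)
    true_neg = len(all_articles - retrieved - relevant)
    sensitivity = true_pos / (true_pos + false_neg) if relevant else 0.0
    negatives = true_neg + false_pos
    specificity = true_neg / negatives if negatives else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy sets, invented for the example.
    everything = set(range(10))
    print(filter_performance({0, 1, 2, 4}, {0, 1, 2, 3}, everything))
```

    Adding search terms joined with OR tends to raise sensitivity at the cost of specificity, which is why candidate combinations are compared on both measures.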

  15. Cluster Table - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Description of data contents List of number of identical cDNA sequences that make up cluster. Data file File name: kaiko_cdna..._cluster.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kaiko-cdna/LATEST/kaiko_cdna_clu...ster.zip File size: 453 KB Simple search URL http://togodb.biosciencedbc.jp/togodb/view/kaiko_cdna

  16. Protein backbone angle restraints from searching a database for chemical shift and sequence homology

    Energy Technology Data Exchange (ETDEWEB)

    Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

    1999-03-15

    Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13Cα, 13Cβ, 13C', 1Hα and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar φ and ψ backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15°. Approximately 3% of the predictions made by TALOS are found to be in error.

  17. The mitochondrial DNA makeup of Romanians: A forensic mtDNA control region database and phylogenetic characterization.

    Science.gov (United States)

    Turchi, Chiara; Stanciu, Florin; Paselli, Giorgia; Buscemi, Loredana; Parson, Walther; Tagliabracci, Adriano

    2016-09-01

    To evaluate the pattern of the Romanian population from a mitochondrial perspective and to establish an appropriate mtDNA forensic database, we generated a high-quality mtDNA control region dataset from 407 Romanian subjects belonging to four major historical regions: Moldavia, Transylvania, Wallachia and Dobruja. The entire control region (CR) was analyzed by Sanger-type sequencing assays and the resulting 306 different haplotypes were classified into haplogroups according to the most updated mtDNA phylogeny. The Romanian gene pool is mainly composed of West Eurasian lineages H (31.7%), U (12.8%), J (10.8%), R (10.1%), T (9.1%), N (8.1%), HV (5.4%), K (3.7%), HV0 (4.2%), with the exceptions of East Asian haplogroup M (3.4%) and African haplogroup L (0.7%). The pattern of mtDNA variation observed in this study indicates that the mitochondrial DNA pool is geographically homogeneous across Romania and that the haplogroup composition reveals signals of admixture of populations of different origin. The PCA scatterplot supported this scenario, with Romania located in the southeastern Europe area, close to Bulgaria and Hungary, and as a borderland with respect to east Mediterranean and other eastern European countries. High haplotype diversity (0.993) and nucleotide diversity indices (0.00838±0.00426), together with low random match probability (0.0087), suggest the usefulness of this control region dataset as a forensic database in routine forensic mtDNA analysis and in the investigation of maternal genetic lineages in the Romanian population. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
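
    The forensic indices reported above are usually derived from haplotype frequencies. The sketch below assumes the common definitions (random match probability as the sum of squared frequencies, and Nei's unbiased haplotype diversity); the abstract does not state which estimators the authors used, and the haplotype labels are hypothetical.

```python
from collections import Counter


def forensic_indices(haplotypes):
    """Return (haplotype_diversity, random_match_probability).

    Assumed definitions, not confirmed by the abstract:
      RMP       = sum(p_i ** 2)
      diversity = n / (n - 1) * (1 - sum(p_i ** 2))
    where p_i is the sample frequency of haplotype i and n the sample size.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    rmp = sum_sq
    diversity = n / (n - 1) * (1 - sum_sq)
    return diversity, rmp


# Hypothetical toy sample of eight haplotype labels.
sample = ["H1", "H1", "U5", "J1", "T2", "T2", "T2", "K1"]
diversity, rmp = forensic_indices(sample)
print(f"diversity={diversity:.4f} rmp={rmp:.4f}")
```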

  18. mirPub: a database for searching microRNA publications.

    Science.gov (United States)

    Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore

    2015-05-01

    Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. mirPub is freely available at http://www.microrna.gr/mirpub/. Contact: vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  19. EST Table - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available uences registered to public database as of September 2011. Data file File name: kaiko_cdna_main.zip File URL...: ftp://ftp.biosciencedbc.jp/archive/kaiko-cdna/LATEST/kaiko_cdna_main.zip File s...ize: 157 MB Simple search URL http://togodb.biosciencedbc.jp/togodb/view/kaiko_cdna_main#en Data acquisition

  20. Database with web interface and search engine as a diagnostics tool for electromagnetic calorimeter

    CERN Document Server

    Paluoja, Priit

    2017-01-01

    During 2016 data collection, the Compact Muon Solenoid Data Acquisition (CMS DAQ) system showed very good reliability. Nevertheless, the high complexity of the hardware and software involved makes it inherently prone to occasional problems. As a CMS subdetector, the electromagnetic calorimeter (ECAL) is affected in the same way. Some of the issues are not predictable and can appear more than once during the year, such as components getting noisy, power shortcuts or failing communication between machines. The detection-diagnosis-intervention chain must be as fast as possible to minimise the downtime of the detector. The aim of this project was to create diagnostic software for the ECAL crew, consisting of a database and a web interface that allows users to search, add and edit the contents of the database.

  1. Integration of first-principles methods and crystallographic database searches for new ferroelectrics: Strategies and explorations

    International Nuclear Information System (INIS)

    Bennett, Joseph W.; Rabe, Karin M.

    2012-01-01

    In this concept paper, the development of strategies for the integration of first-principles methods with crystallographic database mining for the discovery and design of novel ferroelectric materials is discussed, drawing on the results and experience derived from exploratory investigations on three different systems: (1) the double perovskite Sr(Sb1/2Mn1/2)O3 as a candidate semiconducting ferroelectric; (2) polar derivatives of schafarzikite MSb2O4; and (3) ferroelectric semiconductors with formula M2P2(S,Se)6. A variety of avenues for further research and investigation are suggested, including automated structure type classification, low-symmetry improper ferroelectrics, and high-throughput first-principles searches for additional representatives of structural families with desirable functional properties. - Graphical abstract: Integration of first-principles methods with crystallographic database mining, for the discovery and design of novel ferroelectric materials, could potentially lead to new classes of multifunctional materials. Highlights: ► Integration of first-principles methods and database mining. ► Minor structural families with desirable functional properties. ► Survey of polar entries in the Inorganic Crystal Structure Database.

  2. The VirusBanker database uses a Java program to allow flexible searching through Bunyaviridae sequences

    Directory of Open Access Journals (Sweden)

    Gibbs Mark J

    2008-02-01

    Full Text Available Abstract Background Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. Results The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. Conclusion VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.

  3. The VirusBanker database uses a Java program to allow flexible searching through Bunyaviridae sequences.

    Science.gov (United States)

    Fourment, Mathieu; Gibbs, Mark J

    2008-02-05

    Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.

  4. Quantum Query Complexity for Searching Multiple Marked States from an Unsorted Database

    International Nuclear Information System (INIS)

    Shang Bin

    2007-01-01

    An important and common search problem is to find all marked states in an unsorted database with a large number of states. Grover's original quantum search algorithm finds a single marked state with uncertainty; it has been generalized to the case of multiple marked states and modified to find a single marked state with certainty. However, the query complexity for finding all of the marked states has not been addressed. We use a generalized Long's algorithm with high precision to solve such a problem. We calculate the approximate query complexity, which increases with the number of marked states and with the precision that we demand. In the end we introduce an algorithm for the problem on a 'duality computer' and show its advantage over other algorithms.
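
    For orientation, the textbook estimate for standard Grover search with several marked states can be sketched as follows. This is not the generalized Long's algorithm of the paper, and it covers finding one of the marked states with high probability rather than finding all of them; the function name is hypothetical.

```python
import math


def grover_iterations(n_states, n_marked):
    """Return (k, p_success) for standard Grover search.

    With sin(theta) = sqrt(M / N), the probability of measuring a marked
    state after k iterations is sin((2k + 1) * theta) ** 2. The value
    k = floor(pi / (4 * theta)) is the usual near-optimal choice.
    """
    theta = math.asin(math.sqrt(n_marked / n_states))
    k = math.floor(math.pi / (4 * theta))
    p_success = math.sin((2 * k + 1) * theta) ** 2
    return k, p_success


k, p = grover_iterations(1024, 4)
print(k, round(p, 5))
```

    The iteration count grows roughly as (π/4)·sqrt(N/M), which is the square-root speed-up over classical exhaustive search.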

  5. Making a search engine for Indocean - A database of abstracts: An experience

    Digital Repository Service at National Institute of Oceanography (India)

    Tapaswi, M.P.; Haravu, L.J.

    Information Management: Trends and Issues (Festschrift in honour of Prof S. Seetharama) 52. Making a Search Engine for Indocean - A Database of Abstracts: An Experience. Murari P Tapaswi* and L J Haravu** *Documentation Officer. National Information...

  6. Allie: a database and a search service of abbreviations and long forms

    Science.gov (United States)

    Yamamoto, Yasunori; Yamaguchi, Atsuko; Bono, Hidemasa; Takagi, Toshihisa

    2011-01-01

    Many abbreviations are used in the literature especially in the life sciences, and polysemous abbreviations appear frequently, making it difficult to read and understand scientific papers that are outside of a reader’s expertise. Thus, we have developed Allie, a database and a search service of abbreviations and their long forms (a.k.a. full forms or definitions). Allie searches for abbreviations and their corresponding long forms in a database that we have generated based on all titles and abstracts in MEDLINE. When a user query matches an abbreviation, Allie returns all potential long forms of the query along with their bibliographic data (i.e. title and publication year). In addition, for each candidate, co-occurring abbreviations and a research field in which it frequently appears in the MEDLINE data are displayed. This function helps users learn about the context in which an abbreviation appears. To deal with synonymous long forms, we use a dictionary called GENA that contains domain-specific terms such as gene, protein or disease names along with their synonymic information. Conceptually identical domain-specific terms are regarded as one term, and then conceptually identical abbreviation-long form pairs are grouped taking into account their appearance in MEDLINE. To keep up with new abbreviations that are continuously introduced, Allie has an automatic update system. In addition, the database of abbreviations and their long forms with their corresponding PubMed IDs is constructed and updated weekly. Database URL: The Allie service is available at http://allie.dbcls.jp/. PMID:21498548

  7. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    Science.gov (United States)

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  8. Databases

    Digital Repository Service at National Institute of Oceanography (India)

    Kunte, P.D.

    Information on bibliographic as well as numeric/textual databases relevant to coastal geomorphology has been included in a tabular form. Databases cover a broad spectrum of related subjects like coastal environment and population aspects, coastline...

  9. Current Comparative Table (CCT) automates customized searches of dynamic biological databases.

    Science.gov (United States)

    Landsteiner, Benjamin R; Olson, Michael R; Rutherford, Robert

    2005-07-01

    The Current Comparative Table (CCT) software program enables working biologists to automate customized bioinformatics searches, typically of remote sequence or HMM (hidden Markov model) databases. CCT currently supports BLAST, hmmpfam and other programs useful for gene and ortholog identification. The software is web based, has a BioPerl core and can be used remotely via a browser or locally on Mac OS X or Linux machines. CCT is particularly useful to scientists who study large sets of molecules in today's evolving information landscape because it color-codes all result files by age and highlights even tiny changes in sequence or annotation. By empowering non-bioinformaticians to automate custom searches and examine current results in context at a glance, CCT allows a remote database submission in the evening to influence the next morning's bench experiment. A demonstration of CCT is available at http://orb.public.stolaf.edu/CCTdemo and the open source software is freely available from http://sourceforge.net/projects/orb-cct.

  10. Analysis of Users' Searches of CD-ROM Databases in the National and University Library in Zagreb.

    Science.gov (United States)

    Jokic, Maja

    1997-01-01

    Investigates the search behavior of CD-ROM database users in Zagreb (Croatia) libraries: one group needed a minimum of technical assistance, and the other was completely independent. Highlights include the use of questionnaires and transaction log analysis and the need for end-user education. The questionnaire and definitions of search process…

  11. Fine-grained Database Field Search Using Attribute-Based Encryption for E-Healthcare Clouds.

    Science.gov (United States)

    Guo, Cheng; Zhuang, Ruhan; Jie, Yingmo; Ren, Yizhi; Wu, Ting; Choo, Kim-Kwang Raymond

    2016-11-01

    An effectively designed e-healthcare system can significantly enhance the quality of access and experience of healthcare users, including facilitating medical and healthcare providers in ensuring a smooth delivery of services. Ensuring the security of patients' electronic health records (EHRs) in the e-healthcare system is an active research area. EHRs may be outsourced to a third party, such as a community healthcare cloud service provider, for storage as a cost-saving measure. Generally, encrypting the EHRs when they are stored in the system (i.e. data-at-rest) or prior to outsourcing the data is used to ensure data confidentiality. Searchable encryption (SE) is a promising technique that can ensure the protection of private information without compromising on performance. In this paper, we propose a novel framework for controlling access to EHRs stored in semi-trusted cloud servers (e.g. a private cloud or a community cloud). To achieve fine-grained access control for EHRs, we leverage the ciphertext-policy attribute-based encryption (CP-ABE) technique to encrypt tables published by hospitals, including patients' EHRs, and the table is stored in the database with the primary key being the patient's unique identity. Our framework can enable different users with different privileges to search on different database fields. Unlike previous attempts to secure outsourcing of data, we emphasize the control of the searches of the fields within the database. We demonstrate the utility of the scheme by evaluating it using datasets from the University of California, Irvine.

  12. Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors

    Science.gov (United States)

    Chikhi, Rayan; Sael, Lee; Kihara, Daisuke

    2010-01-01

    Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is the prediction of ligand binding to a protein, as ligand molecule recognition is a major part of the molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow fast real-time pocket searching against a database. A thorough benchmark study employing two different datasets shows that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259

  13. cDNA - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ontents List of cDNA in locus Data file File name: astra_cdna.zip File URL: ftp://ftp.biosciencedbc.jp/archive/astra/LATEST/astra_cdn...a.zip File size: 3.3 MB Simple search URL http://togodb.biosciencedbc.jp/togodb/view/astra_cdna...n, Department of Molecular Genetics, National Institute of Agrobiological Sciences (Kikuchi et al., 2003; ftp://cdna

  14. DNA Database of the Nicaraguan Population: Allele Frecuencies of Importance in Forensic Genetics

    Directory of Open Access Journals (Sweden)

    Raquel Vargas-Díaz

    2010-12-01

    Full Text Available Scientific-technical development in the field of natural science, specifically the discovery of human DNA polymorphism, has allowed us to identify people by their genetic fingerprint, i.e. their DNA, unique to every individual on earth. Its use in criminal investigations and forensic medicine has brought about the creation of DNA databases for discrete groups, populations and entire nations. In Nicaragua, the Molecular Biology Center of the Universidad Centroamericana has been a pioneer in this area of research, providing support for criminal investigations and resolving innumerable cases of paternity disputes. In this report we present the achievements of ten years of research, highlighting the technical aspects and, in particular, the application of the AmpFlSTR Identifiler system, as well as future prospects for scientific investigation in this area.

  15. Seismic Search Engine: A distributed database for mining large scale seismic data

    Science.gov (United States)

    Liu, Y.; Vaidya, S.; Kuzma, H. A.

    2009-12-01

    The International Monitoring System (IMS) of the CTBTO collects terabytes worth of seismic measurements from many receiver stations situated around the earth with the goal of detecting underground nuclear testing events and distinguishing them from other benign, but more common events such as earthquakes and mine blasts. The International Data Center (IDC) processes and analyzes these measurements, as they are collected by the IMS, to summarize event detections in daily bulletins. Thereafter, the data measurements are archived into a large format database. Our proposed Seismic Search Engine (SSE) will facilitate a framework for data exploration of the seismic database as well as the development of seismic data mining algorithms. Analogous to GenBank, the annotated genetic sequence database maintained by NIH, through SSE, we intend to provide public access to seismic data and a set of processing and analysis tools, along with community-generated annotations and statistical models to help interpret the data. SSE will implement queries as user-defined functions composed from standard tools and models. Each query is compiled and executed over the database internally before reporting results back to the user. Since queries are expressed with standard tools and models, users can easily reproduce published results within this framework for peer-review and making metric comparisons. As an illustration, an example query is “what are the best receiver stations in East Asia for detecting events in the Middle East?” Evaluating this query involves listing all receiver stations in East Asia, characterizing known seismic events in that region, and constructing a profile for each receiver station to determine how effective its measurements are at predicting each event. The results of this query can be used to help prioritize how data is collected, identify defective instruments, and guide future sensor placements.

  16. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

    Science.gov (United States)

    Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2017-09-13

    Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

  17. The Relationship between Searches Performed in Online Databases and the Number of Full-Text Articles Accessed: Measuring the Interaction between Database and E-Journal Collections

    Science.gov (United States)

    Lamothe, Alain R.

    2011-01-01

    The purpose of this paper is to report the results of a quantitative analysis exploring the interaction and relationship between the online database and electronic journal collections at the J. N. Desmarais Library of Laurentian University. A very strong relationship exists between the number of searches and the size of the online database…

  18. Spectral sum rules and search for periodicities in DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.

    2011-01-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory. - Highlights: → We study the significance criteria for latent periodicities in DNA sequences. → The constraints imposed by sum rules can be described with De Finetti distribution. → It is intermediate between Rayleigh distribution and exact combinatoric theory. → Theory is applicable to the study of correlations between different periodicities. → The approach can be generalized to the arbitrary discrete Fourier transform.
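
    The Fourier approach mentioned above can be illustrated with a small sketch. It computes the spectrum of the position indicator of one nucleotide and checks the Parseval-type sum rule (the spectrum summed over all harmonics equals the nucleotide count). The 1/N normalisation, the function name and the toy sequence are assumptions made for this illustration; the paper's own conventions and significance criteria are not reproduced.

```python
import cmath


def structure_factor(seq, base, n):
    """Return |rho(q_n)|^2 / N for one nucleotide type.

    rho(q_n) is the sum of exp(-i * q_n * m) over positions m where
    seq[m] == base, with wave number q_n = 2 * pi * n / N.
    """
    length = len(seq)
    q = 2 * cmath.pi * n / length
    amp = sum(cmath.exp(-1j * q * m) for m, s in enumerate(seq) if s == base)
    return abs(amp) ** 2 / length


# Toy sequence with a perfect period of 3 (hypothetical data).
seq = "ATG" * 8
spectrum = [structure_factor(seq, "A", n) for n in range(len(seq))]

# A period-3 pattern in a sequence of length 24 peaks at n = 24 / 3 = 8.
peak = max(range(1, len(seq) // 2 + 1), key=lambda n: spectrum[n])
print(peak, round(sum(spectrum), 6), seq.count("A"))
```

    In real sequences the peak heights would have to be compared against those expected for random sequences of the same composition, which is the question the abstract addresses.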

  19. Databases

    Directory of Open Access Journals (Sweden)

    Nick Ryan

    2004-01-01

    Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.

  20. Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules—Search Options and Applications in Food Science

    Directory of Open Access Journals (Sweden)

    Piotr Minkiewicz

    2016-12-01

    Full Text Available Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotated with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of daily intake of bioactive compounds, prediction of metabolism of food components and their biological activity, as well as for prediction of interactions between food components and drugs.

  1. The Open Spectral Database: an open platform for sharing and searching spectral data.

    Science.gov (United States)

    Chalk, Stuart J

    2016-01-01

    A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. As a result, searching and retrieving such spectral data can be time consuming, and difficult to reuse if the data is compressed in the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, searching based on multiple chemical identifiers, and is open in terms of license and access. To address these issues a new online resource called the Open Spectral Database (OSDB) http://osdb.info/ has been developed and is now available. Built using open source tools, using open code (hosted on GitHub), providing open data, and open to community input about design and functionality, the OSDB is available for anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, internal architecture, export formats, Representational State Transfer (REST) Application Programming Interface and options for submission of data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/, and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP Framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities to easily develop REST based websites for ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent ones in other scripting languages) will be leveraged to make more chemical data available for both humans and computers.

  2. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

    Full Text Available Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS v.2) database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR) protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG) to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores

  3. The UK National DNA Database: Implementation of the Protection of Freedoms Act 2012.

    Science.gov (United States)

    Amankwaa, Aaron Opoku; McCartney, Carole

    2018-03-01

    In 2008, the European Court of Human Rights, in S and Marper v the United Kingdom, ruled that a retention regime that permits the indefinite retention of DNA records of both convicted and non-convicted ("innocent") individuals is disproportionate. The court noted that there was inadequate evidence to justify the retention of DNA records of the innocent. Since the Marper ruling, the laws governing the taking, use, and retention of forensic DNA in England and Wales have changed with the enactment of the Protection of Freedoms Act 2012 (PoFA). This Act, put briefly, permits the indefinite retention of DNA profiles of most convicted individuals and temporal retention for some first-time convicted minors and innocent individuals on the National DNA Database (NDNAD). The PoFA regime was implemented in October 2013. This paper examines ten post-implementation reports of the NDNAD Strategy Board (3), the NDNAD Ethics Group (3) and the Office of the Biometrics Commissioner (OBC) (4). Overall, the reports highlight a considerable improvement in the performance of the database, with a current match rate of 63.3%. Further, the new regime has strengthened the genetic privacy protection of UK citizens. The OBC reports detail implementation challenges ranging from technical, legal and procedural issues to sufficient understanding of the requirements of PoFA by police forces. Risks highlighted in these reports include the deletion of some "retainable" profiles, which could potentially lead to future crimes going undetected. A further risk is the illegal retention of some profiles from innocent individuals, which may lead to privacy issues and legal challenges. In conclusion, the PoFA regime appears to be working well, however, critical research is still needed to evaluate its overall efficacy compared to other retention regimes. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology

    International Nuclear Information System (INIS)

    Shen Yang; Bax, Ad

    2007-01-01

    Chemical shifts of nuclei in or attached to a protein backbone are exquisitely sensitive to their local environment. A computer program, SPARTA, is described that uses this correlation with local structure to predict protein backbone chemical shifts, given an input three-dimensional structure, by searching a newly generated database for triplets of adjacent residues that provide the best match in φ/ψ/χ₁ torsion angles and sequence similarity to the query triplet of interest. The database contains ¹⁵N, ¹HN, ¹Hα, ¹³Cα, ¹³Cβ and ¹³C' chemical shifts for 200 proteins for which a high resolution X-ray (≤2.4 Å) structure is available. The relative importance of the weighting factors for the φ/ψ/χ₁ angles and sequence similarity was optimized empirically. The weighted, average secondary shifts of the central residues in the 20 best-matching triplets, after inclusion of nearest neighbor, ring current, and hydrogen bonding effects, are used to predict chemical shifts for the protein of known structure. Validation shows good agreement between the SPARTA-predicted and experimental shifts, with standard deviations of 2.52, 0.51, 0.27, 0.98, 1.07 and 1.08 ppm for ¹⁵N, ¹HN, ¹Hα, ¹³Cα, ¹³Cβ and ¹³C', respectively, including outliers.
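
    The triplet search summarized in this record can be illustrated with a minimal sketch. It is not the SPARTA implementation: the score function, the weights `w_angle` and `w_seq`, the identity-based sequence comparison, the inverse-score weighting and the record layout (`seq`, `angles`, `shift`) are all assumptions chosen for illustration, and the nearest-neighbor, ring-current and hydrogen-bonding corrections are left out.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def triplet_score(query, entry, w_angle=1.0, w_seq=30.0):
    """Hypothetical mismatch score (lower is better).

    Combines the summed torsion-angle differences with a penalty for each
    residue position whose amino acid differs between query and entry.
    """
    angle_term = sum(angle_diff(q, e)
                     for q, e in zip(query["angles"], entry["angles"]))
    seq_term = sum(qa != ea for qa, ea in zip(query["seq"], entry["seq"]))
    return w_angle * angle_term + w_seq * seq_term


def predict_shift(query, database, n_best=20):
    """Weighted average shift of the n_best best-matching database triplets."""
    ranked = sorted(database, key=lambda e: triplet_score(query, e))[:n_best]
    # Assumed weighting: better matches (lower score) count more.
    weights = [1.0 / (1.0 + triplet_score(query, e)) for e in ranked]
    return sum(w * e["shift"] for w, e in zip(weights, ranked)) / sum(weights)
```

    With a toy database of dictionaries such as `{"seq": "AGA", "angles": [-60, -45, -60, -45, -60, -45], "shift": 1.0}`, calling `predict_shift(query, database)` returns the weighted mean of the shifts stored for the closest triplets.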

  5. Matrix-product-state simulation of an extended Brueschweiler bulk-ensemble database search

    International Nuclear Information System (INIS)

    SaiToh, Akira; Kitagawa, Masahiro

    2006-01-01

    Brueschweiler's database search in a spin Liouville space can be efficiently simulated on a conventional computer without error as long as the simulation cost of the internal circuit of an oracle function is polynomial, unlike the fact that in true NMR experiments, it suffers from an exponential decrease in the variation of a signal intensity. With the simulation method using the matrix-product-state proposed by Vidal [G. Vidal, Phys. Rev. Lett. 91, 147902 (2003)], we perform such a simulation. We also show the extensions of the algorithm without utilizing the J-coupling or DD-coupling splitting of frequency peaks in observation: searching can be completed with a single query in polynomial postoracle circuit complexities in an extension; multiple solutions of an oracle can be found in another extension whose query complexity is linear in the key length and in the number of solutions (this extension is to find all of marked keys). These extended algorithms are also simulated with the same simulation method

  6. Decision making in family medicine: randomized trial of the effects of the InfoClinique and Trip database search engines.

    Science.gov (United States)

    Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France

    2013-10-01

    To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Randomized trial. Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Fifteen second-year family medicine residents. Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine's effect on the decision-making process in clinical practice. Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7

  7. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical...... analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF...... on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning...

  8. Update History of This Database - RMOS | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available RMOS Update History of This Database. Date, update contents: 2015/10/27 RMOS English archive si...12 RMOS (http://cdna01.dna.affrc.go.jp/RMOS/) is opened.

  9. Information retrieval from the INIS database. Is the new online search system poorer than the old one?

    International Nuclear Information System (INIS)

    Adamek, Petr

    2011-01-01

    A brief overview of the search options for the INIS database is presented, categorized into offline and online systems, and their assets and drawbacks are described. In the Online section, the old system on the BASIS platform and the new system on the Google Search Appliance platform are compared. The capabilities of the new system seem to be more limited than those of the old system. (author)

  10. Searching for evidence of selection in avian DNA barcodes.

    Science.gov (United States)

    Kerr, Kevin C R

    2011-11-01

    The barcode of life project has assembled a tremendous number of mitochondrial cytochrome c oxidase I (COI) sequences. Although these sequences were gathered to develop a DNA-based system for species identification, it has been suggested that further biological inferences may also be derived from this wealth of data. Recurrent selective sweeps have been invoked as an evolutionary mechanism to explain limited intraspecific COI diversity, particularly in birds, but this hypothesis has not been formally tested. In this study, I collated COI sequences from previous barcoding studies on birds and tested them for evidence of selection. Using this expanded data set, I re-examined the relationships between intraspecific diversity and interspecific divergence and sampling effort, respectively. I employed the McDonald-Kreitman test to test for neutrality in sequence evolution between closely related pairs of species. Because amino acid sequences were generally constrained between closely related pairs, I also included broader intra-order comparisons to quantify patterns of protein variation in avian COI sequences. Lastly, using 22 published whole mitochondrial genomes, I compared the evolutionary rate of COI against the other 12 protein-coding mitochondrial genes to assess intragenomic variability. I found no conclusive evidence of selective sweeps. Most evidence pointed to an overall trend of strong purifying selection and functional constraint. The COI protein did vary across the class Aves, but to a very limited extent. COI was the least variable gene in the mitochondrial genome, suggesting that other genes might be more informative for probing factors constraining mitochondrial variation within species. © 2011 Blackwell Publishing Ltd.

  11. A framework for intelligent data acquisition and real-time database searching for shotgun proteomics.

    Science.gov (United States)

    Graumann, Johannes; Scheltema, Richard A; Zhang, Yong; Cox, Jürgen; Mann, Matthias

    2012-03-01

    In the analysis of complex peptide mixtures by MS-based proteomics, many more peptides elute at any given time than can be identified and quantified by the mass spectrometer. This makes it desirable to optimally allocate peptide sequencing and narrow mass range quantification events. In computer science, intelligent agents are frequently used to make autonomous decisions in complex environments. Here we develop and describe a framework for intelligent data acquisition and real-time database searching and showcase selected examples. The intelligent agent is implemented in the MaxQuant computational proteomics environment, termed MaxQuant Real-Time. It analyzes data as it is acquired on the mass spectrometer, constructs isotope patterns and SILAC pair information as well as controls MS and tandem MS events based on real-time and prior MS data or external knowledge. Re-implementing a top10 method in the intelligent agent yields similar performance to the data dependent methods running on the mass spectrometer itself. We demonstrate the capabilities of MaxQuant Real-Time by creating a real-time search engine capable of identifying peptides "on-the-fly" within 30 ms, well within the time constraints of a shotgun fragmentation "topN" method. The agent can focus sequencing events onto peptides of specific interest, such as those originating from a specific gene ontology (GO) term, or peptides that are likely modified versions of already identified peptides. Finally, we demonstrate enhanced quantification of SILAC pairs whose ratios were poorly defined in survey spectra. MaxQuant Real-Time is flexible and can be applied to a large number of scenarios that would benefit from intelligent, directed data acquisition. Our framework should be especially useful for new instrument types, such as the quadrupole-Orbitrap, that are currently becoming available.

  12. Complementary Value of Databases for Discovery of Scholarly Literature: A User Survey of Online Searching for Publications in Art History

    Science.gov (United States)

    Nemeth, Erik

    2010-01-01

    Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…

  13. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease.

    Science.gov (United States)

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia

    2018-01-04

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Testing search strategies for systematic reviews in the Medline literature database through PubMed.

    Science.gov (United States)

    Volpato, Enilze S N; Betini, Marluci; El Dib, Regina

    2014-04-01

    A high-quality electronic search is essential in ensuring accuracy and completeness in retrieved records for the conducting of a systematic review. We analysed the available sample of search strategies to identify the best method for searching in Medline through PubMed, considering the use or non-use of parentheses, double quotation marks and truncation, and the use of a simple search or the search history. In our cross-sectional study of search strategies, we selected and analysed the available searches performed during evidence-based medicine classes and in systematic reviews conducted in the Botucatu Medical School, UNESP, Brazil. We analysed 120 search strategies. With regard to the use of phrase searches with parentheses, there was no difference between the results with and without parentheses, or between simple searches and search history tools, in 100% of the sample analysed (P = 1.0). The number of results retrieved by the searches analysed was smaller using double quotation marks and using truncation compared with the standard strategy (P = 0.04 and P = 0.08, respectively). There is no need to use phrase-searching parentheses to retrieve studies; however, we recommend the use of double quotation marks when an investigator attempts to retrieve articles in which a term appears exactly as it was entered in the search form. Furthermore, we do not recommend the use of truncation in search strategies in Medline via PubMed. Although the results of simple searches or search history tools were the same, we recommend using the latter.
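
    The query forms compared in this record (plain, parenthesized, double-quoted and truncated) can be sketched as simple string construction. The function name, the `stem_len` parameter and the example terms are illustrative assumptions; the sketch only builds search strings, using the asterisk as the truncation wildcard, and sends nothing to PubMed.

```python
def build_variants(term, stem_len=None):
    """Return the search-string variants for one term.

    term     -- the word or phrase to search for
    stem_len -- if given, number of leading characters kept before the
                truncation wildcard (hypothetical parameter)
    """
    term = term.strip()
    variants = {
        "plain": term,               # standard strategy
        "parentheses": f"({term})",  # phrase wrapped in parentheses
        "quoted": f'"{term}"',       # exact-phrase search
    }
    if stem_len is not None:
        # Truncation: keep the stem and append the wildcard.
        variants["truncated"] = term[:stem_len] + "*"
    return variants
```

    For instance, `build_variants("heart failure")` yields the plain, parenthesized and quoted forms of the phrase, and passing `stem_len=6` with the term `randomized` adds the truncated form `random*`.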

  15. Preparing College Students To Search Full-Text Databases: Is Instruction Necessary?

    Science.gov (United States)

    Riley, Cheryl; Wales, Barbara

    Full-text databases allow Central Missouri State University's clients to access some of the serials that libraries have had to cancel due to escalating subscription costs; EbscoHost, the subject of this study, is one such database. The database is available free to all Missouri residents. A survey was designed consisting of 21 questions intended…

  16. Characterization of new Schistosoma mansoni microsatellite loci in sequences obtained from public DNA databases and microsatellite enriched genomic libraries

    Directory of Open Access Journals (Sweden)

    Rodrigues NB

    2002-01-01

    Full Text Available In the last decade microsatellites have become one of the most useful genetic markers used in a large number of organisms due to their abundance and high level of polymorphism. Microsatellites have been used for individual identification, paternity tests, forensic studies and population genetics. Data on microsatellite abundance comes preferentially from microsatellite enriched libraries and DNA sequence databases. We have conducted a search in GenBank of more than 16,000 Schistosoma mansoni ESTs and 42,000 BAC sequences. In addition, we obtained 300 sequences from CA and AT microsatellite enriched genomic libraries. The sequences were searched for simple repeats using the RepeatMasker software. Of 16,022 ESTs, we detected 481 (3%) sequences that contained 622 microsatellites (434 perfect, 164 imperfect and 24 compounds). Of the 481 ESTs, 194 were grouped in 63 clusters containing 2 to 15 ESTs per cluster. Polymorphisms were observed in 16 clusters. The 287 remaining ESTs were orphan sequences. Of the 42,017 BAC end sequences, 1,598 (3.8%) contained microsatellites (2,335 perfect, 287 imperfect and 79 compounds). Of the 1,598 BAC end sequences, 80 were grouped into 17 clusters containing 3 to 17 BAC end sequences per cluster. Microsatellites were present in 67 out of 300 sequences from microsatellite enriched libraries (55 perfect, 38 imperfect and 15 compounds). From all of the observed loci, 55 were selected for having the longest perfect repeats and flanking regions that allowed the design of primers for PCR amplification. Additionally we describe two new polymorphic microsatellite loci.
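
    The study above used RepeatMasker to find simple repeats. As a much simpler stand-in, the sketch below finds only perfect tandem repeats with a regular expression; it does not reproduce RepeatMasker's behaviour, and the function name, the motif-length range and the `min_repeats` threshold are assumptions chosen for illustration.

```python
import re


def find_perfect_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    A motif of min_unit..max_unit bases must occur at least min_repeats
    times in a row. Imperfect and compound repeats are not detected.
    """
    seq = seq.upper()
    # Lazy quantifier on the motif so the shortest repeating unit is chosen.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAA... reported as motif "AA".
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits
```

    Calling the function on a sequence that contains a run of eight CA units reports one hit with motif `CA` and a repeat count of 8, together with the 0-based start position of the run.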

  17. Oracle Database 10g: a platform for BLAST search and Regular Expression pattern matching in life sciences.

    Science.gov (United States)

    Stephens, Susie M; Chen, Jake Y; Davidson, Marcel G; Thomas, Shiby; Trute, Barry M

    2005-01-01

    As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and consequently can be used as flexible platforms for the implementation of knowledgebases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disks to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and Regular Expression Searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.html.

  18. Spot table - RPD | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ...d_spot.zip File URL: ftp://ftp.biosciencedbc.jp/archive/rpd/LATEST/rpd_spot.zip F... cDNA. (multiple entries)

  19. The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility.

    Science.gov (United States)

    Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid

    2014-03-01

    National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The database is designed and implemented on a three-tier model using a MySQL relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow an autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful for improving research in human molecular genetics, disease diagnosis and the design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma.

  20. NIMS structural materials databases and cross search engine - MatNavi

    Energy Technology Data Exchange (ETDEWEB)

    Yamazaki, M.; Xu, Y.; Murata, M.; Tanaka, H.; Kamihira, K.; Kimura, K. [National Institute for Materials Science, Tokyo (Japan)

    2007-06-15

    The Materials Database Station (MDBS) of the National Institute for Materials Science (NIMS) owns the world's largest Internet materials database for academic and industrial purposes, which is composed of twelve databases: five concerning structural materials, five concerning basic physical properties, one for superconducting materials and one for polymers. All of these databases are open to Internet access at http://mits.nims.go.jp/en. Online tools for predicting properties of polymers and composite materials are also available. The NIMS structural materials databases comprise the online version of the structural materials data sheets (creep, fatigue, corrosion and space-use materials strength), a microstructure database for crept materials, a pressure vessel materials database and CCT diagrams for welding. (orig.)

  1. A two-locus DNA sequence database for typing plant and human pathogens within the Fusarium oxysporum species complex

    DEFF Research Database (Denmark)

    O'Donnell, Kerry; Gueidan, C; Sink, S

    2009-01-01

    We constructed a two-locus database, comprising partial translation elongation factor (EF-1alpha) gene sequences and nearly full-length sequences of the nuclear ribosomal intergenic spacer region (IGS rDNA) for 850 isolates spanning the phylogenetic breadth of the Fusarium oxysporum species compl...... of the IGS rDNA sequences may be non-orthologous. We also evaluated enniatin, fumonisin and moniliformin mycotoxin production in vitro within a phylogenetic framework....

  2. Methods and pitfalls in searching drug safety databases utilising the Medical Dictionary for Regulatory Activities (MedDRA).

    Science.gov (United States)

    Brown, Elliot G

    2003-01-01

    The Medical Dictionary for Regulatory Activities (MedDRA) is a unified standard terminology for recording and reporting adverse drug event data. Its introduction is widely seen as a significant improvement on the previous situation, where a multitude of terminologies of widely varying scope and quality were in use. However, there are some complexities that may cause difficulties, and these will form the focus for this paper. Two methods of searching MedDRA-coded databases are described: searching based on term selection from all of MedDRA and searching based on terms in the safety database. There are several potential traps for the unwary in safety searches. There may be multiple locations of relevant terms within a system organ class (SOC) and lack of recognition of appropriate group terms; the user may think that group terms are more inclusive than is the case. MedDRA may distribute terms relevant to one medical condition across several primary SOCs. If the database supports the MedDRA model, it is possible to perform multiaxial searching: while this may help find terms that might have been missed, it is still necessary to consider the entire contents of the SOCs to find all relevant terms and there are many instances of incomplete secondary linkages. It is important to adjust for multiaxiality if data are presented using primary and secondary locations. Other sources for errors in searching are non-intuitive placement and the selection of terms as preferred terms (PTs) that may not be widely recognised. Some MedDRA rules could also result in errors in data retrieval if the individual is unaware of these: in particular, the lack of multiaxial linkages for the Investigations SOC, Social circumstances SOC and Surgical and medical procedures SOC and the requirement that a PT may only be present under one High Level Term (HLT) and one High Level Group Term (HLGT) within any single SOC. Special Search Categories (collections of PTs assembled from various SOCs by

  3. Identifying complications of interventional procedures from UK routine healthcare databases: a systematic search for methods using clinical codes.

    Science.gov (United States)

    Keltie, Kim; Cole, Helen; Arber, Mick; Patrick, Hannah; Powell, John; Campbell, Bruce; Sims, Andrew

    2014-11-28

    Several authors have developed and applied methods to routine data sets to identify the nature and rate of complications following interventional procedures. But, to date, there has been no systematic search for such methods. The objective of this article was to find, classify and appraise published methods, based on analysis of clinical codes, which used routine healthcare databases in a United Kingdom setting to identify complications resulting from interventional procedures. A literature search strategy was developed to identify published studies that referred, in the title or abstract, to the name or acronym of a known routine healthcare database and to complications from procedures or devices. The following data sources were searched in February and March 2013: Cochrane Methods Register, Conference Proceedings Citation Index - Science, Econlit, EMBASE, Health Management Information Consortium, Health Technology Assessment database, MathSciNet, MEDLINE, MEDLINE in-process, OAIster, OpenGrey, Science Citation Index Expanded and ScienceDirect. Of the eligible papers, those which reported methods using clinical coding were classified and summarised in tabular form using the following headings: routine healthcare database; medical speciality; method for identifying complications; length of follow-up; method of recording comorbidity. The benefits and limitations of each approach were assessed. From 3688 papers identified from the literature search, 44 reported the use of clinical codes to identify complications, from which four distinct methods were identified: 1) searching the index admission for specified clinical codes, 2) searching a sequence of admissions for specified clinical codes, 3) searching for specified clinical codes for complications from procedures and devices within the International Classification of Diseases 10th revision (ICD-10) coding scheme which is the methodology recommended by NHS Classification Service, and 4) conducting manual clinical

  4. Review and Comparison of the Search Effectiveness and User Interface of Three Major Online Chemical Databases

    Science.gov (United States)

    Bharti, Neelam; Leonard, Michelle; Singh, Shailendra

    2016-01-01

    Online chemical databases are the largest source of chemical information and, therefore, the main resource for retrieving results from published journals, books, patents, conference abstracts, and other relevant sources. Various commercial, as well as free, chemical databases are available. SciFinder, Reaxys, and Web of Science are three major…

  5. Policy implications for familial searching

    OpenAIRE

    Kim, Joyce; Mammo, Danny; Siegel, Marni B; Katsanis, Sara H

    2011-01-01

    Abstract In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforce...

  6. DMPD: The actions of bacterial DNA on murine macrophages. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 10534106 The actions of bacterial DNA on murine macrophages. Sester DP, Stacey KJ, ... Show The actions of bacterial DNA on murine macrophages. PubmedID 10534106 Title The actions of bacterial DNA on murine macrophage

  7. Update History of This Database - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update History of This Database - RED. 2015/12/21 Rice Expression Database English archive site is opened. 2000/10/1 Rice Expression Database ( http://red.dna.affrc.go.jp/RED/ ) is opened.

  8. Update History of This Database - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update History of This Database - RPD. 2016/02/02 Rice Proteome Database English archive site is opened. 2003/01/07 Rice Proteome Database ( http://gene64.dna.affrc.go.jp/RPD/ ) is opened.

  9. Fast quantum search algorithm for databases of arbitrary size and its implementation in a cavity QED system

    International Nuclear Information System (INIS)

    Li, H.Y.; Wu, C.W.; Liu, W.T.; Chen, P.X.; Li, C.Z.

    2011-01-01

    We propose a method for implementing the Grover search algorithm directly in a database containing any number of items based on multi-level systems. Compared with the searching procedure in the database with qubits encoding, our modified algorithm needs fewer iteration steps to find the marked item and uses the carriers of the information more economically. Furthermore, we illustrate how to realize our idea in cavity QED using Zeeman's level structure of atoms. And the numerical simulation under the influence of the cavity and atom decays shows that the scheme could be achieved efficiently within current state-of-the-art technology. -- Highlights: ► A modified Grover algorithm is proposed for searching in an arbitrary dimensional Hilbert space. ► Our modified algorithm requires fewer iteration steps to find the marked item. ► The proposed method uses the carriers of the information more economically. ► A scheme for a six-item Grover search in cavity QED is proposed. ► Numerical simulation under decays shows that the scheme can be achieved with enough fidelity.
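
    As a rough illustration of the search described in this abstract, the following is a minimal classical simulation of the textbook Grover iteration acting directly on an N-dimensional state vector, with no qubit encoding. It is a sketch under stated assumptions, not the authors' modified algorithm or their cavity QED scheme: the function name `grover_search` and the iteration count floor(pi/4 * sqrt(N)) are choices made here for illustration. The six-item case mirrors the database size mentioned in the highlights.

```python
import math

def grover_search(n_items, marked):
    """Simulate textbook Grover iterations on an n_items-dimensional state.

    Illustrative only: the iteration count below is the usual
    floor(pi/4 * sqrt(N)) estimate, not the count used in the paper.
    """
    # Start in the uniform superposition over all items.
    amplitudes = [1 / math.sqrt(n_items)] * n_items
    iterations = max(1, math.floor(math.pi / 4 * math.sqrt(n_items)))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]
    probabilities = [a * a for a in amplitudes]
    return probabilities, iterations

probabilities, steps = grover_search(6, marked=2)
print(steps, round(probabilities[2], 4))
```

    With six items, a single iteration already concentrates most of the probability on the marked item, which is why small databases of arbitrary size are a convenient test case.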

  10. Fluorescence- and capillary electrophoresis (CE)-based SSR DNA fingerprinting and a molecular identity database for the Louisiana sugarcane industry

    Science.gov (United States)

    A database of Louisiana sugarcane molecular identity has been constructed and is being updated annually using FAM or HEX or NED fluorescence- and capillary electrophoresis (CE)-based microsatellite (SSR) fingerprinting information. The fingerprints are PCR-amplified from leaf DNA samples of current ...

  11. Supervised learning of tools for content-based search of image databases

    Science.gov (United States)

    Delanoy, Richard L.

    1996-03-01

    A computer environment, called the Toolkit for Image Mining (TIM), is being developed with the goal of enabling users with diverse interests and varied computer skills to create search tools for content-based image retrieval and other pattern matching tasks. Search tools are generated using a simple paradigm of supervised learning that is based on the user pointing at mistakes of classification made by the current search tool. As mistakes are identified, a learning algorithm uses the identified mistakes to build up a model of the user's intentions, construct a new search tool, apply the search tool to a test image, display the match results as feedback to the user, and accept new inputs from the user. Search tools are constructed in the form of functional templates, which are generalized matched filters capable of knowledge-based image processing. The ability of this system to learn the user's intentions from experience contrasts with other existing approaches to content-based image retrieval that base searches on the characteristics of a single input example or on a predefined and semantically-constrained textual query. Currently, TIM is capable of learning spectral and textural patterns, but should be adaptable to the learning of shapes, as well. Possible applications of TIM include not only content-based image retrieval, but also quantitative image analysis, the generation of metadata for annotating images, data prioritization or data reduction in bandwidth-limited situations, and the construction of components for larger, more complex computer vision algorithms.

  12. Mascot search results - CREATE portal | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  13. Google Scholar Out-Performs Many Subscription Databases when Keyword Searching. A Review of: Walters, W. H. (2009). Google Scholar search performance: Comparative recall and precision. portal: Libraries and the Academy, 9(1), 5-24.

    Directory of Open Access Journals (Sweden)

    Giovanna Badia

    2010-09-01

    Full Text Available Objective – To compare the search performance (i.e., recall and precision) of Google Scholar with that of 11 other bibliographic databases when using a keyword search to find references on later-life migration. Design – Comparative database evaluation. Setting – Not stated in the article. It appears from the author’s affiliation that this research took place in an academic institution of higher learning. Subjects – Twelve databases were compared: Google Scholar, Academic Search Elite, AgeLine, ArticleFirst, EconLit, Geobase, Medline, PAIS International, Popline, Social Sciences Abstracts, Social Sciences Citation Index, and SocIndex. Methods – The relevant literature on later-life migration was pre-identified as a set of 155 journal articles published from 1990 to 2000. The author selected these articles from database searches, citation tracking, journal scans, and consultations with social sciences colleagues. Each database was evaluated with regard to its performance in finding references to these 155 papers. Elderly and migration were the keywords used to conduct the searches in each of the 12 databases, since these were the words that were the most frequently used in the titles of the 155 relevant articles. The search was performed in the most basic search interface of each database that allowed limiting results by the needed publication dates (1990-2000). Search results were sorted by relevance when possible (for 9 out of the 12 databases), and by date when the relevance sorting option was not available. Recall and precision statistics were then calculated from the search results. Recall is the number of relevant results obtained in the database for a search topic, divided by all the potential results which can be obtained on that topic (in this case, 155 references). Precision is the number of relevant results obtained in the database for a search topic, divided by the total number of results that were obtained in the database on
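
    The recall and precision definitions given in this abstract can be sketched in a few lines. The figure of 155 relevant articles comes from the abstract; the retrieved set below (100 results, 62 of them relevant) is invented purely for illustration and is not a result reported by the study. The function name `recall_and_precision` is likewise an assumption.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for a set of retrieved record ids.

    recall    = relevant results retrieved / all relevant records
    precision = relevant results retrieved / all results retrieved
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# 155 pre-identified relevant articles, as in the abstract.
relevant_ids = set(range(155))
# Hypothetical search output: 62 relevant hits plus 38 irrelevant records.
retrieved_ids = set(range(62)) | set(range(1000, 1038))

recall, precision = recall_and_precision(retrieved_ids, relevant_ids)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

    Using sets of record identifiers keeps duplicates in a result list from inflating either statistic.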

  14. Verification of Single-Peptide Protein Identifications by the Application of Complementary Database Search Algorithms

    National Research Council Canada - National Science Library

    Rohrbough, James G; Breci, Linda; Merchant, Nirav; Miller, Susan; Haynes, Paul A

    2005-01-01

    .... One such technique, known as the Multi-Dimensional Protein Identification Technique, or MudPIT, involves the use of computer search algorithms that automate the process of identifying proteins...

  15. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

    Science.gov (United States)

    Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    International Nuclear Information System (INIS)

    Ginzinger, Simon W.; Coles, Murray

    2009-01-01

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods

  17. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    Energy Technology Data Exchange (ETDEWEB)

    Ginzinger, Simon W. [Center of Applied Molecular Engineering, University of Salzburg, Department of Molecular Biology, Division of Bioinformatics (Austria)], E-mail: simon@came.sbg.ac.at; Coles, Murray [Max-Planck-Institute for Developmental Biology, Department of Protein Evolution (Germany)], E-mail: Murray.Coles@tuebingen.mpg.de

    2009-03-15

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.

  18. cDNA library information - Dicty_cDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Dicty_cDB cDNA library information Data detail Data name cDNA library information DOI 10.189...s Data item Description cDNA library name Names of cDNA libraries (AF, AH, CF, CH, FC, FC-IC, FCL, SF, SH, S...(C) 5) sexually fusion-competent KAX3 cells (Gamete phase) (F) cDNA library construction method How to construct cDNA library...dir) 2) Full-length cDNA libraries (oligocapped method)(fl) 3) Gamete-specific subtraction library (sub) cDNA library... construction protocol Link to the webpage describing the protocol for generating cDNA library Size

  19. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    Science.gov (United States)

    2002-01-01

    to the OODBMS approach. The ORDBMS approach produced such research prototypes as Postgres [155], and Starburst [67] and commercial products such as...Kemnitz. The POSTGRES Next-Generation Database Management System. Communications of the ACM, 34(10):78–92, 1991. [156] Michael Stonebreaker and Dorothy

  20. Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

    Science.gov (United States)

    Porter, Brandi

    2009-01-01

    Millennial students make up a large portion of undergraduate students attending colleges and universities, and they have a variety of online resources available to them to complete academically related information searches, primarily Web based and library-based online information retrieval systems. The content, ease of use, and required search…

  1. Federated Search Tools in Fusion Centers: Bridging Databases in the Information Sharing Environment

    Science.gov (United States)

    2012-09-01

    Suspicious Activity Reporting Initiative ODNI Office of the Director of National Intelligence OSINT Open Source Intelligence PERF Police Executive...Fusion centers are encouraged to explore all available information sources to enhance the intelligence analysis process. It follows then that fusion...WSIC also utilizes ACCURINT, a web-based, subscription service. ACCURINT searches open source information and is able to collect and collate

  2. Combining history of medicine and library instruction: an innovative approach to teaching database searching to medical students.

    Science.gov (United States)

    Timm, Donna F; Jones, Dee; Woodson, Deidra; Cyrus, John W

    2012-01-01

    Library faculty members at the Health Sciences Library at the LSU Health Shreveport campus offer a database searching class for third-year medical students during their surgery rotation. For a number of years, students completed "ten-minute clinical challenges," but the instructors decided to replace the clinical challenges with innovative exercises using The Edwin Smith Surgical Papyrus to emphasize concepts learned. The Surgical Papyrus is an online resource that is part of the National Library of Medicine's "Turning the Pages" digital initiative. In addition, vintage surgical instruments and historic books are displayed in the classroom to enhance the learning experience.

  3. DMPD: Intracellular DNA sensors in immunity. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 18573338 Intracellular DNA sensors in immunity. Takeshita F, Ishii KJ. Curr Opin Im...munol. 2008 Aug;20(4):383-8. Epub 2008 Jun 23. (.png) (.svg) (.html) (.csml) Show Intracellular DNA sensors ...in immunity. PubmedID 18573338 Title Intracellular DNA sensors in immunity. Authors Takeshita F, Ishii KJ. P

  4. DMPD: All is not Toll: new pathways in DNA recognition. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 16446382 All is not Toll: new pathways in DNA recognition. Wagner H, Bauer S. J Exp... Med. 2006 Feb 20;203(2):265-8. Epub 2006 Jan 30. (.png) (.svg) (.html) (.csml) Show All is not Toll: new pathways in DNA recognition.... PubmedID 16446382 Title All is not Toll: new pathways in DNA recognition. Authors

  5. DOT Online Database

    Science.gov (United States)

    Page Home Table of Contents Contents Search Database Search Login Login Databases Advisory Circulars accessed by clicking below: Full-Text WebSearch Databases Database Records Date Advisory Circulars 2092 5 data collection and distribution policies. Document Database Website provided by MicroSearch

  6. Crescendo: A Protein Sequence Database Search Engine for Tandem Mass Spectra.

    Science.gov (United States)

    Wang, Jianqi; Zhang, Yajie; Yu, Yonghao

    2015-07-01

    A search engine that reliably discovers more peptides is essential to progress in computational proteomics. We propose two new scoring functions (L- and P-scores), which aim to capture similar characteristics of a peptide-spectrum match (PSM) as Sequest and Comet do. Crescendo, introduced here, is a software program that implements these two scores for peptide identification. We applied Crescendo to test datasets and compared its performance with widely used search engines, including Mascot, Sequest, and Comet. The results indicate that Crescendo identifies a similar or larger number of peptides at various predefined false discovery rates (FDR). Importantly, it also provides a better separation between the true and decoy PSMs, warranting the future development of a companion post-processing filtering algorithm.
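
    The abstract compares engines at predefined false discovery rates using decoy PSMs but does not say how the FDR is computed. The sketch below shows one common target-decoy estimate (decoy hits divided by target hits above a score threshold); it is an assumption about the general approach, not Crescendo's L- or P-score, and the PSM scores are made up for illustration.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoys / targets among PSMs scoring >= threshold.

    psms is an iterable of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

def lowest_threshold_for_fdr(psms, max_fdr):
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            best = threshold
    return best

# Hypothetical PSMs: (score, is_decoy)
psms = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (6.8, False), (6.1, True), (5.4, False),
]
print(lowest_threshold_for_fdr(psms, max_fdr=0.25))
```

    A scoring function that separates true from decoy PSMs more cleanly lets the threshold drop further before the estimated FDR exceeds the chosen limit, so more peptides are accepted at the same FDR.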

  7. The Magnetics Information Consortium (MagIC) Online Database: Uploading, Searching and Visualizing Paleomagnetic and Rock Magnetic Data

    Science.gov (United States)

    Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Pisarevsky, S. A.; Jackson, M.; Solheid, P.; Banerjee, S.; Johnson, C.

    2006-12-01

    The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all measurements and the derived properties for studies of paleomagnetic directions (inclination, declination) and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and has two search nodes, one for paleomagnetism and one for rock magnetism. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. The query result set is displayed in a digestible tabular format allowing the user to descend through hierarchical levels such as from locations to sites, samples, specimens, and measurements. At each stage, the result set can be saved and, if supported by the data, can be visualized by plotting global location maps, equal area plots, or typical Zijderveld, hysteresis, and various magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (Version 2.1) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload and takes only a few minutes to process several thousand data records. 
    The standardized MagIC template files are stored in the digital archives of EarthRef.org where they

  8. Heat pumps: Industrial applications. (Latest citations from the NTIS bibliographic database). Published Search

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-04-01

    The bibliography contains citations concerning design, development, and applications of heat pumps for industrial processes. Included are thermal energy exchanges based on air-to-air, ground-coupled, air-to-water, and water-to-water systems. Specific applications include industrial process heat, drying, district heating, and waste processing plants. Other Published Searches in this series cover heat pump technology and economics, and heat pumps for residential and commercial applications. (Contains 50-250 citations and includes a subject term index and title list.) (Copyright NERAC, Inc. 1995)

  9. Heat pumps: Industrial applications. (Latest citations from the NTIS bibliographic database). Published Search

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-01-01

    The bibliography contains citations concerning design, development, and applications of heat pumps for industrial processes. Included are thermal energy exchanges based on air-to-air, ground-coupled, air-to-water, and water-to-water systems. Specific applications include industrial process heat, drying, district heating, and waste processing plants. Other Published Searches in this series cover heat pump technology and economics, and heat pumps for residential and commercial applications. (Contains 50-250 citations and includes a subject term index and title list.) (Copyright NERAC, Inc. 1995)

  10. Using a Native XML Database for Encoded Archival Description Search and Retrieval

    Directory of Open Access Journals (Sweden)

    Alan Cornish

    2017-09-01

    Full Text Available This article is an attempt to develop Geographic Information Systems (GIS) technology into an analytical tool for examining the relationships between the height of the bookshelves and the behavior of library readers in utilizing books within a library. The tool would contain a database to store book-use information and some GIS maps to represent bookshelves. Upon analyzing the data stored in the database, different frequencies of book use across bookshelf layers are displayed on the maps. The tool would provide a wonderful means of visualization through which analysts can quickly realize the spatial distribution of books used in a library. This article reveals that readers tend to pull books out of the bookshelf layers that are easily reachable by human eyes and hands, and thus opens some issues for librarians to reconsider the management of library collections.

  11. Heart research advances using database search engines, Human Protein Atlas and the Sydney Heart Bank.

    Science.gov (United States)

    Li, Amy; Estigoy, Colleen; Raftery, Mark; Cameron, Darryl; Odeberg, Jacob; Pontén, Fredrik; Lal, Sean; Dos Remedios, Cristobal G

    2013-10-01

    This Methodological Review is intended as a guide for research students who may have just discovered a human "novel" cardiac protein, but it may also help hard-pressed reviewers of journal submissions on a "novel" protein reported in an animal model of human heart failure. Whether you are an expert or not, you may know little or nothing about this particular protein of interest. In this review we provide a strategic guide on how to proceed. We ask: How do you discover what has been published (even in an abstract or research report) about this protein? Everyone knows how to undertake literature searches using PubMed and Medline but these are usually encyclopaedic, often producing long lists of papers, most of which are either irrelevant or only vaguely relevant to your query. Relatively few will be aware of more advanced search engines such as Google Scholar and even fewer will know about Quertle. Next, we provide a strategy for discovering if your "novel" protein is expressed in the normal, healthy human heart, and if it is, we show you how to investigate its subcellular location. This can usually be achieved by visiting the website "Human Protein Atlas" without doing a single experiment. Finally, we provide a pathway to discovering if your protein of interest changes its expression level with heart failure/disease or with ageing. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.

  12. Searching the databases: a quick look at Amazon and two other online catalogues.

    Science.gov (United States)

    Potts, Hilary

    2003-01-01

    The Amazon Online Catalogue was compared with the Library of Congress Catalogue and the British Library Catalogue, both also available online, by searching on both neutral (Gay, Lesbian, Homosexual) and pejorative (Perversion, Sex Crime) subject terms, and also by searches using Boolean logic in an attempt to identify Lesbian Fiction items and religion-based anti-gay material. Amazon was much more likely to be the first port of call for non-academic enquiries. Although excluding much material necessary for academic research, it carried more information about the individual books and less historical homophobic baggage in its terminology than the great national catalogues. Its back catalogue of second-hand books outnumbered those in print. Current attitudes may partially be gauged by the relative numbers of titles published under each heading--e.g., there may be an inverse relationship between concern about child sex abuse and homophobia, more noticeable in the U.S. because of the activities of the religious right.

  13. On-line biomedical databases-the best source for quick search of the scientific information in the biomedicine.

    Science.gov (United States)

    Masic, Izet; Milinovic, Katarina

    2012-06-01

    Most medical journals now have an electronic version, available over public networks. Although printed and electronic versions exist in parallel, the two forms need not be published simultaneously. The electronic version of a journal can be published a few weeks before the printed form and need not have identical content. The electronic form of a journal may have extensions that the printed form does not contain, such as animation, 3D display, etc., and may make available the full text, mostly in PDF or XML format, or just the contents or a summary. Access to the full text is usually not free and can be achieved only if the institution (library or host) enters into an agreement on access. Many medical journals, however, provide free access to some articles, or to the complete content after a certain time (after 6 months or a year). Network archives such as High Wire Press and Free Medical Journals.com provide search for such journals. It is necessary to single out PubMed and PubMed Central, the first public digital archives to collect journals of available medical literature without restriction, which operate within the system of the National Library of Medicine in Bethesda (USA). There are also so-called on-line medical journals published only in electronic form, which can be searched through on-line databases. In this paper the authors briefly describe about 30 databases and give short instructions on how to access and search the published papers in indexed medical journals.

  14. DMPD: Cytosolic DNA recognition for triggering innate immune responses. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 18280611 Cytosolic DNA recognition for triggering innate immune responses. Takaoka ...A, Taniguchi T. Adv Drug Deliv Rev. 2008 Apr 29;60(7):847-57. Epub 2007 Dec 31. (.png) (.svg) (.html) (.csml) Show Cytosol...ic DNA recognition for triggering innate immune responses. PubmedID 18280611 Title Cytosolic D

  15. DMPD: Innate immune recognition of, and regulation by, DNA. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 16979939 Innate immune recognition of, and regulation by, DNA. Ishii KJ, Akira S. T...rends Immunol. 2006 Nov;27(11):525-32. Epub 2006 Sep 18. (.png) (.svg) (.html) (.csml) Show Innate immune recognition... of, and regulation by, DNA. PubmedID 16979939 Title Innate immune recognition of, and regulation b

  16. Update History of This Database - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us RMG Update History of This Database Date Update contents 2016/08/22 The contact address is c...dna.affrc.go.jp/ ) is opened. About This Database Database Description Download License Update Hi...story of This Database Site Policy | Contact Us Update History of This Database - RMG | LSDB Archive ... ... URL of the portal site is changed. 2013/08/07 RMG archive site is opened. 2002/09/25 RMG ( http://rmg.rice.

  17. High serum folate is associated with reduced biochemical recurrence after radical prostatectomy: Results from the SEARCH Database

    Directory of Open Access Journals (Sweden)

    Daniel M. Moreira

    2013-06-01

    Full Text Available Introduction To analyze the association between serum levels of folate and risk of biochemical recurrence after radical prostatectomy among men from the Shared Equal Access Regional Cancer Hospital (SEARCH) database. Materials and Methods Retrospective analysis of 135 subjects from the SEARCH database treated between 1991 and 2009 with available preoperative serum folate levels. Patients' characteristics at the time of the surgery were analyzed with ranksum and linear regression. Uni- and multivariable analyses of folate levels (log-transformed) and time to biochemical recurrence were performed with Cox proportional hazards. Results The median preoperative folate level was 11.6 ng/mL (reference = 1.5-20.0 ng/mL). Folate levels were significantly lower among African-American men than Caucasians (P = 0.003). In univariable analysis, higher folate levels were associated with more recent year of surgery (P < 0.001) and lower preoperative PSA (P = 0.003). In univariable analysis, there was a trend towards lower risk of biochemical recurrence among men with high folate levels (HR = 0.61, 95%CI = 0.37-1.03, P = 0.064). After adjustments for patients' characteristics and pre- and post-operative clinical and pathological findings, higher serum levels of folate were independently associated with lower risk for biochemical recurrence (HR = 0.42, 95%CI = 0.20-0.89, P = 0.023). Conclusion In a cohort of men undergoing radical prostatectomy at several VAs across the country, higher serum folate levels were associated with lower PSA and lower risk for biochemical failure. While the source of the folate in the serum in this study is unknown (i.e. diet vs. supplement), these findings, if confirmed, suggest a potential role of folic acid supplementation or increased consumption of folate rich foods to reduce the risk of recurrence.
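
    The hazard ratios, confidence intervals and P values reported in this abstract can be checked against each other. The sketch below assumes the intervals are Wald intervals, symmetric on the log hazard scale, which is usual for Cox models but is not stated in the abstract; the function name `wald_p_from_hr_ci` is invented here. Because the published figures are rounded, the recovered P values are only approximate.

```python
import math

def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover an approximate z statistic and two-sided P value.

    Assumes a 95% Wald interval: log(HR) +/- z_crit * SE.
    """
    # Standard error of log(HR), from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Multivariable result from the abstract: HR = 0.42, 95%CI = 0.20-0.89.
z_multi, p_multi = wald_p_from_hr_ci(0.42, 0.20, 0.89)
# Univariable result from the abstract: HR = 0.61, 95%CI = 0.37-1.03.
z_uni, p_uni = wald_p_from_hr_ci(0.61, 0.37, 1.03)
print(round(p_multi, 3), round(p_uni, 3))
```

    Under this assumption the multivariable estimate gives a P value close to the reported 0.023, and the univariable estimate lands near the reported 0.064, with the difference attributable to rounding of the published interval.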

  18. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Full Text Available Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
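
    To make the notion of a maximal contiguous frequent pattern concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that support is counted as the number of sequences containing a pattern as a contiguous substring, and the names `support` and `maximal_contiguous_frequent` are hypothetical. It grows patterns one symbol at a time, so no pattern length has to be specified in advance.

```python
from typing import List, Set


def support(pattern: str, sequences: List[str]) -> int:
    """Number of sequences that contain `pattern` as a contiguous substring."""
    return sum(1 for seq in sequences if pattern in seq)


def maximal_contiguous_frequent(sequences: List[str], min_support: int) -> Set[str]:
    """Naive level-wise miner (illustrative assumption, not the paper's method)."""
    # Level 1: single symbols that meet the support threshold.
    current = {c for seq in sequences for c in seq}
    current = {p for p in current if support(p, sequences) >= min_support}
    frequent: Set[str] = set()
    k = 1
    while current:
        frequent |= current
        k += 1
        # A length-k substring can only be frequent if both of its
        # length-(k-1) substrings were frequent at the previous level.
        candidates = {
            seq[i:i + k]
            for seq in sequences
            for i in range(len(seq) - k + 1)
            if seq[i:i + k - 1] in current and seq[i + 1:i + k] in current
        }
        current = {p for p in candidates if support(p, sequences) >= min_support}
    # Keep only patterns that are not contained in a longer frequent pattern.
    return {p for p in frequent if not any(p != q and p in q for q in frequent)}


if __name__ == "__main__":
    toy = ["ACGTAC", "TACGTT", "GGACGT"]
    print(maximal_contiguous_frequent(toy, min_support=3))
```

    This brute-force version rescans every sequence at each level, so it only suits toy inputs; the memory and time savings reported in the abstract would require the authors' own data structures.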

  19. Download - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)


  20. An algorithm of discovering signatures from DNA databases on a computer cluster.

    Science.gov (United States)

    Lee, Hsiao Ping; Sheu, Tzu-Fang

    2014-10-05

    Signatures are short sequences that are unique and not similar to any other sequence in a database that can be used as the basis to identify different species. Even though several signature discovery algorithms have been proposed in the past, these algorithms require the entirety of databases to be loaded in the memory, thus restricting the amount of data that they can process. It makes those algorithms unable to process databases with large amounts of data. Also, those algorithms use sequential models and have slower discovery speeds, meaning that the efficiency can be improved. In this research, we are debuting the utilization of a divide-and-conquer strategy in signature discovery and have proposed a parallel signature discovery algorithm on a computer cluster. The algorithm applies the divide-and-conquer strategy to solve the problem posed to the existing algorithms where they are unable to process large databases and uses a parallel computing mechanism to effectively improve the efficiency of signature discovery. Even when run with just the memory of regular personal computers, the algorithm can still process large databases such as the human whole-genome EST database which were previously unable to be processed by the existing algorithms. The algorithm proposed in this research is not limited by the amount of usable memory and can rapidly find signatures in large databases, making it useful in applications such as Next Generation Sequencing and other large database analysis and processing. The implementation of the proposed algorithm is available at http://www.cs.pu.edu.tw/~fang/DDCSDPrograms/DDCSD.htm.
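
    The idea of processing a database piece by piece, instead of loading it whole, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: a signature is taken here to be a length-k substring of the target whose Hamming distance to every length-k substring of the other sequences exceeds d, and the function names and the `chunk_size` parameter are hypothetical.

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def chunks(items: List[str], size: int) -> Iterable[List[str]]:
    """Split the database into pieces small enough to hold in memory."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def signatures_for(target: str, others: List[str], k: int, d: int,
                   chunk_size: int = 2) -> Set[str]:
    """Candidates from `target` that survive comparison with every chunk."""
    candidates = kmers(target, k)
    for chunk in chunks(others, chunk_size):
        # Only the k-mers of the current chunk are held in memory.
        chunk_kmers = set().union(*(kmers(s, k) for s in chunk))
        candidates = {
            p for p in candidates
            if all(hamming(p, q) > d for q in chunk_kmers)
        }
        if not candidates:
            break  # nothing left to test against later chunks
    return candidates


if __name__ == "__main__":
    print(signatures_for("AAAACCCC", ["AAAAGGGG", "TTTTGGGG"], k=4, d=1))
```

    Because each chunk is filtered independently of the others, the chunks could in principle be handed to separate workers and the surviving candidate sets intersected afterwards, which is the general shape of a divide-and-conquer scheme on a cluster.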

  1. Search for novel remedies to augment radiation resistance of inhabitants of Fukushima and Chernobyl disasters: identifying DNA repair protein XRCC4 inhibitors.

    Science.gov (United States)

    Sun, Mao-Feng; Chen, Hsin-Yi; Tsai, Fuu-Jen; Lui, Shu-Hui; Chen, Chih-Yi; Chen, Calvin Yu-Chian

    2011-10-01

    Two nuclear plant disasters occurring within a span of 25 years threaten health and genome integrity both in Fukushima and Chernobyl. Search for remedies capable of enhancing DNA repair efficiency and radiation resistance in humans appears to be an urgent problem for now. XRCC4 is an important enhancer in promoting the repair pathway triggered by DNA double-strand break (DSB). In the context of radiation therapy, active XRCC4 could reduce the DSB-mediated apoptotic effect on cancer cells. Hence, developing XRCC4 inhibitors could possibly enhance radiotherapy outcomes. In this study, we screened a traditional Chinese medicine (TCM) database, TCM Database@Taiwan, and have identified three potent inhibitor agents against XRCC4. Through molecular dynamics simulation, we have determined that the protein-ligand interactions were focused at Lys188 on chain A and Lys187 on chain B. Intriguingly, the hydrogen bonds for all three ligands fluctuated frequently but were held at close approximation. The pi-cation interactions and ionic interactions mediated by o-hydroxyphenyl and carboxyl functional groups respectively have been demonstrated to play critical roles in stabilizing binding conformations. Based on these results, we reported the identification of potential radiotherapy enhancers from TCM. We further characterized the key binding elements for inhibiting the XRCC4 activities.

  2. Uploading, Searching and Visualizing of Paleomagnetic and Rock Magnetic Data in the Online MagIC Database

    Science.gov (United States)

    Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Donadini, F.

    2007-12-01

    The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all available measurements and derived properties from paleomagnetic studies of directions and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and will soon implement two search nodes, one for paleomagnetism and one for rock magnetism. Currently the PMAG node is operational. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. Users can also browse the database by data type or by data compilation to view all contributions associated with well known earlier collections like PINT, GMPDB or PSVRL. The query result set is displayed in a digestible tabular format allowing the user to descend from locations to sites, samples, specimens and measurements. At each stage, the result set can be saved and, where appropriate, can be visualized by plotting global location maps, equal area, XY, age, and depth plots, or typical Zijderveld, hysteresis, magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (version 2.3) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. 
The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload

  3. Students are Confident Using Federated Search Tools as much as Single Databases. A Review of: Armstrong, A. (2009). Student perceptions of federated searching vs. single database searching. Reference Services Review, 37(3), 291-303. doi:10.1108/00907320910982785

    Directory of Open Access Journals (Sweden)

    Deena Yanofsky

    2011-09-01

    Full Text Available Objective – To measure students’ perceptions of the ease-of-use and efficacy of a federated search tool versus a single multidisciplinary database. Design – An evaluation worksheet, employing a combination of quantitative and qualitative questions. Setting – A required, first-year English composition course taught at the University of Illinois at Chicago (UIC). Subjects – Thirty-one undergraduate students completed and submitted the worksheet. Methods – Students attended two library instruction sessions. The first session introduced participants to basic Boolean searching (using AND only), selecting appropriate keywords and searching for books in the library catalogue. In the second library session, students were handed an evaluation worksheet and, with no introduction to the process of searching article databases, were asked to find relevant articles on a research topic of their own choosing using both a federated search tool and a single multidisciplinary database. The evaluation worksheet was divided into four sections: step-by-step instructions for accessing the single multidisciplinary database and the federated search tool; space to record search strings in both resources; space to record the titles of up to five relevant articles; and a series of quantitative and qualitative questions regarding ease-of-use, relevancy of results, overall preference (if any) between the two resources, likeliness of future use and other preferred research tools. Half of the participants received a worksheet with instructions to search the federated search tool before the single database; the order was reversed for the other half of the students. The evaluation worksheet was designed to be completed in one hour. Participant responses to qualitative questions were analyzed, codified and grouped into thematic categories. If a student mentioned more than one factor in responding to a question, their response was recorded in multiple categories. Main Results

  4. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System.

    Science.gov (United States)

    Liu, Yu; Hong, Yang; Lin, Chun-Yuan; Hung, Che-Lun

    2015-01-01

    The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS) to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
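
    As a rough CPU-only illustration of the two ingredients named in this abstract, the sketch below computes a Smith-Waterman local alignment score and a frequency distance that can be used to skip unpromising database sequences before alignment. It is not the CUDA-SWfr implementation: the linear gap penalty, the scoring values, the `max_fd` threshold and all function names are assumptions chosen for readability.

```python
from collections import Counter
from typing import List


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score with a linear gap penalty (two-row DP)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def frequency_distance(a: str, b: str) -> int:
    """Distance between symbol-count vectors; cheap to compute, and it
    never exceeds the edit distance between the two strings."""
    ca, cb = Counter(a), Counter(b)
    symbols = set(ca) | set(cb)
    surplus_a = sum(max(ca[s] - cb[s], 0) for s in symbols)
    surplus_b = sum(max(cb[s] - ca[s], 0) for s in symbols)
    return max(surplus_a, surplus_b)


def filtered_search(query: str, database: List[str], max_fd: int) -> List[tuple]:
    """Align only sequences whose frequency distance passes the filter."""
    hits = [(smith_waterman_score(query, s), s)
            for s in database if frequency_distance(query, s) <= max_fd]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    print(filtered_search("ACGT", ["ACGG", "TTTT", "ACGT"], max_fd=1))
```

    The filter is a cheap pre-pass over symbol counts, so each rejected sequence saves a full dynamic-programming table; how aggressive the threshold can be without losing true hits depends on the data and is not specified here.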

  5. DNA barcoding of odonates from the Upper Plata basin: Database creation and genetic diversity estimation.

    Directory of Open Access Journals (Sweden)

    Ricardo Koroiva

    Full Text Available We present a DNA barcoding study of Neotropical odonates from the Upper Plata basin, Brazil. A total of 38 species were collected in a transition region of "Cerrado" and Atlantic Forest, both regarded as biological hotspots, and 130 cytochrome c oxidase subunit I (COI) barcodes were generated for the collected specimens. The distinct gap between intraspecific (0-2%) and interspecific variation (15% and above) in COI, and resulting separation of Barcode Index Numbers (BIN), allowed for successful identification of specimens in 94% of cases. The 6% fail rate was due to a shared BIN between two separate nominal species. DNA barcoding, based on COI, thus seems to be a reliable and efficient tool for identifying Neotropical odonate specimens down to the species level. These results underscore the utility of DNA barcoding to aid specimen identification in diverse biological hotspots, areas that require urgent action regarding taxonomic surveys and biodiversity conservation.

  6. Native Health Research Database

    Science.gov (United States)

    ... Indian Health Board) Welcome to the Native Health Database. Please enter your search terms. Basic Search Advanced ... To learn more about searching the Native Health Database, click here. Tutorial Video The NHD has made ...

  7. Leading-edge forensic DNA analyses and the necessity of including crime scene investigators, police officers and technicians in a DNA elimination database.

    Science.gov (United States)

    Lapointe, Martine; Rogic, Anita; Bourgoin, Sarah; Jolicoeur, Christine; Séguin, Diane

    2015-11-01

    In recent years, sophisticated technology has significantly increased the sensitivity and analytical power of genetic analyses so that very little starting material may now produce viable genetic profiles. This sensitivity, however, has also increased the risk of detecting unknown genetic profiles that are assumed to be that of the perpetrator, yet originate from extraneous sources such as crime scene workers. These contaminants may mislead investigations, keeping criminal cases active and unresolved for long spans of time. Voluntary submission of DNA samples from crime scene workers is fairly low; therefore, we have created a promotional method for our staff elimination database that has resulted in a significant increase in voluntary samples since 2011. Our database enforces privacy safeguards and allows for optional anonymity to all staff members. We also offer information sessions at various police precincts to advise crime scene workers of the importance and success of our staff elimination database. This study, a pioneer in its field, has obtained 327 voluntary submissions from crime scene workers to date, of which 46 individual profiles (14%) have been matched to 58 criminal cases. By implementing our methods and respect for individual privacy, forensic laboratories everywhere may see similar growth and success in explaining unidentified genetic profiles in stagnant criminal cases. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.

  8. HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

    Directory of Open Access Journals (Sweden)

    Charles Richard Bradshaw

    Full Text Available Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in

  9. Policy implications for familial searching.

    Science.gov (United States)

    Kim, Joyce; Mammo, Danny; Siegel, Marni B; Katsanis, Sara H

    2011-11-01

    In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforcement agencies a powerful tool for developing investigative leads, apprehending criminals, revitalizing cold cases and exonerating wrongfully convicted individuals. As familial searching involves a range of logistical, social, ethical and legal considerations, states are now grappling with policy options for implementing familial searching to balance crime fighting with its potential impact on society. When developing policies for familial searching, legislators should take into account the impact of familial searching on select populations and the need to minimize personal intrusion on relatives of individuals in the DNA database. This review describes the approaches used to narrow a suspect pool from a partial match search of CODIS and summarizes the economic, ethical, logistical and political challenges of implementing familial searching. We examine particular US state policies and the policy options adopted to address these issues. The aim of this review is to provide objective background information on the controversial approach of familial searching to inform policy decisions in this area. Herein we highlight key policy options and recommendations regarding effective utilization of familial searching that minimize harm to and afford maximum protection of US citizens.

  10. ORF Table - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available tion of data contents List of open reading frames of the representative ESTs. Data file File name: kaiko_cdna..._orf.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kaiko-cdna/LATEST/kaiko_cdna_orf.zip File size: 11 MB... Simple search URL http://togodb.biosciencedbc.jp/togodb/view/kaiko_cdna_orf#en D

  11. Automatic sorting of toxicological information into the IUCLID (International Uniform Chemical Information Database) endpoint-categories making use of the semantic search engine Go3R.

    Science.gov (United States)

    Sauer, Ursula G; Wächter, Thomas; Hareng, Lars; Wareing, Britta; Langsch, Angelika; Zschunke, Matthias; Alvers, Michael R; Landsiedel, Robert

    2014-06-01

    The knowledge-based search engine Go3R, www.Go3R.org, has been developed to assist scientists from industry and regulatory authorities in collecting comprehensive toxicological information with a special focus on identifying available alternatives to animal testing. The semantic search paradigm of Go3R makes use of expert knowledge on 3Rs methods and regulatory toxicology, laid down in the ontology, a network of concepts, terms, and synonyms, to recognize the contents of documents. Search results are automatically sorted into a dynamic table of contents presented alongside the list of documents retrieved. This table of contents allows the user to quickly filter the set of documents by topics of interest. Documents containing hazard information are automatically assigned to a user interface following the endpoint-specific IUCLID5 categorization scheme required, e.g. for REACH registration dossiers. For this purpose, complex endpoint-specific search queries were compiled and integrated into the search engine (based upon a gold standard of 310 references that had been assigned manually to the different endpoint categories). Go3R sorts 87% of the references concordantly into the respective IUCLID5 categories. Currently, Go3R searches in the 22 million documents available in the PubMed and TOXNET databases. However, it can be customized to search in other databases including in-house databanks. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Antibiotic distribution channels in Thailand: results of key-informant interviews, reviews of drug regulations and database searches.

    Science.gov (United States)

    Sommanustweechai, Angkana; Chanvatik, Sunicha; Sermsinsiri, Varavoot; Sivilaikul, Somsajee; Patcharanarumol, Walaiporn; Yeung, Shunmay; Tangcharoensathien, Viroj

    2018-02-01

    To analyse how antibiotics are imported, manufactured, distributed and regulated in Thailand. We gathered information, on antibiotic distribution in Thailand, in in-depth interviews - with 43 key informants from farms, health facilities, pharmaceutical and animal feed industries, private pharmacies and regulators- and in database and literature searches. In 2016-2017, licensed antibiotic distribution in Thailand involves over 700 importers and about 24 000 distributors - e.g. retail pharmacies and wholesalers. Thailand imports antibiotics and active pharmaceutical ingredients. There is no system for monitoring the distribution of active ingredients, some of which are used directly on farms, without being processed. Most antibiotics can be bought from pharmacies, for home or farm use, without a prescription. Although the 1987 Drug Act classified most antibiotics as "dangerous drugs", it only classified a few of them as prescription-only medicines and placed no restrictions on the quantities of antibiotics that could be sold to any individual. Pharmacists working in pharmacies are covered by some of the Act's regulations, but the quality of their dispensing and prescribing appears to be largely reliant on their competences. In Thailand, most antibiotics are easily and widely available from retail pharmacies, without a prescription. If the inappropriate use of active pharmaceutical ingredients and antibiotics is to be reduced, we need to reclassify and restrict access to certain antibiotics and to develop systems to audit the dispensing of antibiotics in the retail sector and track the movements of active ingredients.

  13. First postoperative PSA is associated with outcomes in patients with node positive prostate cancer: Results from the SEARCH database.

    Science.gov (United States)

    McDonald, Michelle L; Howard, Lauren E; Aronson, William J; Terris, Martha K; Cooperberg, Matthew R; Amling, Christopher L; Freedland, Stephen J; Kane, Christopher J

    2018-05-01

    To analyze factors associated with metastases, prostate cancer-specific mortality, and all-cause mortality in pN1 patients. We analyzed 3,642 radical prostatectomy patients within the Shared Equal Access Regional Cancer Hospital (SEARCH) database. Pathologic Gleason grade, number of lymph nodes (LN) removed, and first postoperative prostate-specific antigen (PSA) (PSA. Of 3,642 patients, 124 (3.4%) had pN1. There were 71 (60%) patients with 1 positive LN, 32 (27%) with 2 positive LNs, and 15 (13%) with ≥3. Among men with pN1, first postoperative PSA wasPSA ≥0.2 ng/ml (P = 0.005) were associated with metastases. First postoperative PSA ≥0.2ng/ml was associated with metastasis on multivariable analysis (P = 0.046). Log-rank analysis revealed a more favorable metastases-free survival in patients with a first postoperative PSAPSAPSA ≥0.2ng/ml were more likely to develop metastases. First postoperative PSA may be useful in identifying pN1 patients who harbor distant disease and aid in secondary treatment decisions. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. Preference vs. Authority: A Comparison of Student Searching in a Subject-Specific Indexing and Abstracting Database and a Customized Discovery Layer

    Science.gov (United States)

    Dahlen, Sarah P. C.; Hanson, Kathlene

    2017-01-01

    Discovery layers provide a simplified interface for searching library resources. Libraries with limited finances make decisions about retaining indexing and abstracting databases when similar information is available in discovery layers. These decisions should be informed by student success at finding quality information as well as satisfaction…

  15. Structural and mutational analysis of Escherichia coli AlkB provides insight into substrate specificity and DNA damage searching.

    Directory of Open Access Journals (Sweden)

    Paul J Holland

    Full Text Available BACKGROUND: In Escherichia coli, cytotoxic DNA methyl lesions on the N1 position of purines and N3 position of pyrimidines are primarily repaired by the 2-oxoglutarate (2-OG) iron(II)-dependent dioxygenase, AlkB. AlkB repairs 1-methyladenine (1-meA) and 3-methylcytosine (3-meC) lesions, but it also repairs 1-methylguanine (1-meG) and 3-methylthymine (3-meT) at a much less efficient rate. How the AlkB enzyme is able to locate and identify methylated bases in ssDNA has remained an open question. METHODOLOGY/PRINCIPAL FINDINGS: We determined the crystal structures of the E. coli AlkB protein holoenzyme and the AlkB-ssDNA complex containing a 1-meG lesion. We coupled this to site-directed mutagenesis of amino acids in and around the active site, and tested the effects of these mutations on the ability of the protein to bind both damaged and undamaged DNA, as well as catalyze repair of a methylated substrate. CONCLUSIONS/SIGNIFICANCE: A comparison of our substrate-bound AlkB-ssDNA complex with our unliganded holoenzyme reveals conformational changes of residues within the active site that are important for binding damaged bases. Site-directed mutagenesis of these residues reveals novel insight into their roles in DNA damage recognition and repair. Our data support a model that the AlkB protein utilizes at least two distinct conformations in searching and binding methylated bases within DNA: a "searching" mode and "repair" mode. Moreover, we are able to functionally separate these modes through mutagenesis of residues that affect one or the other binding state. Finally, our mutagenesis experiments show that amino acid D135 of AlkB participates in both substrate specificity and catalysis.

  16. Amplification volume reduction on DNA database samples using FTA™ Classic Cards.

    Science.gov (United States)

    Wong, Hang Yee; Lim, Eng Seng Simon; Tan-Siew, Wai Fun

    2012-03-01

    The DNA forensic community always strives towards improvements in aspects such as sensitivity, robustness, and efficacy balanced with cost efficiency. Therefore our laboratory decided to study the feasibility of PCR amplification volume reduction using DNA entrapped in FTA™ Classic Card and to bring cost savings to the laboratory. There were a few concerns the laboratory needed to address. First, the kinetics of the amplification reaction could be significantly altered. Second, an increase in sensitivity might affect interpretation due to increased stochastic effects even though they were pristine samples. Third, statics might cause FTA punches to jump out of its allocated well into another thus causing sample-to-sample contamination. Fourth, the size of the punches might be too small for visual inspection. Last, there would be a limit to the extent of volume reduction due to evaporation and the possible need of re-injection of samples for capillary electrophoresis. The laboratory had successfully optimized a reduced amplification volume of 10 μL for FTA samples. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  17. Evidential significance of automotive paint trace evidence using a pattern recognition based infrared library search engine for the Paint Data Query Forensic Database.

    Science.gov (United States)

    Lavine, Barry K; White, Collin G; Allen, Matthew D; Fasasi, Ayuba; Weakley, Andrew

    2016-10-01

    A prototype library search engine has been further developed to search the infrared spectral libraries of the paint data query database to identify the line and model of a vehicle from the clear coat, surfacer-primer, and e-coat layers of an intact paint chip. For this study, search prefilters were developed from 1181 automotive paint systems spanning 3 manufacturers: General Motors, Chrysler, and Ford. The best match between each unknown and the spectra in the hit list generated by the search prefilters was identified using a cross-correlation library search algorithm that performed both a forward and backward search. In the forward search, spectra were divided into intervals and further subdivided into windows (which corresponds to the time lag for the comparison) within those intervals. The top five hits identified in each search window were compiled; a histogram was computed that summarized the frequency of occurrence for each library sample, with the IR spectra most similar to the unknown flagged. The backward search computed the frequency and occurrence of each line and model without regard to the identity of the individual spectra. Only those lines and models with a frequency of occurrence greater than or equal to 20% were included in the final hit list. If there was agreement between the forward and backward search results, the specific line and model common to both hit lists was always the correct assignment. Samples assigned to the same line and model by both searches are always well represented in the library and correlate well on an individual basis to specific library samples. For these samples, one can have confidence in the accuracy of the match. This was not the case for the results obtained using commercial library search algorithms, as the hit quality index scores for the top twenty hits were always greater than 99%. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases.

    Science.gov (United States)

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.

  19. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures.

    Science.gov (United States)

    Li, Guo-Zhong; Vissers, Johannes P C; Silva, Jeffrey C; Golick, Dan; Gorenstein, Marc V; Geromanos, Scott J

    2009-03-01

    A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC-MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four-protein mixture, the same four-protein mixture spiked into a complex biological background, and a variety of other "system" type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. 
They also illustrate a more comprehensive analysis of the samples studied; providing approximately 20% more protein identifications, compared to
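
The decoy database technique mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a separate decoy database of the same size as the target database, and the function name `decoy_fdr` and the example scores are hypothetical.

```python
from typing import List, Tuple


def decoy_fdr(hits: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the false positive rate among target hits at a score cutoff.

    Each hit is (score, is_decoy). Assumption: decoy and target databases
    have equal size, so decoys/targets above the cutoff approximates the rate.
    """
    targets = sum(1 for score, is_decoy in hits
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, nothing to estimate
    return min(1.0, decoys / targets)


# Hypothetical scored identifications: (score, came_from_decoy_database)
example_hits = [(9.1, False), (8.7, False), (8.2, True), (7.9, False),
                (7.5, False), (6.0, True), (5.5, False)]
print(decoy_fdr(example_hits, 7.0))
```

Raising the threshold trades fewer accepted identifications for a lower estimated rate, which is the trade-off a decoy search makes visible.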

  20. Millennial Students’ Online Search Strategies are Associated With Their Mental Models of Search. A Review of: Holman, L. (2011). Millennial students’ mental models of search: Implications for academic librarians and database developers. Journal of Academic Librarianship, 37(1), 19-27. doi:10.1016/j.acalib.2010.10.003

    Directory of Open Access Journals (Sweden)

    Leslie Bussert

    2011-09-01

    Full Text Available Objective – To examine first-year college students’ information seeking behaviours and determine whether their mental models of the search process influence their ability to effectively search for and find scholarly materials. Design – Mixed methods including contextual inquiry, concept mapping, observation, and interviews. Setting – University of Baltimore, a public institution in Maryland, United States of America, offering undergraduate, graduate, and professional degrees. Subjects – A total of 21 first-year undergraduate students, ages 16 to 19 years, undertaking research assignments for which they chose to use online resources. Methods – First-year students were recruited in the fall of 2008 and met with the researcher in a university usability lab for about one hour over a three-week period. The researcher observed and videotaped the students as they conducted research in their chosen search engines or article databases. The searches were captured using software, and students were encouraged to think aloud about their research process, search strategies, and anticipated search results. Observation sessions concluded with a 10-question interview incorporating a review of the keywords the student used, the student’s reflection on the success of his or her searches, and possible alternate keywords. The interview also offered prompts to help the researcher learn about students’ conceptualizations of search tools’ utilization of keywords to generate results. The researcher then asked the students to provide a visual diagram of the relationship between their search terms and the items retrieved in the search tool. Data were analyzed by identifying the 21 different search tools used by the students and categorizing all 210 searches and student diagrams for further analysis. A scheme similar to Guinee, Eagleton, and Hall’s (2003) characterized the student searches into four categories: simple single-term searches, topic plus focus

  1. BLAST and FASTA similarity searching for multiple sequence alignment.

    Science.gov (United States)

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
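
To illustrate why expectation values depend on database size, here is a minimal sketch using the standard relation between a bit score S' and the expectation value, E = m * n * 2^(-S'), where m is the query length and n the database length. The function name and the example lengths are hypothetical, and the edge-effect (effective length) corrections that real programs apply are deliberately ignored.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits with at least this bit score.

    Simplified: uses raw lengths, with no effective-length correction.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# The same 40-bit alignment searched against a large and a small database
# (lengths in residues; both values are made-up illustrations).
large_db = expect_value(40, 1024, 2 ** 20)
small_db = expect_value(40, 1024, 2 ** 16)
print(large_db, small_db)
```

Under this simplification the same alignment score is 16 times more significant in the database that is 16 times smaller, which is consistent with the abstract's advice to search smaller comprehensive databases.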

  2. RiceFOX: a database of Arabidopsis mutant lines overexpressing rice full-length cDNA that contains a wide range of trait information to facilitate analysis of gene function.

    Science.gov (United States)

    Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

    2011-02-01

    Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named 'RiceFOX'. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/.

  3. RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures

    Directory of Open Access Journals (Sweden)

    Wasik Szymon

    2010-05-01

    Full Text Available Abstract Background Recent discoveries concerning novel functions of RNA, such as RNA interference, have contributed towards the growing importance of the field. In this respect, a deeper knowledge of complex three-dimensional RNA structures is essential to understand their new biological functions. A number of bioinformatic tools have been proposed to explore two major structural databases (PDB, NDB) in order to analyze various aspects of RNA tertiary structures. One of these tools is RNA FRABASE 1.0, the first web-accessible database with an engine for automatic search of 3D fragments within PDB-derived RNA structures. This search is based upon the user-defined RNA secondary structure pattern. In this paper, we present and discuss RNA FRABASE 2.0. This second version of the system represents a major extension of this tool in terms of providing new data and a wide spectrum of novel functionalities. An intuitively operated web server platform enables very fast user-tailored search of three-dimensional RNA fragments, their multi-parameter conformational analysis and visualization. Description RNA FRABASE 2.0 has stored information on 1565 PDB-deposited RNA structures, including all NMR models. The RNA FRABASE 2.0 search engine algorithms operate on the database of the RNA sequences and the new library of RNA secondary structures, coded in the dot-bracket format extended to hold multi-stranded structures and to cover residues whose coordinates are missing in the PDB files. The library of RNA secondary structures (and their graphics) is made available. A high level of efficiency of the 3D search has been achieved by introducing novel tools to formulate advanced searching patterns and to screen highly populated tertiary structure elements. RNA FRABASE 2.0 also stores data and conformational parameters in order to provide "on the spot" structural filters to explore the three-dimensional RNA structures. An instant visualization of the 3D RNA
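
For readers unfamiliar with the dot-bracket format mentioned in this abstract, the sketch below parses the basic single-strand notation into base-pair indices. It covers only plain `(`, `)` and `.` symbols; the multi-strand and missing-residue extensions used by RNA FRABASE 2.0 are not described in the abstract and are not modeled here. The function name is hypothetical.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return sorted (i, j) index pairs of paired residues (0-based)."""
    open_positions: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)          # remember the opening residue
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((open_positions.pop(), i))
        elif symbol != ".":                   # '.' marks an unpaired residue
            raise ValueError(f"unsupported symbol {symbol!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


print(parse_dot_bracket("(((...)))."))
```

A stack is sufficient here because plain dot-bracket strings describe nested pairs only; pseudoknots would need additional bracket types.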

  4. Comparing the Precision of Information Retrieval of MeSH-Controlled Vocabulary Search Method and a Visual Method in the Medline Medical Database.

    Science.gov (United States)

    Hariri, Nadjla; Ravandi, Somayyeh Nadi

    2014-01-01

    Medline is one of the most important databases in the biomedical field. One of the most important hosts for Medline is Elton B. Stephens CO. (EBSCO), which has presented different search methods that can be used based on the needs of the users. Visual search and MeSH-controlled search methods are among the most common methods. The goal of this research was to compare the precision of the retrieved sources in the EBSCO Medline base using MeSH-controlled and visual search methods. This research was a semi-empirical study. By holding training workshops, 70 students of higher education in different educational departments of Kashan University of Medical Sciences were taught MeSH-Controlled and visual search methods in 2012. Then, the precision of 300 searches made by these students was calculated based on Best Precision, Useful Precision, and Objective Precision formulas and analyzed in SPSS software using the independent sample T Test, and three precisions obtained with the three precision formulas were studied for the two search methods. The mean precision of the visual method was greater than that of the MeSH-Controlled search for all three types of precision, i.e. Best Precision, Useful Precision, and Objective Precision, and their mean precisions were significantly different (P searches. Fifty-three percent of the participants in the research also mentioned that the use of the combination of the two methods produced better results. For users, it is more appropriate to use a natural, language-based method, such as the visual method, in the EBSCO Medline host than to use the controlled method, which requires users to use special keywords. The potential reason for their preference was that the visual method allowed them more freedom of action.

  5. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System

    Directory of Open Access Journals (Sweden)

    Yu Liu

    2015-01-01

    Full Text Available The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS) to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
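
The abstract does not define the frequency distance filtration scheme, so the following is only a sketch under an assumption: it uses one common definition of frequency distance, in which each sequence is reduced to a vector of letter counts and the distance is the larger of the total surplus and the total deficit between the two vectors. This value is cheap to compute and never exceeds the edit distance, so it can discard clearly dissimilar sequences before the expensive alignment. The function names and the threshold are hypothetical.

```python
from collections import Counter
from typing import List


def frequency_distance(a: str, b: str) -> int:
    """Larger of the summed positive and summed negative count differences."""
    counts_a, counts_b = Counter(a), Counter(b)
    surplus = sum(counts_a[c] - counts_b[c]
                  for c in counts_a if counts_a[c] > counts_b[c])
    deficit = sum(counts_b[c] - counts_a[c]
                  for c in counts_b if counts_b[c] > counts_a[c])
    return max(surplus, deficit)


def prefilter(query: str, database: List[str], max_distance: int) -> List[str]:
    """Keep only sequences close enough to the query to be worth aligning."""
    return [seq for seq in database
            if frequency_distance(query, seq) <= max_distance]


print(prefilter("ACGT", ["ACGA", "TTTT", "ACGT"], max_distance=1))
```

In a CPU-GPU split like the one described, a filter of this kind would run on the CPU so that only the surviving sequences are sent to the GPU for alignment.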

  6. Search for virus-related genetic information in human tumors DNA by the transfection method

    Energy Technology Data Exchange (ETDEWEB)

    Knyazev, P G; Perevozchikov, A P; Korobitsyn, L P; Zhudina, A I; Kuznetsov, O K; Savost'yanov, G A; Dyad'kova, A M; Sejts, I F [Nauchno-Issledovatel'skij Inst. Onkologii, Leningrad (USSR)]

    1979-01-01

    DNA preparations from the blood cells of a patient suffering from acute myeloid leukemia, from tissues of polymorphous cell rhabdomyosarcoma, synovial sarcoma and neurinoma were studied. The production of virus-like RNA-containing particles (according to radioisotope analysis and the content of reverse transcriptase in these particles) was observed in cultures of embryonic human cells treated with DNA from the cells of leukemia patient.

  7. A comparison of three design tree based search algorithms for the detection of engineering parts constructed with CATIA V5 in large databases

    Directory of Open Access Journals (Sweden)

    Robin Roj

    2014-07-01

    Full Text Available This paper presents three different search engines for the detection of CAD-parts in large databases. The analysis of the contained information is performed by the export of the data that is stored in the structure trees of the CAD-models. A preparation program generates one XML-file for every model, which in addition to including the data of the structure tree, also contains certain physical properties of each part. The first search engine specializes in the discovery of standard parts, like screws or washers. The second program uses certain user input as search parameters, and therefore has the ability to perform personalized queries. The third one compares one given reference part with all parts in the database, and locates files that are identical, or similar to, the reference part. All approaches run automatically, and have the analysis of the structure tree in common. Files constructed with CATIA V5, and search engines written with Python have been used for the implementation. The paper also includes a short comparison of the advantages and disadvantages of each program, as well as a performance test.

  8. CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

    Directory of Open Access Journals (Sweden)

    Schmidt Bertil

    2010-04-01

    Full Text Available Abstract Background Due to its high sensitivity, the Smith-Waterman algorithm is widely used for biological database searches. Unfortunately, the quadratic time complexity of this algorithm makes it highly time-consuming. The exponential growth of biological databases further deteriorates the situation. To accelerate this algorithm, many efforts have been made to develop techniques in high performance architectures, especially the recently emerging many-core architectures and their associated programming models. Findings This paper describes the latest release of the CUDASW++ software, CUDASW++ 2.0, which makes new contributions to Smith-Waterman protein database searches using compute unified device architecture (CUDA). A parallel Smith-Waterman algorithm is proposed to further optimize the performance of CUDASW++ 1.0 based on the single instruction, multiple thread (SIMT) abstraction. For the first time, we have investigated a partitioned vectorized Smith-Waterman algorithm using CUDA based on the virtualized single instruction, multiple data (SIMD) abstraction. The optimized SIMT and the partitioned vectorized algorithms were benchmarked, and remarkably, have similar performance characteristics. CUDASW++ 2.0 achieves performance improvement over CUDASW++ 1.0 as much as 1.74 (1.72) times using the optimized SIMT algorithm and up to 1.77 (1.66) times using the partitioned vectorized algorithm, with a performance of up to 17 (30) billion cell updates per second (GCUPS) on a single-GPU GeForce GTX 280 (dual-GPU GeForce GTX 295) graphics card. Conclusions CUDASW++ 2.0 is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant performance improvement over CUDASW++ 1.0 using either the optimized SIMT algorithm or the partitioned vectorized algorithm for Smith-Waterman protein database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
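
As a rough illustration of the quadratic dynamic programming behind these figures, here is a minimal sequential sketch of Smith-Waterman local alignment scoring, with a helper that converts cell counts and run time into GCUPS. The fixed match and mismatch scores, the linear gap penalty, and the function names are assumptions chosen for brevity; this is not the SIMT or virtualized SIMD kernel of CUDASW++.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score; fills len(a) * len(b) matrix cells."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


def gcups(query_len: int, db_residues: int, seconds: float) -> float:
    """Billions of matrix cells updated per second for one database scan."""
    return query_len * db_residues / seconds / 1e9


print(smith_waterman_score("TTACGTT", "GGACGGG"))
print(gcups(1000, 17_000_000, 1.0))
```

Every query residue is compared with every database residue, so the work grows with the product of the two lengths; GCUPS normalizes throughput by exactly that product.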

  9. CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database.

    Science.gov (United States)

    Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C

    2010-12-01

    The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.

  10. Patscanui: an intuitive web interface for searching patterns in DNA and protein data

    DEFF Research Database (Denmark)

    Blin, Kai; Wohlleben, Wolfgang; Weber, Tilmann

    2018-01-01

    Patterns in biological sequences frequently signify interesting features in the underlying molecule. Many tools exist to search for well-known patterns. Less support is available for exploratory analysis, where no well-defined patterns are known yet. PatScanUI (https://patscan.secondarymetabolites.org/) provides a highly interactive web interface to the powerful generic pattern search tool PatScan. The complex PatScan-patterns are created in a drag-and-drop aware interface allowing researchers to do rapid prototyping of the often complicated patterns useful to identifying features of interest.

  11. Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants

    Science.gov (United States)

    Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

    2014-01-01

    In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561

  12. Searching the Literatura Latino Americana e do Caribe em Ciências da Saúde (LILACS) database improves systematic reviews.

    Science.gov (United States)

    Clark, Otavio Augusto Camara; Castro, Aldemar Araujo

    2002-02-01

    An unbiased systematic review (SR) should analyse as many articles as possible in order to provide the best evidence available. However, many SR use only databases with high English-language content as sources for articles. Literatura Latino Americana e do Caribe em Ciências da Saúde (LILACS) indexes 670 journals from the Latin American and Caribbean health literature but is seldom used in these SR. Our objective is to evaluate if LILACS should be used as a routine source of articles for SR. First we identified SR published in 1997 in five medical journals with a high impact factor. Then we searched LILACS for articles that could match the inclusion criteria of these SR. We also checked if the authors had already identified these articles located in LILACS. In all, 64 SR were identified. Two had already searched LILACS and were excluded. In 39 of 62 (63%) SR a LILACS search identified articles that matched the inclusion criteria. In 5 (8%) our search was inconclusive and in 18 (29%) no articles were found in LILACS. Therefore, in 71% (44/62) of cases, a LILACS search could have been useful to the authors. This proportion remains the same if we consider only the 37 SR that performed a meta-analysis. In only one case had the article identified in LILACS already been located elsewhere by the authors' strategy. LILACS is an under-explored and unique source of articles whose use can improve the quality of systematic reviews. This database should be used as a routine source to identify studies for systematic reviews.
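
The proportions in this abstract can be recomputed from the stated counts. The short check below takes the 62 reviews that remained after exclusions as the denominator and treats the "found" and "inconclusive" outcomes together as the cases where a LILACS search could have been useful; the variable names are illustrative only.

```python
identified = 64          # systematic reviews identified
excluded = 2             # had already searched LILACS
total = identified - excluded

outcomes = {"found": 39, "inconclusive": 5, "nothing found": 18}


def percent(count: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / denominator)


potentially_useful = outcomes["found"] + outcomes["inconclusive"]

for label, count in outcomes.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
print(f"potentially useful: {potentially_useful}/{total} = "
      f"{percent(potentially_useful, total)}%")
```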

  13. The Danish STR sequence database: duplicate typing of 363 Danes with the ForenSeq™ DNA Signature Prep Kit.

    Science.gov (United States)

    Hussing, C; Bytyci, R; Huber, C; Morling, N; Børsting, C

    2018-05-24

    Some STR loci have internal sequence variations, which are not revealed by the standard STR typing methods used in forensic genetics (PCR and fragment length analysis by capillary electrophoresis (CE)). Typing of STRs with next-generation sequencing (NGS) uncovers the sequence variation in the repeat region and in the flanking regions. In this study, 363 Danish individuals were typed for 56 STRs (26 autosomal STRs, 24 Y-STRs, and 6 X-STRs) using the ForenSeq™ DNA Signature Prep Kit to establish a Danish STR sequence database. Increased allelic diversity was observed in 34 STRs by the PCR-NGS assay. The largest increases were found in DYS389II and D12S391, where the numbers of sequenced alleles were around four times larger than the numbers of alleles determined by repeat length alone. Thirteen SNPs and one InDel were identified in the flanking regions of 12 STRs. Furthermore, 36 single positions and five longer stretches in the STR flanking regions were found to have dubious genotyping quality. The combined match probability of the 26 autosomal STRs was 10,000 times larger using the PCR-NGS assay than by using PCR-CE. The typical paternity indices for trios and duos were 500 and 100 times larger, respectively, than those obtained with PCR-CE. The assay also amplified 94 SNPs selected for human identification. Eleven of these loci were not in Hardy-Weinberg equilibrium in the Danish population, most likely because the minimum threshold for allele calling (30 reads) in the ForenSeq™ Universal Analysis Software was too low and frequent allele dropouts were not detected.

  14. Searching for the Optimal Sampling Solution: Variation in Invertebrate Communities, Sample Condition and DNA Quality.

    Directory of Open Access Journals (Sweden)

    Martin M Gossner

    Full Text Available There is a great demand for standardising biodiversity assessments in order to allow optimal comparison across research groups. For invertebrates, pitfall or flight-interception traps are commonly used, but sampling solution differs widely between studies, which could influence the communities collected and affect sample processing (morphological or genetic. We assessed arthropod communities with flight-interception traps using three commonly used sampling solutions across two forest types and two vertical strata. We first considered the effect of sampling solution and its interaction with forest type, vertical stratum, and position of sampling jar at the trap on sample condition and community composition. We found that samples collected in copper sulphate were more mouldy and fragmented relative to other solutions which might impair morphological identification, but condition depended on forest type, trap type and the position of the jar. Community composition, based on order-level identification, did not differ across sampling solutions and only varied with forest type and vertical stratum. Species richness and species-level community composition, however, differed greatly among sampling solutions. Renner solution was highly attractant for beetles and repellent for true bugs. Secondly, we tested whether sampling solution affects subsequent molecular analyses and found that DNA barcoding success was species-specific. Samples from copper sulphate produced the fewest successful DNA sequences for genetic identification, and since DNA yield or quality was not particularly reduced in these samples additional interactions between the solution and DNA must also be occurring. Our results show that the choice of sampling solution should be an important consideration in biodiversity studies. 
Due to the potential bias towards or against certain species by Ethanol-containing sampling solution we suggest ethylene glycol as a suitable sampling solution when

  15. Integrating plant and animal biology for the search of novel DNA damage biomarkers

    Czech Academy of Sciences Publication Activity Database

    Nikitaki, Z.; Holá, Marcela; Donà, M.; Pavlopoulou, A.; Michalopoulos, I.; Angelis, Karel; Georgakilas, A. G.; Macovei, I.; Balestrazzi, A.

    2018-01-01

    Vol. 775, JAN-MAR (2018), pp. 21-38 ISSN 1383-5742 R&D Projects: GA ČR GA16-01137S Institutional support: RVO:61389030 Keywords: DNA damage response * Ionizing radiation * Radiation exposure monitoring * Radiotolerance * Ultraviolet radiation Subject RIV: EB - Genetics; Molecular Biology OECD field: Genetics and heredity (medical genetics to be 3) Impact factor: 5.500, year: 2016

  16. Recommendations of the DNA Commission of the International Society for Forensic Genetics (ISFG) on quality control of autosomal Short Tandem Repeat allele frequency databasing (STRidER)

    DEFF Research Database (Denmark)

    Bodner, Martin; Bastisch, Ingo; Butler, John M.

    2016-01-01

    for mitochondrial mtDNA, and YHRD for Y-chromosomal loci) that centralized quality control and data curation is essential to minimize error. The concepts employed for quality control involve software-aided likelihood-of-genotype, phylogenetic, and population genetic checks that allow the researchers to compare...... on the previously established ENFSI DNA WG STRbASE and applies standard concepts established for haploid and autosomal markers as well as novel tools to reduce error and increase the quality of autosomal STR data. The platform constitutes a significant improvement and innovation for the scientific community....... There is currently no agreed procedure of performing quality control of STR allele frequency databases, and the reliability and accuracy of the data are largely based on the responsibility of the individual contributing research groups. It has been demonstrated with databases of haploid markers (EMPOP...

  17. [Method of traditional Chinese medicine formula design based on 3D-database pharmacophore search and patent retrieval].

    Science.gov (United States)

    He, Yu-su; Sun, Zhi-yi; Zhang, Yan-ling

    2014-11-01

    By using the pharmacophore model of mineralocorticoid receptor antagonists as a starting point, the experiment studies the method of traditional Chinese medicine formula design for anti-hypertensive treatment. Pharmacophore models were generated by 3D-QSAR pharmacophore (Hypogen) program of the DS3.5, based on the training set composed of 33 mineralocorticoid receptor antagonists. The best pharmacophore model consisted of two hydrogen-bond acceptors, three hydrophobic features and four excluded volumes. Its correlation coefficient of training set and test set, N, and CAI value were 0.9534, 0.6748, 2.878, and 1.119. According to the database screening, 1700 active compounds from 86 source plants were obtained. Because of the lack of an available anti-hypertensive medication strategy in traditional theory, this article takes advantage of patent retrieval in a world traditional medicine patent database in order to design drug formulae. Finally, two formulae were obtained for anti-hypertensive treatment.

  18. Duplex Interrogation by a Direct DNA Repair Protein in Search of Base Damage

    Science.gov (United States)

    Yi, Chengqi; Chen, Baoen; Qi, Bo; Zhang, Wen; Jia, Guifang; Zhang, Liang; Li, Charles J.; Dinner, Aaron R.; Yang, Cai-Guang; He, Chuan

    2012-01-01

    ALKBH2 is a direct DNA repair dioxygenase guarding the mammalian genome against N1-methyladenine, N3-methylcytosine, and 1,N6-ethenoadenine damage. A prerequisite for repair is to identify these lesions in the genome. Here we present crystal structures of ALKBH2 bound to different duplex DNAs. Together with computational and biochemical analyses, our results suggest that DNA interrogation by ALKBH2 displays two novel features: i) ALKBH2 probes base-pair stability and detects base pairs with reduced stability; ii) ALKBH2 neither has nor needs a “damage-checking site”, which is critical for preventing spurious base-cleavage for several glycosylases. The demethylation mechanism of ALKBH2 ensures that only cognate lesions are oxidized and reversed to normal bases, and that a flipped, non-substrate base remains intact in the active site. Overall, the combination of duplex interrogation and oxidation chemistry allows ALKBH2 to detect and process diverse lesions efficiently and correctly. PMID:22659876

  19. International patent applications for non-injectable naloxone for opioid overdose reversal: Exploratory search and retrieve analysis of the PatentScope database.

    Science.gov (United States)

    McDonald, Rebecca; Danielsson Glende, Øyvind; Dale, Ola; Strang, John

    2018-02-01

    Non-injectable naloxone formulations are being developed for opioid overdose reversal, but only limited data have been published in the peer-reviewed domain. Through examination of a hitherto-unsearched database, we expand public knowledge of non-injectable formulations, tracing their development and novelty, with the aim to describe and compare their pharmacokinetic properties. (i) The PatentScope database of the World Intellectual Property Organization was searched for relevant English-language patent applications; (ii) Pharmacokinetic data were extracted, collated and analysed; (iii) PubMed was searched using Boolean search query '(nasal OR intranasal OR nose OR buccal OR sublingual) AND naloxone AND pharmacokinetics'. Five hundred and twenty-two PatentScope and 56 PubMed records were identified: three published international patent applications and five peer-reviewed papers were eligible. Pharmacokinetic data were available for intranasal, sublingual, and reference routes. Highly concentrated formulations (10-40 mg mL⁻¹) had been developed and tested. Sublingual bioavailability was very low (1%; relative to intravenous). Non-concentrated intranasal spray (1 mg mL⁻¹; 1 mL per nostril) had low bioavailability (11%). Concentrated intranasal formulations (≥10 mg mL⁻¹) had bioavailability of 21-42% (relative to intravenous) and 26-57% (relative to intramuscular), with peak concentrations (dose-adjusted Cmax = 0.8-1.7 ng mL⁻¹) reached in 19-30 min (tmax). Exploratory analysis identified intranasal bioavailability as associated positively with dose and negatively with volume. We find consistent direction of development of intranasal sprays to high-concentration, low-volume formulations with bioavailability in the 20-60% range. These have potential to deliver a therapeutic dose in 0.1 mL volume.
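    The Boolean query quoted in this abstract has a simple shape: one OR-group of route terms combined with required terms by AND. A minimal sketch of how such a string could be assembled is shown here; the function name `boolean_query` and its parameters are illustrative assumptions, not part of the study or of any PubMed client library.

```python
def boolean_query(any_of, all_of):
    """Build '(a OR b OR ...) AND x AND y' from two term lists.

    any_of: alternative terms, joined with OR inside parentheses.
    all_of: required terms, each joined with AND.
    (Hypothetical helper for illustration only.)
    """
    or_group = "(" + " OR ".join(any_of) + ")"
    return " AND ".join([or_group] + list(all_of))


query = boolean_query(
    ["nasal", "intranasal", "nose", "buccal", "sublingual"],
    ["naloxone", "pharmacokinetics"],
)
print(query)
```

    Keeping the term lists separate from the string makes it easier to report and rerun the exact search used.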

  20. Where the bugs are: analyzing distributions of bacterial phyla by descriptor keyword search in the nucleotide database.

    Science.gov (United States)

    Squartini, Andrea

    2011-07-26

    The associations between bacteria and environment underlie their preferential interactions with given physical or chemical conditions. Microbial ecology aims at extracting conserved patterns of occurrence of bacterial taxa in relation to defined habitats and contexts. In the present report the NCBI nucleotide sequence database is used as dataset to extract information relative to the distribution of each of the 24 phyla of the bacteria superkingdom and of the Archaea. Over two and a half million records are filtered in their cross-association with each of 48 sets of keywords, defined to cover natural or artificial habitats, interactions with plant, animal or human hosts, and physical-chemical conditions. The results are processed showing: (a) how the different descriptors enrich or deplete the proportions at which the phyla occur in the total database; (b) in which order of abundance do the different keywords score for each phylum (preferred habitats or conditions), and to which extent are phyla clustered to few descriptors (specific) or spread across many (cosmopolitan); (c) which keywords individuate the communities ranking highest for diversity and evenness. A number of cues emerge from the results, contributing to sharpen the picture on the functional systematic diversity of prokaryotes. Suggestions are given for a future automated service dedicated to refining and updating such kind of analyses via public bioinformatic engines.
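    Point (a) of this abstract, whether a descriptor enriches or depletes a phylum relative to its share of the whole database, can be illustrated with a simple ratio of proportions. The sketch below is an assumption about how such a figure might be computed; the abstract does not give the exact formula, and the function name and the counts in the usage line are made up for illustration.

```python
def enrichment_ratio(phylum_in_keyword, keyword_total, phylum_in_db, db_total):
    """Ratio of a phylum's share among keyword-matched records to its
    share in the whole database.

    > 1 suggests enrichment under the descriptor, < 1 suggests depletion.
    (Illustrative definition; not taken from the paper.)
    """
    if keyword_total <= 0 or db_total <= 0 or phylum_in_db <= 0:
        raise ValueError("totals and background count must be positive")
    observed = phylum_in_keyword / keyword_total   # share within keyword set
    background = phylum_in_db / db_total           # share within full database
    return observed / background


# Hypothetical counts: 30 of 100 keyword hits vs. 150,000 of 1,000,000 records.
print(enrichment_ratio(30, 100, 150_000, 1_000_000))
```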

  1. Search for 5'-leader regulatory RNA structures based on gene annotation aided by the RiboGap database.

    Science.gov (United States)

    Naghdi, Mohammad Reza; Smail, Katia; Wang, Joy X; Wade, Fallou; Breaker, Ronald R; Perreault, Jonathan

    2017-03-15

    The discovery of noncoding RNAs (ncRNAs) and their importance for gene regulation led us to develop bioinformatics tools to pursue the discovery of novel ncRNAs. Finding ncRNAs de novo is challenging, first due to the difficulty of retrieving large numbers of sequences for given gene activities, and second due to exponential demands on calculation needed for comparative genomics on a large scale. Recently, several tools for the prediction of conserved RNA secondary structure were developed, but many of them are not designed to uncover new ncRNAs, or are too slow for conducting analyses on a large scale. Here we present various approaches using the database RiboGap as a primary tool for finding known ncRNAs and for uncovering simple sequence motifs with regulatory roles. This database also can be used to easily extract intergenic sequences of eubacteria and archaea to find conserved RNA structures upstream of given genes. We also show how to extend analysis further to choose the best candidate ncRNAs for experimental validation. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. DMPD: TLR9 as a key receptor for the recognition of DNA. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 18262306 TLR9 as a key receptor for the recognition of DNA. Kumagai Y, Takeuchi O, ...TLR9 as a key receptor for the recognition of DNA. PubmedID 18262306 Title TLR9 as a key receptor for the recognition

  3. 5'-end sequences of budding yeast full-length cDNA clones - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  4. Statistical Measures Alone Cannot Determine Which Database (BNI, CINAHL, MEDLINE, or EMBASE) Is the Most Useful for Searching Undergraduate Nursing Topics. A Review of: Stokes, P., Foster, A., & Urquhart, C. (2009). Beyond relevance and recall: Testing new user-centred measures of database performance. Health Information and Libraries Journal, 26(3), 220-231.

    Directory of Open Access Journals (Sweden)

    Giovanna Badia

    2011-03-01

    Full Text Available Objective – The research project sought to determine which of four databases was the most useful for searching undergraduate nursing topics. Design – Comparative database evaluation. Setting – Nursing and midwifery students at Homerton School of Health Studies (now part of Anglia Ruskin University), Cambridge, United Kingdom, in 2005-2006. Subjects – The subjects were four databases: British Nursing Index (BNI), CINAHL, MEDLINE, and EMBASE. Methods – This was a comparative study using title searches to compare BNI (British Nursing Index), CINAHL, MEDLINE and EMBASE. According to the authors, this is the first study to compare BNI with other databases. BNI is a database produced by British libraries that indexes the nursing and midwifery literature. It covers over 240 British journals, and includes references to articles from health sciences journals that are relevant to nurses and midwives (British Nursing Index, n.d.). The researchers performed keyword searches in the title field of the four databases for the dissertation topics of nine nursing and midwifery students enrolled in undergraduate dissertation modules. The lists of titles of journal articles on their topics were given to the students, who were asked to judge the relevancy of the citations. The title searches were evaluated in each of the databases using the following criteria: • precision (the number of relevant results obtained in the database for a search topic, divided by the total number of results obtained in the database search); • recall (the number of relevant results obtained in the database for a search topic, divided by the total number of relevant results obtained on that topic from all four database searches); • novelty (the number of relevant results that were unique in the database search, which was calculated as a percentage of the total number of relevant results found in the database); • originality (the number of unique relevant results obtained in the
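    The precision, recall, and novelty criteria defined in this abstract can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions: citations are represented by arbitrary identifiers, the function and variable names are hypothetical, and the originality measure is omitted because its definition is cut off in the record.

```python
def evaluate_databases(relevant_by_db, retrieved_counts):
    """Compute precision, recall and novelty per database for one topic.

    relevant_by_db: dict mapping database name -> set of relevant citation ids.
    retrieved_counts: dict mapping database name -> total results returned.
    (Hypothetical data layout chosen for illustration.)
    """
    # Pool of relevant results found across all databases for this topic.
    all_relevant = set().union(*relevant_by_db.values())
    metrics = {}
    for db, relevant in relevant_by_db.items():
        # Relevant results found by any *other* database.
        others = set().union(
            *(r for name, r in relevant_by_db.items() if name != db)
        )
        unique = relevant - others
        total = retrieved_counts[db]
        metrics[db] = {
            "precision": len(relevant) / total if total else 0.0,
            "recall": len(relevant) / len(all_relevant) if all_relevant else 0.0,
            # Novelty is expressed as a percentage, as in the abstract.
            "novelty": 100.0 * len(unique) / len(relevant) if relevant else 0.0,
        }
    return metrics


# Made-up example with two databases and citation ids 1-5.
example = evaluate_databases(
    {"BNI": {1, 2, 3}, "CINAHL": {2, 3, 4, 5}},
    {"BNI": 6, "CINAHL": 8},
)
print(example)
```

    Using sets for the relevant citations makes the "unique to this database" step a plain set difference.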

  5. Robots for hazardous duties: Military, space, and nuclear facility applications. (Latest citations from the NTIS bibliographic database). Published Search

    International Nuclear Information System (INIS)

    1993-09-01

    The bibliography contains citations concerning the design and application of robots used in place of humans where the environment could be hazardous. Military applications include autonomous land vehicles, robotic howitzers, and battlefield support operations. Space operations include docking, maintenance, mission support, and intra-vehicular and extra-vehicular activities. Nuclear applications include operations within the containment vessel, radioactive waste operations, fueling operations, and plant security. Many of the articles reference control techniques and the use of expert systems in robotic operations. Applications involving industrial manufacturing, walking robots, and robot welding are cited in other published searches in this series. (Contains a minimum of 183 citations and includes a subject term index and title list.)

  6. Undergraduates Prefer Federated Searching to Searching Databases Individually. A Review of: Belliston, C. Jeffrey, Jared L. Howland, & Brian C. Roberts. “Undergraduate Use of Federated Searching: A Survey of Preferences and Perceptions of Value-Added Functionality.” College & Research Libraries 68.6 (Nov. 2007): 472-86.

    Directory of Open Access Journals (Sweden)

    Genevieve Gore

    2008-09-01

    Full Text Available Objective – To determine whether use of federated searching by undergraduates saves time, meets their information needs, is preferred over searching databases individually, and provides results of higher quality. Design – Crossover study. Setting – Three American universities, all members of the Consortium of Church Libraries & Archives (CCLA): BYU (Brigham Young University), a large research university; BYUH (Brigham Young University – Hawaii), a small baccalaureate college; and BYUI (Brigham Young University – Idaho), a large baccalaureate college. Subjects – Ninety-five participants recruited via e-mail invitations sent to a random sample of currently enrolled undergraduates at BYU, BYUH, and BYUI. Methods – Participants were given written directions to complete a literature search for journal articles on two biology-related topics using two search methods: 1. federated searching with WebFeat® (implemented in the same way for this study at the three universities) and 2. a hyperlinked list of databases to search individually. Both methods used the same set of seven databases. Each topic was assigned in random order to one of the two search methods, also assigned in random order, for a total of two searches per participant. The time to complete the searches was recorded. Students compiled their list of citations, which were later normalized and graded. To analyze the quality of the citations, one quantitative rubric was created by librarians and one qualitative rubric was approved by a faculty member at BYU. The librarian-created rubric included the journal impact factor (from ISI’s Journal Citation Reports®), the proportion of citations from peer-reviewed journals (determined from Ulrichsweb.com™) to total citations, and the timeliness of the articles. The faculty-approved rubric included three criteria: relevance to the topic, quality of the individual citations (good quality: primary research results, peer-reviewed sources, and

  7. Database Description - RMOS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name RMOS Alternative nam...arch Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Microarray Data and other Gene Expression Database...s Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The Ric...19&lang=en Whole data download - Referenced database Rice Expression Database (RED) Rice full-length cDNA Database... (KOME) Rice Genome Integrated Map Database (INE) Rice Mutant Panel Database (Tos17) Rice Genome Annotation Database

  8. TRF1 and TRF2 use different mechanisms to find telomeric DNA but share a novel mechanism to search for protein partners at telomeres.

    Science.gov (United States)

    Lin, Jiangguo; Countryman, Preston; Buncher, Noah; Kaur, Parminder; E, Longjiang; Zhang, Yiyun; Gibson, Greg; You, Changjiang; Watkins, Simon C; Piehler, Jacob; Opresko, Patricia L; Kad, Neil M; Wang, Hong

    2014-02-01

    Human telomeres are maintained by the shelterin protein complex in which TRF1 and TRF2 bind directly to duplex telomeric DNA. How these proteins find telomeric sequences among a genome of billions of base pairs and how they find protein partners to form the shelterin complex remains uncertain. Using single-molecule fluorescence imaging of quantum dot-labeled TRF1 and TRF2, we study how these proteins locate TTAGGG repeats on DNA tightropes. By virtue of its basic domain TRF2 performs an extensive 1D search on nontelomeric DNA, whereas TRF1's 1D search is limited. Unlike the stable and static associations observed for other proteins at specific binding sites, TRF proteins possess reduced binding stability marked by transient binding (∼9-17 s) and slow 1D diffusion on specific telomeric regions. These slow diffusion constants yield activation energy barriers to sliding ∼2.8-3.6 k_BT greater than those for nontelomeric DNA. We propose that the TRF proteins use 1D sliding to find protein partners and assemble the shelterin complex, which in turn stabilizes the interaction with specific telomeric DNA. This 'tag-team proofreading' represents a more general mechanism to ensure a specific set of proteins interact with each other on long repetitive specific DNA sequences without requiring external energy sources.
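    The link between slower diffusion and a higher barrier to sliding can be illustrated with an Arrhenius-type relation. The sketch below assumes that the 1D diffusion constant scales as exp(-E/k_BT) with the same prefactor on both kinds of DNA, so the barrier difference in units of k_BT is the natural log of the ratio of diffusion constants. This is an assumed simplification for illustration, not the analysis reported in the paper, and the numeric values are made up.

```python
import math


def barrier_difference_kbt(d_fast, d_slow):
    """Extra activation barrier (in units of k_B*T) implied by slower diffusion.

    Assumes D ~ D0 * exp(-E / (k_B*T)) with an identical prefactor D0 for both
    substrates, so E_slow - E_fast = ln(d_fast / d_slow) in units of k_B*T.
    (Illustrative assumption; not taken from the paper.)
    """
    if d_fast <= 0 or d_slow <= 0:
        raise ValueError("diffusion constants must be positive")
    return math.log(d_fast / d_slow)


# Hypothetical diffusion constants in arbitrary but matching units:
# a 20-fold slowdown corresponds to a barrier about 3 k_B*T higher.
print(round(barrier_difference_kbt(2.0e-2, 1.0e-3), 2))
```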

  9. Database Description - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available abase Description General information of database Database name ASTRA Alternative n...tics Journal Search: Contact address Database classification Nucleotide Sequence Databases - Gene structure,...3702 Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The database represents classified p...(10):1211-6. External Links: Original website information Database maintenance site National Institute of Ad... for user registration Not available About This Database Database Description Dow

  10. REDIdb: the RNA editing database.

    Science.gov (United States)

    Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

    2007-01-01

    The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.

  11. Download - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Download First of all, please read the license of this database. Data ...1.4 KB) Simple search and download Download via FTP FTP server is sometimes jammed. If it is, access [here]. About This Database Data...base Description Download License Update History of This Database Site Policy | Contact Us Download - Trypanosomes Database | LSDB Archive ...

  12. Database Description - TMFunction | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available sidue (or mutant) in a protein. The experimental data are collected from the literature both by searching th...the sequence database, UniProt, structural database, PDB, and literature database

  13. License - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us Trypanoso... Attribution-Share Alike 2.1 Japan . If you use data from this database, please be sure attribute this database as follows: Trypanoso...nse Update History of This Database Site Policy | Contact Us License - Trypanosomes Database | LSDB Archive ...

  14. PFR²: a curated database of planktonic foraminifera 18S ribosomal DNA as a resource for studies of plankton ecology, biogeography and evolution.

    Science.gov (United States)

    Morard, Raphaël; Darling, Kate F; Mahé, Frédéric; Audic, Stéphane; Ujiié, Yurika; Weiner, Agnes K M; André, Aurore; Seears, Heidi A; Wade, Christopher M; Quillévéré, Frédéric; Douady, Christophe J; Escarguel, Gilles; de Garidel-Thoron, Thibault; Siccha, Michael; Kucera, Michal; de Vargas, Colomban

    2015-11-01

    Planktonic foraminifera (Rhizaria) are ubiquitous marine pelagic protists producing calcareous shells with conspicuous morphology. They play an important role in the marine carbon cycle, and their exceptional fossil record serves as the basis for biochronostratigraphy and past climate reconstructions. A major worldwide sampling effort over the last two decades has resulted in the establishment of multiple large collections of cryopreserved individual planktonic foraminifera samples. Thousands of 18S rDNA partial sequences have been generated, representing all major known morphological taxa across their worldwide oceanic range. This comprehensive data coverage provides an opportunity to assess patterns of molecular ecology and evolution in a holistic way for an entire group of planktonic protists. We combined all available published and unpublished genetic data to build PFR², the Planktonic foraminifera Ribosomal Reference database. The first version of the database includes 3322 reference 18S rDNA sequences belonging to 32 of the 47 known morphospecies of extant planktonic foraminifera, collected from 460 oceanic stations. All sequences have been rigorously taxonomically curated using a six-rank annotation system fully resolved to the morphological species level and linked to a series of metadata. The PFR² website, available at http://pfr2.sb-roscoff.fr, allows downloading the entire database or specific sections, as well as the identification of new planktonic foraminiferal sequences. Its novel, fully documented curation process integrates advances in morphological and molecular taxonomy. It allows for an increase in its taxonomic resolution and assures that integrity is maintained by including a complete contingency tracking of annotations and assuring that the annotations remain internally consistent. © 2015 John Wiley & Sons Ltd.

  15. Pathological and Biochemical Outcomes among African-American and Caucasian Men with Low Risk Prostate Cancer in the SEARCH Database: Implications for Active Surveillance Candidacy.

    Science.gov (United States)

    Leapman, Michael S; Freedland, Stephen J; Aronson, William J; Kane, Christopher J; Terris, Martha K; Walker, Kelly; Amling, Christopher L; Carroll, Peter R; Cooperberg, Matthew R

    2016-11-01

    Racial disparities in the incidence and risk profile of prostate cancer at diagnosis among African-American men are well reported. However, it remains unclear whether African-American race is independently associated with adverse outcomes in men with clinical low risk disease. We retrospectively analyzed the records of 895 men in the SEARCH (Shared Equal Access Regional Cancer Hospital) database in whom clinical low risk prostate cancer was treated with radical prostatectomy. Associations of African-American and Caucasian race with pathological and biochemical recurrence outcomes were examined using chi-square, logistic regression, log rank and Cox proportional hazards analyses. We identified 355 African-American and 540 Caucasian men with low risk tumors in the SEARCH cohort who were followed a median of 6.3 years. Following adjustment for relevant covariates African-American race was not significantly associated with pathological upgrading (OR 1.33, p = 0.12), major upgrading (OR 0.58, p = 0.10), up-staging (OR 1.09, p = 0.73) or positive surgical margins (OR 1.04, p = 0.81). Five-year recurrence-free survival rates were 73.4% in African-American men and 78.4% in Caucasian men (log rank p = 0.18). In a Cox proportional hazards analysis model African-American race was not significantly associated with biochemical recurrence (HR 1.11, p = 0.52). In a cohort of patients at clinical low risk who were treated with prostatectomy in an equal access health system with a high representation of African-American men we observed no significant differences in the rates of pathological upgrading, up-staging or biochemical recurrence. These data support continued use of active surveillance in African-American men. Upgrading and up-staging remain concerning possibilities for all men regardless of race. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  16. Personalized Search

    CERN Document Server

    AUTHOR|(SzGeCERN)749939

    2015-01-01

    As the volume of electronically available information grows, relevant items become harder to find. This work presents an approach to personalizing search results in scientific publication databases. It focuses on re-ranking search results from existing search engines such as Solr or ElasticSearch, and includes the development of Obelix, a new recommendation system used to re-rank search results. The project was proposed and performed at CERN, using the scientific publications available on the CERN Document Server (CDS). The work experiments with re-ranking using offline and online evaluation of users and documents in CDS. The experiments conclude that the personalized search results outperform both latest-first and word-similarity ordering in terms of click position in the search results for global search in CDS.
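    Re-ranking the output of an existing search engine, as described in this abstract, can be pictured as blending the engine's original order with a per-user relevance score. The sketch below is a generic illustration of that idea only: the abstract does not describe how Obelix scores documents, so the function name, the linear blend, and the `weight` parameter are all assumptions.

```python
def rerank(results, user_scores, weight=0.5):
    """Re-order search results using hypothetical per-user scores.

    results: document ids in the order returned by the search engine.
    user_scores: dict mapping document id -> personal relevance in [0, 1]
                 (e.g. from a recommender); missing ids count as 0.
    weight: 0 keeps the engine order, 1 ranks by user score alone.
    """
    n = len(results)

    def combined(item):
        position, doc = item
        base = 1.0 - position / n  # 1.0 for the top hit, lower further down
        return (1.0 - weight) * base + weight * user_scores.get(doc, 0.0)

    # sorted() is stable, so ties keep the engine's original order.
    ranked = sorted(enumerate(results), key=combined, reverse=True)
    return [doc for _, doc in ranked]


# Made-up example: document "c" is strongly preferred by this user.
print(rerank(["a", "b", "c"], {"c": 1.0}))
```

    A single blending weight keeps the engine's ranking as a fallback for documents the recommender knows nothing about.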

  17. A database of linear codes over F_13 with minimum distance bounds and new quasi-twisted codes from a heuristic search algorithm

    Directory of Open Access Journals (Sweden)

    Eric Z. Chen

    2015-01-01

    Full Text Available Error control codes have been widely used in data communications and storage systems. One central problem in coding theory is to optimize the parameters of a linear code and construct codes with best possible parameters. There are tables of best-known linear codes over finite fields of sizes up to 9. Recently, there has been a growing interest in codes over $\mathbb{F}_{13}$ and other fields of size greater than 9. The main purpose of this work is to present a database of best-known linear codes over the field $\mathbb{F}_{13}$ together with upper bounds on the minimum distances. To find good linear codes to establish lower bounds on minimum distances, an iterative heuristic computer search algorithm is employed to construct quasi-twisted (QT) codes over the field $\mathbb{F}_{13}$ with high minimum distances. A large number of new linear codes have been found, improving previously best-known results. Tables of $[pm, m]$ QT codes over $\mathbb{F}_{13}$ with best-known minimum distances as well as a table of lower and upper bounds on the minimum distances for linear codes of length up to 150 and dimension up to 6 are presented.
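    The quantity tabulated in this database, the minimum distance of a linear code over $\mathbb{F}_{13}$, can be computed directly for very small codes by enumerating every nonzero codeword. The sketch below does exactly that; it is a brute-force illustration, not the iterative heuristic search used in the paper, and the generator matrix in the usage example is a made-up toy code.

```python
from itertools import product


def minimum_distance(generator, q=13):
    """Minimum Hamming weight over all nonzero codewords of a linear code.

    generator: k x n generator matrix as a list of rows with entries in
               range(q); q must be prime so that arithmetic mod q is a field.
    Enumerates q**k - 1 messages, so it is only practical for small k.
    """
    k, n = len(generator), len(generator[0])
    best = n
    for message in product(range(q), repeat=k):
        if not any(message):
            continue  # skip the all-zero message
        # Codeword = message * generator, computed column by column mod q.
        codeword = [
            sum(m * row[j] for m, row in zip(message, generator)) % q
            for j in range(n)
        ]
        weight = sum(1 for symbol in codeword if symbol)
        best = min(best, weight)
    return best


# Toy [4, 2] code over F_13 (hypothetical example).
print(minimum_distance([[1, 0, 1, 1], [0, 1, 1, 2]]))
```

    For a linear code the minimum distance equals the minimum nonzero codeword weight, which is why weights rather than pairwise distances are enumerated.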

  18. Linkage of cDNA expression profiles of mesencephalic dopaminergic neurons to a genome-wide in situ hybridization database

    Directory of Open Access Journals (Sweden)

    Simon Horst H

    2009-01-01

    Full Text Available Abstract Midbrain dopaminergic neurons are involved in control of emotion, motivation and motor behavior. The loss of one of the subpopulations, substantia nigra pars compacta, is the pathological hallmark of one of the most prominent neurological disorders, Parkinson's disease. Several groups have looked at the molecular identity of midbrain dopaminergic neurons and have suggested the gene expression profile of these neurons. Here, after determining the efficiency of each screen, we provide a linked database of the genes, expressed in this neuronal population, by combining and comparing the results of six previous studies and verification of expression of each gene in dopaminergic neurons, using the collection of in situ hybridization in the Allen Brain Atlas.

  19. Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us SKIP Stemcell Database Database Description General information of database Database name SKIP Stemcell Database...rsity Journal Search: Contact address http://www.skip.med.keio.ac.jp/en/contact/ Database classification Human Genes and Diseases Dat...abase classification Stemcell Article Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database...ks: Original website information Database maintenance site Center for Medical Genetics, School of medicine, ...lable Web services Not available URL of Web services - Need for user registration Not available About This Database Database

  20. The future of forensic DNA analysis

    Science.gov (United States)

    Butler, John M.

    2015-01-01

    The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar to the Olympic motto of ‘faster, higher, stronger’, forensic DNA protocols can be expected to become more rapid and sensitive and provide stronger investigative potential. New short tandem repeat (STR) loci have expanded the core set of genetic markers used for human identification in Europe and the USA. Rapid DNA testing is on the verge of enabling new applications. Next-generation sequencing has the potential to provide greater depth of coverage for information on STR alleles. Familial DNA searching has expanded capabilities of DNA databases in parts of the world where it is allowed. Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles. PMID:26101278

  1. Accessing and using chemical databases

    DEFF Research Database (Denmark)

    Nikolov, Nikolai Georgiev; Pavlov, Todor; Niemelä, Jay Russell

    2013-01-01

    Computer-based representation of chemicals makes it possible to organize data in chemical databases-collections of chemical structures and associated properties. Databases are widely used wherever efficient processing of chemical information is needed, including search, storage, retrieval......, and dissemination. Structure and functionality of chemical databases are considered. The typical kinds of information found in a chemical database are considered-identification, structural, and associated data. Functionality of chemical databases is presented, with examples of search and access types. More details...... are included about the OASIS database and platform and the Danish (Q)SAR Database online. Various types of chemical database resources are discussed, together with a list of examples....

  2. Proteomic analysis of Pinus radiata needles: 2-DE map and protein identification by LC/MS/MS and substitution-tolerant database searching.

    Science.gov (United States)

    Valledor, Luis; Castillejo, Maria A; Lenz, Christof; Rodríguez, Roberto; Cañal, Maria J; Jorrín, Jesús

    2008-07-01

    Pinus radiata is one of the most economically important forest tree species, with a worldwide production of around 370 million m³ of wood per year. Current selection of elite trees to be used in conservation and breeding programmes requires the physiological and molecular characterization of available populations. To identify key proteins related to tree growth, productivity and responses to environmental factors, a proteomic approach is being utilized. In this paper, we present the first report of the 2-DE protein reference map of physiologically mature P. radiata needles, as a basis for subsequent differential expression proteomic studies related to growth, development, biomass production and responses to stresses. After TCA/acetone protein extraction of needle tissue, 549 ± 21 well-resolved spots were detected in Coomassie-stained gels within the 5-8 pH and 10-100 kDa M_r ranges. The analytical and biological variances determined for 450 spots were 31% and 42%, respectively. After LC/MS/MS analysis of in-gel tryptic digested spots, proteins were identified by using the novel Paragon algorithm, which tolerates amino acid substitution in the first-pass search. It allowed the confident identification of 115 out of the 150 protein spots subjected to MS, an unusually high percentage for a poor sequence database, as is the case for P. radiata. Proteins were classified into 12 or 18 groups based on their corresponding cell component or biological process/pathway categories, respectively. Carbohydrate metabolism and photosynthetic enzymes predominate in the 2-DE protein profile of P. radiata needles.

  3. Race and time from diagnosis to radical prostatectomy: does equal access mean equal timely access to the operating room?--Results from the SEARCH database.

    Science.gov (United States)

    Bañez, Lionel L; Terris, Martha K; Aronson, William J; Presti, Joseph C; Kane, Christopher J; Amling, Christopher L; Freedland, Stephen J

    2009-04-01

    African American men with prostate cancer are at higher risk for cancer-specific death than Caucasian men. We determine whether significant delays in management contribute to this disparity. We hypothesize that in an equal-access health care system, time interval from diagnosis to treatment would not differ by race. We identified 1,532 African American and Caucasian men who underwent radical prostatectomy (RP) from 1988 to 2007 at one of four Veterans Affairs Medical Centers that comprise the Shared Equal-Access Regional Cancer Hospital (SEARCH) database with known biopsy date. We compared time from biopsy to RP between racial groups using linear regression adjusting for demographic and clinical variables. We analyzed risk of potential clinically relevant delays by determining odds of delays >90 and >180 days. Median time interval from diagnosis to RP was 76 and 68 days for African Americans and Caucasian men, respectively (P = 0.004). After controlling for demographic and clinical variables, race was not associated with the time interval between diagnosis and RP (P = 0.09). Furthermore, race was not associated with increased risk of delays >90 (P = 0.45) or >180 days (P = 0.31). In a cohort of men undergoing RP in an equal-access setting, there was no significant difference between racial groups with regard to time interval from diagnosis to RP. Thus, equal-access includes equal timely access to the operating room. Given our previous finding of poorer outcomes among African Americans, treatment delays do not seem to explain these observations. Our findings need to be confirmed in patients electing other treatment modalities and in other practice settings.

  4. Pharmacovigilance database search discloses ClC-K channels as a novel target of the AT1 receptor blockers valsartan and olmesartan.

    Science.gov (United States)

    Imbrici, Paola; Tricarico, Domenico; Mangiatordi, Giuseppe Felice; Nicolotti, Orazio; Lograno, Marcello Diego; Conte, Diana; Liantonio, Antonella

    2017-07-01

    Human ClC-K chloride channels are highly attractive targets for drug discovery as they have a variety of important physiological functions and are associated with genetic disorders. These channels are crucial in the kidney as they control chloride reabsorption and water diuresis. In addition, loss-of-function mutations of CLCNKB and BSND genes cause Bartter's syndrome (BS), whereas CLCNKA and CLCNKB gain-of-function polymorphisms predispose to a rare form of salt sensitive hypertension. Both disorders lack a personalized therapy; treatment is in most cases only symptomatic. The aim of this study was to identify novel ClC-K ligands from drugs already on the market, by exploiting the pharmacological side activity of drug molecules available from the FDA Adverse Effects Reporting System database. We searched for drugs having a Bartter-like syndrome as a reported side effect, with the assumption that BS could be causatively related to the block of ClC-K channels. The ability of the selected BS-causing drugs to bind and block ClC-K channels was then validated through an integrated experimental and computational approach based on patch clamp electrophysiology in HEK293 cells and molecular docking simulations. Valsartan and olmesartan were able to block ClC-Ka channels and the molecular requirements for effective inhibition of these channels have been identified. These results suggest additional mechanisms of action for these sartans beyond their primary AT1 receptor antagonism and propose these compounds as leads for designing new potent ClC-K ligands. © 2017 The British Pharmacological Society.

  5. Delayed radical prostatectomy for intermediate-risk prostate cancer is associated with biochemical recurrence: possible implications for active surveillance from the SEARCH database.

    Science.gov (United States)

    Abern, Michael R; Aronson, William J; Terris, Martha K; Kane, Christopher J; Presti, Joseph C; Amling, Christopher L; Freedland, Stephen J

    2013-03-01

    Active surveillance (AS) is increasingly accepted as appropriate management for low-risk prostate cancer (PC) patients. It is unknown whether delaying radical prostatectomy (RP) is associated with increased risk of biochemical recurrence (BCR) for men with intermediate-risk PC. We performed a retrospective analysis of 1,561 low and intermediate-risk men from the Shared Equal Access Regional Cancer Hospital (SEARCH) database treated with RP between 1988 and 2011. Patients were stratified by interval between diagnosis and RP (≤ 3, 3-6, 6-9, or >9 months) and by risk using the D'Amico classification. Cox proportional hazard models were used to analyze BCR. Logistic regression was used to analyze positive surgical margins (PSM), extracapsular extension (ECE), and pathologic upgrading. Overall, 813 (52%) men were low-risk, and 748 (48%) intermediate-risk. Median follow-up among men without recurrence was 52.9 months, during which 437 men (38.9%) recurred. For low-risk men, RP delays were unrelated to BCR, ECE, PSM, or upgrading (all P > 0.05). For intermediate-risk men, however, delays >9 months were significantly related to BCR (HR: 2.10, P = 0.01) and PSM (OR: 4.08, P 9 months were associated with BCR in subsets of intermediate-risk men with biopsy Gleason score ≤ 3 + 4 (HR: 2.51, P 9 months predicted greater BCR and PSM risk. If confirmed in future studies, this suggests delayed RP for intermediate-risk PC may compromise outcomes. Copyright © 2012 Wiley Periodicals, Inc.

  6. Scopus database: a review.

    Science.gov (United States)

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but the two complement each other. If a library can afford only one, the choice must be based on institutional needs.

  7. Laser microbeams for DNA damage induction, optical tweezers for the search on blood pressure relaxing drugs: contributions to ageing research

    Science.gov (United States)

    Grigaravicius, P.; Monajembashi, S.; Hoffmann, M.; Altenberg, B.; Greulich, K. O.

    2009-08-01

    One essential cause of human ageing is the accumulation of DNA damage over a lifetime. Experimental studies require quantitative induction of damage and techniques to visualize the subsequent DNA repair. A new technique, the "immuno fluorescent comet assay", is used to directly visualize DNA damage under the microscope. Using DNA repair proteins fluorescently labeled with green fluorescent protein, it could be shown that the repair of the most dangerous DNA double strand breaks starts with the inaccurate "non homologous end joining" pathway and only after 1 to 1½ minutes may switch to the more accurate "homologous recombination repair". One might suggest investigating whether centenarians use "homologous recombination repair" differently from those ageing at earlier years and speculate whether it is possible, for example by nutrition, to shift DNA repair to a better use of the error free pathway and thus promote healthy ageing. As a complementary technique, optical tweezers, and particularly the variant "erythrocyte mediated force application", are used to simulate the effects of blood pressure on HUVEC cells representing the inner lining of human blood vessels. Stimulating one cell induces waves of calcium and nitric oxide, known to relax blood vessels, throughout the neighbourhood. Nifedipine and amlodipine, both used as drugs in the therapy of high blood pressure, primarily a disease of the elderly, prolong the availability of nitric oxide. This partially explains their mode of action. In contrast, verapamil, also a blood pressure reducing drug, does not show this effect, indicating that an alternative mechanism must be responsible for vessel relaxation.

  8. Expectations and experiences of gamete donors and donor-conceived adults searching for genetic relatives using DNA linking through a voluntary register.

    Science.gov (United States)

    van den Akker, O B A; Crawshaw, M A; Blyth, E D; Frith, L J

    2015-01-01

    What are the experiences of donor-conceived adults and donors who are searching for a genetic link through the use of a DNA-based voluntary register service? Donor-conceived adults and donors held positive beliefs about their search and although some concerns in relation to finding a genetically linked relative were reported, these were not a barrier to searching. Research with donor-conceived people has consistently identified their interest in learning about-and in some cases making contact with-their donor and other genetic relatives. However, donor-conceived individuals or donors rarely have the opportunity to act on these desires. A questionnaire was administered for online completion using Bristol Online Surveys. The survey was live for 3 months and responses were collected anonymously. The survey was completed by 65 donor-conceived adults, 21 sperm donors and 5 oocyte donors who had registered with a DNA-based voluntary contact register in the UK. The questionnaire included socio-demographic questions, questions specifically developed for the purposes of this study and the standardized Aspects of Identity Questionnaire (AIQ). Motivations for searching for genetic relatives were varied, with the most common reasons being curiosity and passing on information. Overall, participants who were already linked and those awaiting a link were positive about being linked and valued access to a DNA-based register. Collective identity (reflecting self-defining feelings of continuity and uniqueness), as assessed by the AIQ, was significantly lower for donor-conceived adults when compared with the donor groups (P 0.05) for donor-conceived adults. Participants were members of a UK DNA-based registry which is unique. It was therefore not possible to determine how representative participants were of those who did not register for the service, those in other countries or of those who do not seek information exchange or contact. This is the first survey exploring the

  9. Is DNA a worm-like chain in Couette flow? In search of persistence length, a critical review.

    Science.gov (United States)

    Rittman, Martyn; Gilroy, Emma; Koohya, Hashem; Rodger, Alison; Richards, Adair

    2009-01-01

    Persistence length is the foremost measure of DNA flexibility. Its origins lie in polymer theory, which was adapted for DNA following the determination of the B-DNA structure in 1953. There is no single definition of persistence length in use, and the links between published definitions are based on assumptions which may or may not be clearly stated. DNA flexibility is affected by local ionic strength, solvent environment, bound ligands and intrinsic sequence-dependent flexibility. This article is a review of persistence length providing a mathematical treatment of the relationships between four definitions of persistence length: correlation, Kuhn length, bending, and curvature. Persistence length has been measured using various microscopy, force extension and solution methods such as linear dichroism and transient electric birefringence. For each experimental method a model of DNA is required to interpret the data. The importance of understanding the underlying models, along with the assumptions required by each definition to determine a value of persistence length, is highlighted for linear dichroism data, where it transpires that no model is currently available for long DNA or medium to high shear rate experiments.
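
    As a point of orientation for the definitions named above, the standard textbook relations for an ideal worm-like chain in three dimensions can be written as follows. This is a sketch of the idealized model only, not the article's own treatment; the symbols (P for persistence length, s for contour separation, b for Kuhn length, \kappa for bending rigidity) are chosen here for illustration, and the curvature-based definition is not shown.

```latex
\begin{align}
  \langle \hat{t}(0) \cdot \hat{t}(s) \rangle &= e^{-s/P}
    && \text{(tangent correlation along the contour)} \\
  b &= 2P
    && \text{(Kuhn length)} \\
  P &= \frac{\kappa}{k_B T}
    && \text{(bending rigidity)}
\end{align}
```

    Each relation holds only under the ideal worm-like chain assumptions, which is the kind of unstated dependency the review draws attention to.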

  10. Astronomical databases of Nikolaev Observatory

    Science.gov (United States)

    Protsyuk, Y.; Mazhaev, A.

    2008-07-01

    Several astronomical databases have been created at Nikolaev Observatory in recent years. The databases are built using the MySQL database system and PHP scripts. They are available on the NAO website, http://www.mao.nikolaev.ua.

  11. Additional approaches to DNA typing of skeletal remains: the search for "missing" persons killed during the last dictatorship in Argentina.

    Science.gov (United States)

    Corach, D; Sala, A; Penacino, G; Iannucci, N; Bernardi, P; Doretti, M; Fondebrider, L; Ginarte, A; Inchaurregui, A; Somigliana, C; Turner, S; Hagelberg, E

    1997-08-01

    DNA typing techniques are among the most advanced tools for human identification and can contribute to the identification of poorly preserved skeletal remains. Ten thousand people are thought to have been killed during the last dictatorship in Argentina (1976-1983) and there are few official records on the identity of the victims or the location of burials. A mass grave containing 340 skeletons was excavated using archeological methods. A small number of individuals was identified by traditional forensic methods and one family group by mitochondrial DNA (mtDNA) analysis. Due to the lack of antemortem physical information on many of the victims, the application of molecular methods is imperative to speed up the identification process. We have tested two molecular screening methods, Y chromosome-specific short tandem repeats (DYS19, DYS385, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS393) and amplification of autosomal microsatellites using nested primers. These methods can complement solely matrilineal mtDNA sequence data in the identification of "missing" persons.

  12. Constructing Effective Search Strategies for Electronic Searching.

    Science.gov (United States)

    Flanagan, Lynn; Parente, Sharon Campbell

    Electronic databases have grown tremendously in both number and popularity since their development during the 1960s. Access to electronic databases in academic libraries was originally offered primarily through mediated search services by trained librarians; however, the advent of CD-ROM and end-user interfaces for online databases has shifted the…

  13. Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Database Description General information of database Database name Trypanosomes Database...stitute of Genetics Research Organization of Information and Systems Yata 1111, Mishima, Shizuoka 411-8540, JAPAN E mail: Database...y Name: Trypanosoma Taxonomy ID: 5690 Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description The... Article title: Author name(s): Journal: External Links: Original website information Database maintenance s...DB (Protein Data Bank) KEGG PATHWAY Database DrugPort Entry list Available Query search Available Web servic

  14. Atomic Spectra Database (ASD)

    Science.gov (United States)

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  15. Database in Artificial Intelligence.

    Science.gov (United States)

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  16. Online Patent Searching: The Realities.

    Science.gov (United States)

    Kaback, Stuart M.

    1983-01-01

    Considers patent subject searching capabilities of major online databases, noting patent claims, "deep-indexed" files, test searches, retrieval of related references, multi-database searching, improvements needed in indexing of chemical structures, full text searching, improvements needed in handling numerical data, and augmenting a…

  17. Search for infective mammalian type-C virus-related genes in the DNA of human sarcomas and leukemias.

    Science.gov (United States)

    Nicolson, M O; Gilden, R V; Charman, H; Rice, N; Heberling, R; McAllister, R M

    1978-06-15

    DNA was extracted from two human sarcoma cell lines, TE-32 and TE-418, and the leukemic cells from five children with acute myelocytic leukemia, three children with acute lymphocytic leukemia and four adults with acute myelocytic leukemia. The DNAs, assayed for infectivity by transfection techniques, induced no measurable virus by methods which would detect known mammalian C-type antigens or RNA-directed DNA polymerase in TE-32, D-17 dog cells and other indicator cells, nor did they recombine with or rescue endogenous human or exogenous murine or baboon type-C virus. Model systems used as controls were human sarcoma cells, TE-32 and HT-1080, and human lymphoma cells TE-543, experimentally infected with KiMuLV, GaLV or baboon type-C virus, all of which released infectious virus and whose DNAs were infectious for TE-32 and D-17 dog cells. Other model systems included two baboon placentas and one embryonic cell strain spontaneously releasing infectious endogenous baboon virus and yielding DNAs infectious for D-17 dog cells but not for TE-32 cells. Four other baboon embryonic tissues and two embryonic cell strains, releasing either low levels of virus or no virus, did not yield infectious DNA.

  18. Literature database aid

    International Nuclear Information System (INIS)

    Wanderer, J.A.

    1991-01-01

    The booklet helps with the acquisition of original literature, either after a conventional literature search or, in particular, after a database search. It bridges the gap between abbreviated (short) and original (long) titles. This, together with information on the holdings of technical/scientific libraries, facilitates document delivery. 1500 short titles are listed alphabetically. (orig.) [de

  19. PRODORIC2: the bacterial gene regulation database in 2018

    Science.gov (United States)

    Dudek, Christian-Alexander; Hartlich, Juliane; Brötje, David; Jahn, Dieter

    2018-01-01

    Bacteria adapt to changes in their environment via differential gene expression mediated by DNA binding transcriptional regulators. The PRODORIC2 database hosts one of the largest collections of DNA binding sites for prokaryotic transcription factors. It is the result of the thoroughly redesigned PRODORIC database. PRODORIC2 is more intuitive and user-friendly. Besides significant technical improvements, the new update offers more than 1000 new transcription factor binding sites and 110 new position weight matrices for genome-wide pattern searches with the Virtual Footprint tool. Moreover, binding sites deduced from high-throughput experiments were included. Data for 6 new bacterial species including bacteria of the Rhodobacteraceae family were added. Finally, a comprehensive collection of sigma- and transcription factor data for the nosocomial pathogen Clostridium difficile is now part of the database. PRODORIC2 is publicly available at http://www.prodoric2.de. PMID:29136200
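
    To illustrate what a pattern search with a position weight matrix involves in general, here is a minimal Python sketch. It is not the Virtual Footprint implementation and does not use any PRODORIC2 data or API: the function names, the example binding sites, the pseudocount, the uniform background and the score threshold are all assumptions made for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + 4 * pseudocount
        # Log2 ratio of the smoothed base frequency to a uniform background.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (offset, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical aligned binding sites (illustrative only, not from any database).
sites = ["TTGACA", "TTGACT", "TTTACA", "TTGACA"]
pwm = build_pwm(sites)
hits = scan("GGCTTGACAGG", pwm, threshold=5.0)
print(hits)
```

    The sketch scans one strand only and expects sequences restricted to A, C, G and T; a practical tool would also handle the reverse complement and ambiguous bases.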

  20. Searching for cellular partners of hantaviral nonstructural protein NSs: Y2H screening of mouse cDNA library and analysis of cellular interactome.

    Science.gov (United States)

    Rönnberg, Tuomas; Jääskeläinen, Kirsi; Blot, Guillaume; Parviainen, Ville; Vaheri, Antti; Renkonen, Risto; Bouloy, Michele; Plyusnin, Alexander

    2012-01-01

    Hantaviruses (Bunyaviridae) are negative-strand RNA viruses with a tripartite genome. The small (S) segment encodes the nucleocapsid protein and, in some hantaviruses, also the nonstructural protein (NSs). The aim of this study was to find potential cellular partners for the hantaviral NSs protein. Toward this aim, yeast two-hybrid (Y2H) screening of mouse cDNA library was performed followed by a search for potential NSs protein counterparts via analyzing a cellular interactome. The resulting interaction network was shown to form logical, clustered structures. Furthermore, several potential binding partners for the NSs protein, for instance ACBD3, were identified and, to prove the principle, interaction between NSs and ACBD3 proteins was demonstrated biochemically.

  1. License - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...thout notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - SSBD | LSDB Archive ...

  2. Download - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...cess [here]. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - PSCDB | LSDB Archive ...

  3. License - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - ASTRA | LSDB Archive ...

  4. License - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - JSNP | LSDB Archive ...

  5. License - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...out This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - KOME | LSDB Archive ...

  6. Download - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...is Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - ASTRA | LSDB Archive ...

  7. License - RGP gmap | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...nged without notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - RGP gmap | LSDB Archive ...

  8. License - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...ut notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - SAHG | LSDB Archive ...

  9. Download - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...t This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - RED | LSDB Archive ...

  10. Download - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...t This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - GRIPDB | LSDB Archive ...

  11. License - RPSD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...thout notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - RPSD | LSDB Archive ...

  12. License - RMOS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...out This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - RMOS | LSDB Archive ...

  13. Our love-hate relationship with DNA barcodes, the Y2K problem, and the search for next generation barcodes

    Directory of Open Access Journals (Sweden)

    Jeffrey M. Marcus

    2018-01-01

    Full Text Available DNA barcodes are very useful for species identification especially when identification by traditional morphological characters is difficult. However, the short mitochondrial and chloroplast barcodes currently in use often fail to distinguish between closely related species, are prone to lateral transfer, and provide inadequate phylogenetic resolution, particularly at deeper nodes. The deficiencies of short barcode identifiers are similar to the deficiencies of the short year identifiers that caused the Y2K problem in computer science. The resolution of the Y2K problem was to increase the size of the year identifiers. The performance of conventional mitochondrial COI barcodes for phylogenetics was compared with the performance of complete mitochondrial genomes and nuclear ribosomal RNA repeats obtained by genome skimming for a set of caddisfly taxa (Insect Order Trichoptera). The analysis focused on Trichoptera Family Hydropsychidae, the net-spinning caddisflies, which demonstrates many of the frustrating limitations of current barcodes. To conduct phylogenetic comparisons, complete mitochondrial genomes (15 kb each) and nuclear ribosomal repeats (9 kb each) from six caddisfly species were sequenced, assembled, and are reported for the first time. These sequences were analyzed in comparison with eight previously published trichopteran mitochondrial genomes and two trichopteran rRNA repeats, plus outgroup sequences from sister clade Lepidoptera (butterflies and moths). COI trees were not well-resolved, had low bootstrap support, and differed in topology from prior phylogenetic analyses of the Trichoptera. Phylogenetic trees based on mitochondrial genomes or rRNA repeats were well-resolved with high bootstrap support and were largely congruent with each other. 
Because they are easily sequenced by genome skimming, provide robust phylogenetic resolution at various phylogenetic depths, can better distinguish between closely related species, and (in the

  14. The DExH/D protein family database.

    Science.gov (United States)

    Jankowsky, E; Jankowsky, A

    2000-01-01

    DExH/D proteins are essential for all aspects of cellular RNA metabolism and processing, in the replication of many viruses and in DNA replication. DExH/D proteins are subject to current biological, biochemical and biophysical research which provides a continuous wealth of data. The DExH/D protein family database compiles this information and makes it available over the WWW (http://www.columbia.edu/ ej67/dbhome.htm ). The database can be fully searched by text based queries, facilitating fast access to specific information about this important class of enzymes.

  15. Custom Search Engines: Tools & Tips

    Science.gov (United States)

    Notess, Greg R.

    2008-01-01

    Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…

  16. Construtores da bio(in)segurança na base de dados de perfis de ADN / Constructors of bio(in)security in the DNA profiles database

    Directory of Open Access Journals (Sweden)

    Helena Machado

    2011-02-01

    Full Text Available This paper analyses the discourses produced by experts and politicians about the creation of a DNA profile database for civil identification and criminal investigation purposes in Portugal, aiming to explore some stances in the construction of biosecurity. Three main types of arguments are found: science as the support of a simultaneously more effective and more credible justice; the need to follow the course of more developed countries in terms of criminal investigation and cross-border cooperation; and the contribution towards the common good. This concerns an increasingly global technical-genetic and biopolitical project, embedded in collective imaginaries built upon fears of crime and criminals, which is grounded more on promises of imagined utility and efficacy in the identification of offenders than on the invocation of risks and uncertainties.

  17. A search for pre-main sequence stars in the high-latitude molecular clouds. II - A survey of the Einstein database

    Science.gov (United States)

    Caillault, Jean-Pierre; Magnani, Loris

    1990-01-01

    Preliminary results are reported from a survey of every EINSTEIN image that overlaps any high-latitude molecular cloud, in a search for X-ray emitting pre-main sequence stars. This survey, together with complementary KPNO and IRAS data, will allow a determination of how prevalent low mass star formation is in these clouds in general and in the translucent molecular clouds in particular.

  18. DMPD: Signal transduction pathways mediated by the interaction of CpG DNA with Toll-like receptor 9. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 14751759 Signal transduction pathways mediated by the interaction of CpG DNA withTo...;16(1):17-22. (.png) (.svg) (.html) (.csml) Show Signal transduction pathways mediated by the interaction of... CpG DNA with Toll-like receptor 9. PubmedID 14751759 Title Signal transduction pathways media

  19. ALFRED: An Allele Frequency Database for Microevolutionary Studies

    Directory of Open Access Journals (Sweden)

    Kenneth K Kidd

    2005-01-01

    Full Text Available Many kinds of microevolutionary studies require data on multiple polymorphisms in multiple populations. Increasingly, and especially for human populations, multiple research groups collect relevant data and those data are dispersed widely in the literature. ALFRED has been designed to hold data from many sources and make them available over the web. Data are assembled from multiple sources, curated, and entered into the database. Multiple links to other resources are also established by the curators. A variety of search options are available and additional geographic based interfaces are being developed. The database can serve the human anthropologic genetic community by identifying what loci are already typed on many populations thereby helping to focus efforts on a common set of markers. The database can also serve as a model for databases handling similar DNA polymorphism data for other species.

  20. DENdb: database of integrated human enhancers

    KAUST Repository

    Ashoor, Haitham

    2015-09-05

    Enhancers are cis-acting DNA regulatory regions that play a key role in distal control of transcriptional activities. Identification of enhancers, coupled with a comprehensive functional analysis of their properties, could improve our understanding of complex gene transcription mechanisms and gene regulation processes in general. We developed DENdb, a centralized on-line repository of predicted enhancers derived from multiple human cell-lines. DENdb integrates enhancers predicted by five different methods generating an enriched catalogue of putative enhancers for each of the analysed cell-lines. DENdb provides information about the overlap of enhancers with DNase I hypersensitive regions, ChIP-seq regions of a number of transcription factors and transcription factor binding motifs, means to explore enhancer interactions with DNA using several chromatin interaction assays and enhancer neighbouring genes. DENdb is designed as a relational database that facilitates fast and efficient searching, browsing and visualization of information.

  1. DENdb: database of integrated human enhancers

    KAUST Repository

    Ashoor, Haitham; Kleftogiannis, Dimitrios A.; Radovanovic, Aleksandar; Bajic, Vladimir B.

    2015-01-01

    Enhancers are cis-acting DNA regulatory regions that play a key role in distal control of transcriptional activities. Identification of enhancers, coupled with a comprehensive functional analysis of their properties, could improve our understanding of complex gene transcription mechanisms and gene regulation processes in general. We developed DENdb, a centralized on-line repository of predicted enhancers derived from multiple human cell-lines. DENdb integrates enhancers predicted by five different methods generating an enriched catalogue of putative enhancers for each of the analysed cell-lines. DENdb provides information about the overlap of enhancers with DNase I hypersensitive regions, ChIP-seq regions of a number of transcription factors and transcription factor binding motifs, means to explore enhancer interactions with DNA using several chromatin interaction assays and enhancer neighbouring genes. DENdb is designed as a relational database that facilitates fast and efficient searching, browsing and visualization of information.

  2. Alignment of high-throughput sequencing data inside in-memory databases.

    Science.gov (United States)

    Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

    2014-01-01

    In an era of high-throughput DNA sequencing, high-performance analysis of DNA sequences is of great importance. Computer-supported DNA analysis is still a time-consuming task. In this paper we explore the potential of a new in-memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL was also run in memory, using its integrated memory engine for database table creation. We implemented stored procedures for exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, the performance of HANA and MySQL was compared using the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which indicates that the new in-memory concepts have high potential and may lead to further developments of DNA analysis procedures in the future.
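    To make the exact-search step concrete, the following is a minimal Python sketch of backward search over a Burrows-Wheeler transform, the general idea behind exact matching in BWA-style aligners. It is not the stored-procedure code from the paper; the function names `build_fm_index` and `exact_search` are hypothetical, and the naive suffix sorting is only suitable for short illustrative strings, not a reference genome.

```python
def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, occurrence table)."""
    text += "$"  # sentinel, sorts before A/C/G/T
    # Naive suffix array: fine for a demo, far too slow for real genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def exact_search(pattern, sa, C, occ):
    """Return sorted start positions of all exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffixes
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


reference = "ACGTACGTTACG"
sa, C, occ = build_fm_index(reference)
hits = exact_search("ACG", sa, C, occ)
```

    Each pattern character narrows the interval of matching suffixes, so the search cost depends on the read length rather than on the size of the reference. Inexact matching extends this scheme by branching over mismatches, which is the recursive part the abstract reports could not be implemented in HANA.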

  3. JICST Factual Database: JICST Chemical Substance Safety Regulation Database

    Science.gov (United States)

    Abe, Atsushi; Sohma, Tohru

    The JICST Chemical Substance Safety Regulation Database is based on the Database of Safety Laws for Chemical Compounds, constructed by the Japan Chemical Industry Ecology-Toxicology & Information Center (JETOC) under the sponsorship of the Science and Technology Agency in 1987. JICST modified the JETOC database system, added data, and started the online service through JOIS-F (JICST Online Information Service-Factual database) in January 1990. The JICST database comprises eighty-three laws and fourteen hundred compounds. The authors outline the database, data items, files and search commands. An example of an online session is presented.

  4. Specialist Bibliographic Databases.

    Science.gov (United States)

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  5. Specialist Bibliographic Databases

    Science.gov (United States)

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  6. A search for pre-main-sequence stars in high-latitude molecular clouds. 3: A survey of the Einstein database

    Science.gov (United States)

    Caillault, Jean-Pierre; Magnani, Loris; Fryer, Chris

    1995-01-01

    In order to discern whether the high-latitude molecular clouds are regions of ongoing star formation, we have used X-ray emission as a tracer of youthful stars. The entire Einstein database yields 18 images which overlap 10 of the clouds mapped partially or completely in the CO (1-0) transition, providing a total of approximately 6 square degrees of overlap. Five previously unidentified X-ray sources were detected: one has an optical counterpart which is a pre-main-sequence (PMS) star, and two have normal main-sequence stellar counterparts, while the other two are probably extragalactic sources. The PMS star is located in a high Galactic latitude Lynds dark cloud, so this result is not too surprising. The translucent clouds, though, have yet to reveal any evidence of star formation.

  7. JICST Factual Database(2)

    Science.gov (United States)

    Araki, Keisuke

    A computer programme that builds atom-bond connection tables from chemical nomenclature has been developed. Chemical substances are entered together with their nomenclature and any trivial names or experimental code numbers. The chemical structures in the database are stored stereospecifically and can be searched and displayed according to stereochemistry. Source data come from the laws and regulations of Japan, RTECS of the US, and so on. The database plays a central role within the integrated factual database service of JICST and makes interrelational retrieval possible.

  8. Optimization of partial search

    International Nuclear Information System (INIS)

    Korepin, Vladimir E

    2005-01-01

    A quantum Grover search algorithm can find a target item in a database faster than any classical algorithm. One can trade accuracy for speed and find a part of the database (a block) containing the target item even faster; this is partial search. A partial search algorithm was recently suggested by Grover and Radhakrishnan. Here we optimize it. Efficiency of the search algorithm is measured by the number of queries to the oracle. The author suggests a new version of the Grover-Radhakrishnan algorithm which uses a minimal number of such queries. The algorithm can run on the same hardware that is used for the usual Grover algorithm. (letter to the editor)
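    As a rough illustration of how efficiency is counted in oracle queries, the sketch below classically simulates the standard (full) Grover search on a small database. It is not the optimized partial-search algorithm described in the abstract; the function name `grover_search` and the choice of 64 items are assumptions made only for this example.

```python
import math


def grover_search(n_items, target):
    """Simulate Grover's algorithm; return (oracle queries, success probability)."""
    # Start in the uniform superposition over all items.
    amp = [1 / math.sqrt(n_items)] * n_items
    # Standard iteration count, about (pi/4) * sqrt(N).
    iterations = math.floor(math.pi / 4 * math.sqrt(n_items))
    for _ in range(iterations):
        amp[target] = -amp[target]           # one oracle query: flip target sign
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]    # diffusion: inversion about the mean
    return iterations, amp[target] ** 2


queries, probability = grover_search(64, target=5)
```

    For 64 items the loop makes 6 oracle queries, whereas a classical scan needs 32 lookups on average. Partial search gives up knowledge of the exact position within a block in exchange for a further reduction in the number of queries.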

  9. International Society of Human and Animal Mycology (ISHAM)-ITS reference DNA barcoding database--the quality controlled standard tool for routine identification of human and animal pathogenic fungi.

    Science.gov (United States)

    Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland

    2015-05-01

    Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for a rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification that is less demanding in terms of taxonomical expertise. However, its widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality-controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens. © The Author 2015. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  10. Detection and identification of drugs and toxicants in human body fluids by liquid chromatography-tandem mass spectrometry under data-dependent acquisition control and automated database search.

    Science.gov (United States)

    Oberacher, Herbert; Schubert, Birthe; Libiseller, Kathrin; Schweissgut, Anna

    2013-04-03

    Systematic toxicological analysis (STA) is aimed at detecting and identifying all substances of toxicological relevance (i.e. drugs, drugs of abuse, poisons and/or their metabolites) in biological material. Particularly, gas chromatography-mass spectrometry (GC/MS) represents a competent and commonly applied screening and confirmation tool. Herein, we present an untargeted liquid chromatography-tandem mass spectrometry (LC/MS/MS) assay aimed to complement existing GC/MS screening for the detection and identification of drugs in blood, plasma and urine samples. Solid-phase extraction was accomplished on mixed-mode cartridges. LC was based on gradient elution in a miniaturized C18 column. High resolution electrospray ionization-MS/MS in positive ion mode with data-dependent acquisition control was used to generate tandem mass spectral information that enabled compound identification via automated library search in the "Wiley Registry of Tandem Mass Spectral Data, MSforID". Fitness of the developed LC/MS/MS method for application in STA in terms of selectivity, detection capability and reliability of identification (sensitivity/specificity) was demonstrated with blank samples, certified reference materials, proficiency test samples, and authentic casework samples. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Relational databases

    CERN Document Server

    Bell, D A

    1986-01-01

    Relational Databases explores the major advances in relational databases and provides a balanced analysis of the state of the art in relational databases. Topics covered include capture and analysis of data placement requirements; distributed relational database systems; data dependency manipulation in database schemata; and relational database support for computer graphics and computer aided design. This book is divided into three sections and begins with an overview of the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration. The

  12. Update History of This Database - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available KOME Update History of This Database. Date / Update contents: 2014/10/22 The URL of the whole da...site is opened. 2003/07/18 KOME ( http://cdna01.dna.affrc.go.jp/cDNA/ ) is opened.

  13. Design of a Bioactive Small Molecule that Targets the Myotonic Dystrophy Type 1 RNA Via an RNA Motif-Ligand Database & Chemical Similarity Searching

    Science.gov (United States)

    Parkesh, Raman; Childs-Disney, Jessica L.; Nakamori, Masayuki; Kumar, Amit; Wang, Eric; Wang, Thomas; Hoskins, Jason; Tran, Tuan; Housman, David; Thornton, Charles A.; Disney, Matthew D.

    2012-01-01

    Myotonic dystrophy type 1 (DM1) is a triplet repeating disorder caused by expanded CTG repeats in the 3′ untranslated region of the dystrophia myotonica protein kinase (DMPK) gene. The transcribed repeats fold into an RNA hairpin with multiple copies of a 5′CUG/3′GUC motif that binds the RNA splicing regulator muscleblind-like 1 protein (MBNL1). Sequestration of MBNL1 by expanded r(CUG) repeats causes splicing defects in a subset of pre-mRNAs including the insulin receptor, the muscle-specific chloride ion channel, Sarco(endo)plasmic reticulum Ca2+ ATPase 1 (Serca1/Atp2a1), and cardiac troponin T (cTNT). Based on these observations, the development of small molecule ligands that target specifically expanded DM1 repeats could serve as therapeutics. In the present study, computational screening was employed to improve the efficacy of pentamidine and Hoechst 33258 ligands that have been shown previously to target the DM1 triplet repeat. A series of inhibitors of the RNA-protein complex with low micromolar IC50’s, which are >20-fold more potent than the query compounds, were identified. Importantly, a bis-benzimidazole identified from the Hoechst query improves DM1-associated pre-mRNA splicing defects in cell and mouse models of DM1 (when dosed with 1 mM and 100 mg/kg, respectively). Since Hoechst 33258 was identified as a DM1 binder through analysis of an RNA motif-ligand database, these studies suggest that lead ligands targeting RNA with improved biological activity can be identified by using a synergistic approach that combines analysis of known RNA-ligand interactions with virtual screening. PMID:22300544
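    Chemical similarity searching of the kind named in the title is commonly done by ranking library compounds against a query using a fingerprint similarity score. The sketch below uses the Tanimoto coefficient on sets of feature identifiers. This is an assumption for illustration: the abstract does not state which similarity measure or fingerprints the authors used, and the compound names and feature sets here are invented placeholders, not data from the study.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical fingerprints: each integer stands for one structural feature.
query_fp = {1, 2, 3, 4}
library = {
    "compound_a": {1, 2, 3, 4, 5},
    "compound_b": {3, 4, 7, 8},
    "compound_c": {9, 10},
}
ranking = rank_by_similarity(query_fp, library)
```

    In a screening workflow the top-ranked compounds would then be passed on to experimental testing, which is where potency gains such as those reported in the abstract are actually measured.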

  14. Identification of specific markers for amphetamine synthesised from the pre-precursor APAAN following the Leuckart route and retrospective search for APAAN markers in profiling databases from Germany and the Netherlands.

    Science.gov (United States)

    Hauser, Frank M; Rößler, Thorsten; Hulshof, Janneke W; Weigel, Diana; Zimmermann, Ralf; Pütz, Michael

    2018-04-01

    α-Phenylacetoacetonitrile (APAAN) is one of the most important pre-precursors for amphetamine production in recent years. This assumption is based on seizure data but there is little analytical data available showing how much amphetamine really originated from APAAN. In this study, several syntheses of amphetamine following the Leuckart route were performed starting from different organic compounds including APAAN. The organic phases were analysed using gas chromatography-mass spectrometry (GC-MS) to search for signals caused by possible APAAN markers. Three compounds were discovered, isolated, and based on the performed syntheses it was found that they are highly specific for the use of APAAN. Using mass spectra, high resolution MS and nuclear magnetic resonance (NMR) data the compounds were characterised and identified as 2-phenyl-2-butenenitrile, 3-amino-2-phenyl-2-butenenitrile, and 4-amino-6-methyl-5-phenylpyrimidine. To investigate their significance, they were searched in data from seized amphetamine samples to determine to what extent they were present in illicitly produced amphetamine. Data of more than 580 cases from amphetamine profiling databases in Germany and the Netherlands were used for this purpose. These databases allowed analysis of the yearly occurrence of the markers going back to 2009. The markers revealed a trend that was in agreement with seizure reports and reflected an increasing use of APAAN from 2010 on. This paper presents experimental proof that APAAN is indeed the most important pre-precursor of amphetamine in recent years. It also illustrates how important it is to look for new ways to identify current trends in drug production since such trends can change within a few years. Copyright © 2017 John Wiley & Sons, Ltd.

  15. Standardization of Keyword Search Mode

    Science.gov (United States)

    Su, Di

    2010-01-01

    In spite of its popularity, keyword search mode has not been standardized. Though information professionals are quick to adapt to various presentations of keyword search mode, novice end-users may find keyword search confusing. This article compares keyword search mode in some major reference databases and calls for standardization. (Contains 3…

  16. Biofuel Database

    Science.gov (United States)

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  17. Community Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This excel spreadsheet is the result of merging at the port level of several of the in-house fisheries databases in combination with other demographic databases such...

  18. Database Administrator

    Science.gov (United States)

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  19. Update History of This Database - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available SSBD Update History of This Database. Date / Update contents: 2016/07/25 SSBD English archive site is opened. 2013/09/03 SSBD ( http://ssbd.qbic.riken.jp/ ) is opened.

  20. Update History of This Database - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available SAHG Update History of This Database. Date / Update contents: 2016/05/09 SAHG English archive site is opened. 2009/10 SAHG ( http://bird.cbrc.jp/sahg ) is opened.

  1. Update History of This Database - DMPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available DMPD Update History of This Database. Date / Update contents: 2010/03/29 DMPD English archive si....jp/macrophage/ ) is released.

  2. Ocean Drilling Program: Janus Web Database

    Science.gov (United States)

    JANUS Database Send questions/comments about the online database Request data not available online Janus database Search the ODP/TAMU web site ODP's main web site Janus Data Model Data Migration Overview in Janus Data Types and Examples Leg 199, sunrise. Janus Web Database ODP and IODP data are stored in

  3. Online Petroleum Industry Bibliographic Databases: A Review.

    Science.gov (United States)

    Anderson, Margaret B.

    This paper discusses the present status of the bibliographic database industry, reviews the development of online databases of interest to the petroleum industry, and considers future developments in online searching and their effect on libraries and information centers. Three groups of databases are described: (1) databases developed by the…

  4. Database Description - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us tRNAD...B-CE Database Description General information of database Database name tRNADB-CE Alter...CC BY-SA Detail Background and funding Name: MEXT Integrated Database Project Reference(s) Article title: tRNAD... 2009 Jan;37(Database issue):D163-8. External Links: Article title: tRNADB-CE 2011: tRNA gene database curat...n Download License Update History of This Database Site Policy | Contact Us Database Description - tRNADB-CE | LSDB Archive ...

  5. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2004-10-01

    Full Text Available Abstract Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into

  6. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

    Science.gov (United States)

    Wang, Chunlin; Lefkowitz, Elliot J

    2004-10-28

    Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. 
Used together
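    The query segmentation idea can be sketched in a few lines of Python. The example below splits a set of query sequences across workers with a greedy longest-first rule so that the total sequence length per worker is roughly even. This is only one plausible balancing heuristic under stated assumptions; the abstract does not describe SS-Wrapper's actual scheduling, and `segment_queries` and the query names are hypothetical.

```python
import heapq


def segment_queries(queries, n_workers):
    """Assign query names to workers, balancing total sequence length."""
    # Min-heap of (current load, worker index).
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(n_workers)]
    # Longest query first, each one to the currently least-loaded worker.
    ordered = sorted(queries.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        load, worker = heapq.heappop(heap)
        buckets[worker].append(name)
        heapq.heappush(heap, (load + len(seq), worker))
    return buckets


queries = {"q1": "A" * 100, "q2": "A" * 60, "q3": "A" * 50, "q4": "A" * 10}
buckets = segment_queries(queries, n_workers=2)
```

    Each bucket would then be written to its own query file and passed unchanged to the wrapped search program on one node, which is why the wrapped tool itself does not need to be modified.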

  7. Download - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  8. License - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  9. License - GETDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...se Database Description Download License Update History of This Database Site Policy | Contact Us License - GETDB | LSDB Archive ...

  10. Download - Metabolonote | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... Database Description Download License Update History of This Database Site Policy | Contact Us Download - Metabolonote | LSDB Archive ...

  11. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and range in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  12. DMPD: Activation of lymphokine genes in T cells: role of cis-acting DNA elements that respond to T cell activation signals. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available that respond to T cell activation signals. Arai N, Naito Y, Watanabe M, Masuda ES, Yamaguchi-Iwai Y, Tsuboi A, Heike T, Matsud... in T cells: role of cis-acting DNA elements that respond to T cell activation signals. Authors Arai N, Naito Y, Watanabe M, Masud...a ES, Yamaguchi-Iwai Y, Tsuboi A, Heike T, Matsuda I, Yokota

  13. Evaluation of Federated Searching Options for the School Library

    Science.gov (United States)

    Abercrombie, Sarah E.

    2008-01-01

    Three hosted federated search tools, Follett One Search, Gale PowerSearch Plus, and WebFeat Express, were configured and implemented in a school library. Databases from five vendors and the OPAC were systematically searched. Federated search results were compared with each other and to the results of the same searches in the database's native…

  14. The CAPEC Database

    DEFF Research Database (Denmark)

    Nielsen, Thomas Lund; Abildskov, Jens; Harper, Peter Mathias

    2001-01-01

    The Computer-Aided Process Engineering Center (CAPEC) database of measured data was established with the aim to promote greater data exchange in the chemical engineering community. The target properties are pure component properties, mixture properties, and special drug solubility data. The database divides pure component properties into primary, secondary, and functional properties. Mixture properties are categorized in terms of the number of components in the mixture and the number of phases present. The compounds in the database have been classified on the basis of the functional groups in the compound. This classification makes the CAPEC database a very useful tool, for example, in the development of new property models, since properties of chemically similar compounds are easily obtained. A program with efficient search and retrieval functions of properties has been developed.

  15. SoyDB: a knowledge database of soybean transcription factors

    Directory of Open Access Journals (Sweden)

    Valliyodan Babu

    2010-01-01

    Full Text Available Abstract Background Transcription factors play the crucial role of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB - a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.

  16. Beyond MEDLINE for literature searches.

    Science.gov (United States)

    Conn, Vicki S; Isaramalai, Sang-arun; Rath, Sabyasachi; Jantarakupt, Peeranuch; Wadhawan, Rohini; Dash, Yashodhara

    2003-01-01

    To describe strategies for a comprehensive literature search. MEDLINE searches result in limited numbers of studies that are often biased toward statistically significant findings. Diversified search strategies are needed. Empirical evidence about the recall and precision of diverse search strategies is presented. Challenges and strengths of each search strategy are identified. Search strategies vary in recall and precision. Often sensitivity and specificity are inversely related. Valuable search strategies include examination of multiple diverse computerized databases, ancestry searches, citation index searches, examination of research registries, journal hand searching, contact with the "invisible college," examination of abstracts, Internet searches, and contact with sources of synthesized information. Extending searches beyond MEDLINE enables researchers to conduct more systematic comprehensive searches.
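
    The recall and precision trade-off mentioned above can be made concrete with a small calculation. The sketch below is illustrative only: the study identifiers and the two result sets are invented for the example, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: share of retrieved items that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: four relevant studies exist (s1..s4); x* are irrelevant hits.
relevant = {"s1", "s2", "s3", "s4"}
single_database = ["s1", "s2", "x1"]
diversified = ["s1", "s2", "s3", "x1", "x2", "x3"]

p_single, r_single = precision_recall(single_database, relevant)
p_multi, r_multi = precision_recall(diversified, relevant)
```

    In this made-up case the diversified search finds more of the relevant studies (recall rises from 0.5 to 0.75) while returning proportionally more irrelevant ones (precision falls from about 0.67 to 0.5), which is the inverse relationship the abstract describes.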

  17. 5'-end sequences of budding yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available east_seq_qual.zip File URL: ftp://ftp.biosciencedbc.jp/archive/yeast_cdna/LATEST/...yeast_seq_qual.zip File size: 59.9MB Simple search URL http://togodb.biosciencedbc.jp/togodb/view/budding_yeast_cdna

  18. Main data - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...ftp://ftp.biosciencedbc.jp/archive/rmg/LATEST/rmg_main.zip File size: 1 KB Simple search URL http://togodb.b... This Database Database Description Download License Update History of This Database Site Policy | Contact Us Main data - RMG | LSDB Archive ...

  19. Alignment - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...e URL: ftp://ftp.biosciencedbc.jp/archive/sahg/LATEST/sahg_alignment.zip File size: 12.0 MB Simple search UR...s Database Database Description Download License Update History of This Database Site Policy | Contact Us Alignment - SAHG | LSDB Archive ...

  20. Locus - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...URL: ftp://ftp.biosciencedbc.jp/archive/astra/LATEST/astra_locus.zip File size: 887 KB Simple search URL htt...icing type (ex. cassette) About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Locus - ASTRA | LSDB Archive ...

  1. A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD.

    Science.gov (United States)

    Hendrich, Lars; Morinière, Jérôme; Haszprunar, Gerhard; Hebert, Paul D N; Hausmann, Axel; Köhler, Frank; Balke, Michael

    2015-07-01

    Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters. © 2014 John Wiley & Sons Ltd.

  2. Specialized microbial databases for inductive exploration of microbial genome sequences

    Directory of Open Access Journals (Sweden)

    Cabau Cédric

    2005-02-01

    Full Text Available Abstract Background The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore http://bioinfo.hku.hk/genochore.html, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis), associated with related organisms for comparison.

  3. License - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us FANTOM....0 International . If you use data from this database, please be sure attribute this database as follows: FANTOM...se Database Description Download License Update History of This Database Site Policy | Contact Us License - FANTOM5 | LSDB Archive ...

  4. Federal databases

    International Nuclear Information System (INIS)

    Welch, M.J.; Welles, B.W.

    1988-01-01

    Accident statistics on all modes of transportation are available as risk assessment analytical tools through several federal agencies. This paper reports on the examination of the accident databases by personal contact with the federal staff responsible for administration of the database programs. This activity, sponsored by the Department of Energy through Sandia National Laboratories, is an overview of the national accident data on highway, rail, air, and marine shipping. For each mode, the definition or reporting requirements of an accident are determined and the method of entering the accident data into the database is established. Availability of the database to others, ease of access, costs, and who to contact were prime questions to each of the database program managers. Additionally, how the agency uses the accident data was of major interest.

  5. License - Q-TARO | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...thout notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - Q-TARO | LSDB Archive ...

  6. Download - GenLibi | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...access [here]. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - GenLibi | LSDB Archive ...

  7. Sampling the potential energy surface of a DNA duplex damaged by a food carcinogen: Force field parameterization by ab initio quantum calculations and conformational searching using molecular mechanics computations

    Science.gov (United States)

    Wu, Xiangyang

    1999-07-01

    The heterocyclic amine 2-amino-3-methylimidazo[4,5-f]quinoline (IQ) is one of a number of carcinogens found in barbecued meat and fish. It induces tumors in mammals and is probably involved in human carcinogenesis, because of great exposure to such food carcinogens. IQ is biochemically activated to a derivative which reacts with DNA to form a covalent adduct. This adduct may deform the DNA and consequently cause a mutation, which may initiate carcinogenesis. To understand this cancer initiating event, it is necessary to obtain atomic resolution structures of the damaged DNA. No such structures are available experimentally due to synthesis difficulties. Therefore, we employ extensive molecular mechanics and dynamics calculations for this purpose. The major IQ-DNA adduct in the specific DNA sequence d(5'G1G2C G3CCA3') - d(5'TGGCGCC3') with IQ modified at G3 is studied. The d(5'G1G2C G3CC3') sequence has recently been shown to be a hot-spot for mutations when IQ modification is at G3. Although this sequence is prone to -2 deletions via a "slippage mechanism" even when unmodified, a key question is why IQ increases the mutation frequency of the unmodified DNA by about 10^4 fold. Is there a structural feature imposed by IQ that is responsible? The molecular mechanics and dynamics program AMBER for nucleic acids with the latest force field was chosen for this work. This force field has been demonstrated to reproduce well the B-DNA structure. However, some parameters (the partial charges, bond lengths and angles, and dihedral parameters of the modified residue) are not available in the AMBER database. We parameterized the force field using high level ab initio quantum calculations. We created 800 starting conformations which uniformly sampled in combination at 18° intervals three torsion angles that govern the IQ-DNA orientations, and energy minimized them. The most important structures are abnormal; the IQ-damaged guanine is rotated out of its standard B-DNA

  8. UNITE: a database providing web-based methods for the molecular identification of ectomycorrhizal fungi

    DEFF Research Database (Denmark)

    Köljalg, U.; Larsson, K.H.; Abarenkov, K.

    2005-01-01

    Identification of ectomycorrhizal (ECM) fungi is often achieved through comparisons of ribosomal DNA internal transcribed spacer (ITS) sequences with accessioned sequences deposited in public databases. A major problem encountered is that annotation of the sequences in these databases is not always... At present UNITE contains 758 ITS sequences from 455 species and 67 genera of ECM fungi. UNITE can be searched by taxon name, via sequence similarity using blastn, and via phylogenetic sequence identification using galaxie. Following implementation, galaxie performs a phylogenetic analysis of the query sequence after alignment either to pre-existing generic alignments, or to matches retrieved from a blast search on the UNITE data. It should be noted that the current version of UNITE is dedicated to the reliable identification of ECM fungi. The UNITE database is accessible through the URL http://unite.zbi.ee...

  9. GEMINI: a computationally-efficient search engine for large gene expression datasets.

    Science.gov (United States)

    DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick

    2016-02-24

    Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query - a text-based string - is mismatched with the form of the target - a genomic profile. To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an O(log n) expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
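
    A vantage-point tree of the kind named in this abstract can be sketched in a few lines. The code below is a generic textbook-style version and not GEMINI's implementation: the class name `VPTree`, the Euclidean metric, the random choice of vantage point, and the random test profiles are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPTree:
    """Minimal vantage-point tree for exact nearest-neighbor search."""

    def __init__(self, points, dist=euclidean, rng=None):
        self.dist = dist
        self.root = self._build(list(points), rng if rng is not None else random.Random(0))

    def _build(self, points, rng):
        if not points:
            return None
        # Choose a random vantage point and split the rest at the median distance.
        vp = points.pop(rng.randrange(len(points)))
        if not points:
            return (vp, 0.0, None, None)
        dists = [self.dist(vp, p) for p in points]
        mu = sorted(dists)[len(dists) // 2]
        inside = [p for p, d in zip(points, dists) if d < mu]
        outside = [p for p, d in zip(points, dists) if d >= mu]
        return (vp, mu, self._build(inside, rng), self._build(outside, rng))

    def nearest(self, query):
        best = [None, math.inf]  # [point, distance]

        def search(node):
            if node is None:
                return
            vp, mu, inside, outside = node
            d = self.dist(query, vp)
            if d < best[1]:
                best[0], best[1] = vp, d
            # Search the side containing the query first; visit the other side
            # only if the triangle inequality cannot rule it out.
            if d < mu:
                search(inside)
                if d + best[1] >= mu:
                    search(outside)
            else:
                search(outside)
                if d - best[1] < mu:
                    search(inside)

        search(self.root)
        return best[0], best[1]


# Random stand-ins for expression profiles (5 values each).
rng = random.Random(42)
profiles = [tuple(rng.random() for _ in range(5)) for _ in range(200)]
tree = VPTree(profiles)
query = tuple(rng.random() for _ in range(5))
neighbor, distance = tree.nearest(query)
```

    The pruning test is what allows a query to skip whole subtrees; how often it succeeds depends on the data and the metric, which is consistent with the abstract's qualification that the logarithmic query time holds in certain circumstances.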

  10. footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

    Science.gov (United States)

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2014-01-15

    Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with the proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.

  11. Database Replication

    CERN Document Server

    Kemme, Bettina

    2010-01-01

    Database replication is widely used for fault-tolerance, scalability and performance. The failure of one database replica does not stop the system from working as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed, if data copies are located close to clients. Despite its advantages, replication is not a straightforward technique to apply, and

  12. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    Science.gov (United States)

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  13. Refactoring databases evolutionary database design

    CERN Document Server

    Ambler, Scott W

    2006-01-01

    Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems. Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies. This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone databas...

  14. Search Engines for Tomorrow's Scholars

    Science.gov (United States)

    Fagan, Jody Condit

    2011-01-01

    Today's scholars face an outstanding array of choices when choosing search tools: Google Scholar, discipline-specific abstracts and index databases, library discovery tools, and more recently, Microsoft's re-launch of their academic search tool, now dubbed Microsoft Academic Search. What are these tools' strengths for the emerging needs of…

  15. MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for the Construction of Sequence Data Warehouses.

    Science.gov (United States)

    Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio

    2017-06-01

    The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.

  16. Update History of This Database - KEGG MEDICUS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available glish archive site is opened. 2010/10/01 KEGG MEDICUS ( http://www.kegg.jp/kegg/medicus/ ) is opened. About ...switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us KEGG MEDI...CUS Update History of This Database Date Update contents 2014/05/09 KEGG MEDICUS En...This Database Database Description Download License Update History of This Database Site Policy | Contact Us Update History of This Database - KEGG MEDICUS | LSDB Archive ...

  17. RDD Databases

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This database was established to oversee documents issued in support of fishery research activities including experimental fishing permits (EFP), letters of...

  18. Snowstorm Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Snowstorm Database is a collection of over 500 snowstorms dating back to 1900 and updated operationally. Only storms having large areas of heavy snowfall (10-20...

  19. Dealer Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The dealer reporting databases contain the primary data reported by federally permitted seafood dealers in the northeast. Electronic reporting was implemented May 1,...

  20. Tibetan Magmatism Database

    Science.gov (United States)

    Chapman, James B.; Kapp, Paul

    2017-11-01

    A database containing previously published geochronologic, geochemical, and isotopic data on Mesozoic to Quaternary igneous rocks in the Himalayan-Tibetan orogenic system is presented. The database is intended to serve as a repository for new and existing igneous rock data and is publicly accessible through a web-based platform that includes an interactive map and data table interface with search, filtering, and download options. To illustrate the utility of the database, the age, location, and εHf(t) composition of magmatism from the central Gangdese batholith in the southern Lhasa terrane are compared. The data identify three high-flux events, which peak at 93, 50, and 15 Ma. They are characterized by inboard arc migration and a temporal and spatial shift to more evolved isotopic compositions.

  1. Citation Searching: Search Smarter & Find More

    Science.gov (United States)

    Hammond, Chelsea C.; Brown, Stephanie Willen

    2008-01-01

    The staff at University of Connecticut are participating in Elsevier's Student Ambassador Program (SAmP) in which graduate students train their peers on "citation searching" research using Scopus and Web of Science, two tremendous citation databases. They are in the fourth semester of these training programs, and they are wildly successful: They…

  2. National database

    DEFF Research Database (Denmark)

    Kristensen, Helen Grundtvig; Stjernø, Henrik

    1995-01-01

    Article about a national database for nursing research established at the Danish Institute for Health and Nursing Research (Dansk Institut for Sundheds- og Sygeplejeforskning). The aim of the database is to gather knowledge about research and development activities within nursing.

  3. RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.

    Science.gov (United States)

    Field, Helen I; Fenyö, David; Beavis, Ronald C

    2002-01-01

    RADARS, a rapid, automated, data archiving and retrieval software system for high-throughput proteomic mass spectral data processing and storage, is described. The majority of mass spectrometer data files are compatible with RADARS, for consistent processing. The system automatically takes unprocessed data files, identifies proteins via in silico database searching, then stores the processed data and search results in a relational database suitable for customized reporting. The system is robust, used in 24/7 operation, accessible to multiple users of an intranet through a web browser, may be monitored by Virtual Private Network, and is secure. RADARS is scalable for use on one or many computers, and is suited to multiple processor systems. It can incorporate any local database in FASTA format, and can search protein and DNA databases online. A key feature is a suite of visualisation tools (many available gratis), allowing facile manipulation of spectra, by hand annotation, reanalysis, and access to all procedures. We also describe the use of Sonar MS/MS, a novel, rapid search engine requiring 40 MB RAM per process for searches against a genomic or EST database translated in all six reading frames. RADARS reduces the cost of analysis through its efficient algorithms: Sonar MS/MS can identify proteins without accurate knowledge of the parent ion mass and without protein tags. Statistical scoring methods provide close-to-expert accuracy and bring robust data analysis to the non-expert user.
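
    Translating a DNA database "in all six reading frames", as mentioned above, means translating three offsets on the forward strand and three on the reverse complement. The sketch below shows that step only; it is not Sonar MS/MS or RADARS code. It assumes the standard genetic code, and the function names and the example sequence are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

    For the toy input, frame +1 reads ATG GCC TAA and gives "MA*", while frame -1 reads the reverse complement TTAGGCCAT and gives "LGH". A search engine would then match spectra against peptides derived from all six translations.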

  4. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…

  5. Possible use of fuzzy logic in database

    Directory of Open Access Journals (Sweden)

    Vaclav Bezdek

    2011-04-01

    Full Text Available The article deals with fuzzy logic and its possible use in database systems. First, the fuzzy style of thinking is illustrated with a simple example. Next, the advantages of the fuzzy approach to database searching are considered using a database of used cars in the Czech Republic.
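To illustrate the kind of fuzzy database search the abstract above describes, here is a minimal sketch. The car records, the price and mileage thresholds, and the function names are all hypothetical and are not taken from the article; the sketch only assumes linear membership functions combined with the minimum operator as fuzzy AND.

```python
def descending_ramp(x, full, zero):
    """Membership is 1.0 at or below `full` and falls linearly to 0.0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


# Hypothetical used-car records (price in CZK, mileage in km).
cars = [
    {"id": "A", "price": 90_000, "km": 60_000},
    {"id": "B", "price": 150_000, "km": 40_000},
    {"id": "C", "price": 260_000, "km": 20_000},
]


def fuzzy_match(car):
    """Degree to which a car is 'cheap AND low mileage' (min as fuzzy AND)."""
    cheap = descending_ramp(car["price"], full=100_000, zero=250_000)
    low_km = descending_ramp(car["km"], full=50_000, zero=150_000)
    return min(cheap, low_km)


def fuzzy_search(rows, threshold=0.5):
    """Return (degree, id) pairs at or above the threshold, best match first."""
    scored = [(fuzzy_match(row), row["id"]) for row in rows]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


def crisp_search(rows):
    """Classical query with hard limits, shown for comparison."""
    return [r["id"] for r in rows if r["price"] <= 100_000 and r["km"] <= 50_000]


results = fuzzy_search(cars)
```

With these made-up records the crisp query returns nothing, because each car narrowly misses one hard limit, whereas the fuzzy query still ranks the near misses A and B.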

  6. Search Help

    Science.gov (United States)

    Guidance and search help resource listing examples of common queries that can be used in the Google Search Appliance search request, including examples of special characters or query term separators that Google Search Appliance recognizes.

  7. Experiment Databases

    Science.gov (United States)

    Vanschoren, Joaquin; Blockeel, Hendrik

    Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.

  8. Update History of This Database - PLACE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us PLACE Update History of This Database Date Update contents 2016/08/22 The contact address is...s Database Database Description Download License Update History of Thi...s Database Site Policy | Contact Us Update History of This Database - PLACE | LSDB Archive ... ... changed. 2014/10/20 The URLs of the database maintenance site and the portal site are changed. 2014/07/17 PLACE English archi

  9. GRIP Database original data - GRIPDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us GRI...PDB GRIP Database original data Data detail Data name GRIP Database original data DOI 10....18908/lsdba.nbdc01665-006 Description of data contents GRIP Database original data It consists of data table...s and sequences. Data file File name: gripdb_original_data.zip File URL: ftp://ftp.biosciencedbc.jp/archive/gripdb/LATEST/gri...e Database Description Download License Update History of This Database Site Policy | Contact Us GRIP Database original data - GRIPDB | LSDB Archive ...

  10. Database Dump - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us fRNAdb Database Dump Data detail Data name Database Dump DOI 10.18908/lsdba.nbdc00452-002 De... data (tab separeted text) Data file File name: Database_Dump File URL: ftp://ftp....biosciencedbc.jp/archive/frnadb/LATEST/Database_Dump File size: 673 MB Simple search URL - Data acquisition...s. Data analysis method - Number of data entries 4 files - About This Database Database Description Download... License Update History of This Database Site Policy | Contact Us Database Dump - fRNAdb | LSDB Archive ...

  11. DNA Microarray Technology

    Science.gov (United States)

    Skip to main content DNA Microarray Technology Enter Search Term(s): Español Research Funding An Overview Bioinformatics Current Grants Education and Training Funding Extramural Research News Features Funding Divisions Funding ...

  12. SinEx DB: a database for single exon coding sequences in mammalian genomes.

    Science.gov (United States)

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.

  13. Generation and analysis of a large-scale expressed sequence Tag database from a full-length enriched cDNA library of developing leaves of Gossypium hirsutum L.

    Directory of Open Access Journals (Sweden)

    Min Lin

    Full Text Available BACKGROUND: Cotton (Gossypium hirsutum L.) is one of the world's most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. METHODOLOGY/PRINCIPAL FINDINGS: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. CONCLUSIONS/SIGNIFICANCE: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence

  14. INIS: Manual for online retrieval from the INIS Database on the Internet

    International Nuclear Information System (INIS)

    2000-01-01

    This manual demonstrates the different Search Forms available to retrieve relevant records using the INIS Database online retrieval system. Information on how to search, how to store, refine and retrieve searches, and how to update a literature search is given

  15. INIS: Manual for online retrieval from the INIS Database on the Internet

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-10-01

    This manual demonstrates the different Search Forms available to retrieve relevant records using the INIS Database online retrieval system. Information on how to search, how to store, refine and retrieve searches, and how to update a literature search is given.

  16. The Impact of Online Bibliographic Databases on Teaching and Research in Political Science.

    Science.gov (United States)

    Reichel, Mary

    The availability of online bibliographic databases greatly facilitates literature searching in political science. The advantages to searching databases online include combination of concepts, comprehensiveness, multiple database searching, free-text searching, currency, current awareness services, document delivery service, and convenience.…

  17. Subject search study. Final report

    International Nuclear Information System (INIS)

    Todeschini, C.

    1995-01-01

    The study gathered information on how users search the database of the International Nuclear Information System (INIS), using indicators such as Subject Categories, Controlled terms, Subject Headings, Free-text words, and combinations of the above. Users participated from the Australian, French, Russian and Spanish INIS Centres, which have different national languages. Participants, both intermediaries and end users, replied to a questionnaire and executed search queries. The INIS Secretariat at the IAEA also participated. A protocol of all search strategies used in actual searches in the database was kept. Among Russian and Spanish users, both the thought process and the initial search formulation are predominantly non-English, whereas the initial formulation tends to be more in English among French users. A total of 1002 searches were executed by the five INIS centres including the IAEA. The search protocols indicate the following search behaviour: 1) free text words represent about 40% of search points on an average query; 2) descriptors used as search keys have the widest range as percentage of search points, from a low of 25% to a high of 48%; 3) search keys consisting of free text that coincides with a descriptor account for about 15% of search points; 4) Subject Categories are not used in many searches; 5) free text words are present as search points in about 80% of all searches; 6) controlled terms (descriptors) are used very extensively and appear in about 90% of all searches; 7) Subject Headings were used in only a few percent of searches. From the results of the study one can conclude that non-native English speakers are more reluctant to initiate their searches with free text words. Also: Subject Categories are little used in searching the database; both free text terms and controlled terms are the predominant types of search keys used, whereby the controlled terms are used more

  18. Flat Files - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... Data file File name: jsnp_flat_files File URL: ftp://ftp.biosciencedbc.jp/archiv...his Database Database Description Download License Update History of This Database Site Policy | Contact Us Flat Files - JSNP | LSDB Archive ...

  19. Reference - PLACE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...ailable. Data file File name: place_reference.zip File URL: ftp://ftp.biosciencedbc.jp/archive/place/LATEST/...ber About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Reference - PLACE | LSDB Archive ...

  20. License - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... might be changed without notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - AT Atlas | LSDB Archive ...

  1. Protein - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ..._protein.zip File URL: ftp://ftp.biosciencedbc.jp/archive/at_atlas/LATEST/at_atla...About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Protein - AT Atlas | LSDB Archive ...

  2. Mapping data - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...tional Rice Genome Sequencing Project (IRGSP) Data file File name: kome_mapping_data.zip File URL: ftp://ftp.biosciencedbc.jp/archiv...(Transcriptional Unit) About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Mapping data - KOME | LSDB Archive ...

  3. Download - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data .... If it is, access [here]. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - RPD | LSDB Archive ...

  4. License - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...ts might be changed without notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - RED | LSDB Archive ...

  5. License - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... might be changed without notice. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us License - TP Atlas | LSDB Archive ...

  6. Exon - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...ontents Exons in variants Data file File name: astra_exon.zip File URL: ftp://ftp.biosciencedbc.jp/archive/a... About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Exon - ASTRA | LSDB Archive ...

  7. Download - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data .... If it is, access [here]. About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - JSNP | LSDB Archive ...

  8. ORF information - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ... File URL: ftp://ftp.biosciencedbc.jp/archive/kome/LATEST/kome_orf_infomation.zip File size: 526 KB Simple s...ut This Database Database Description Download License Update History of This Database Site Policy | Contact Us ORF information - KOME | LSDB Archive ...

  9. Download - Plabrain DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us Plabrain...s Database Database Description Download License Update History of This Database Site Policy | Contact Us Download - Plabrain DB | LSDB Archive ...

  10. ElasticSearch server

    CERN Document Server

    Rogozinski, Marek

    2014-01-01

    This book is a detailed, practical, hands-on guide packed with real-life scenarios and examples which will show you how to implement an ElasticSearch search engine on your own websites. If you are a web developer or a user who wants to learn more about ElasticSearch, then this is the book for you. You do not need to know anything about ElasticSearch, Java, or Apache Lucene in order to use this book, though basic knowledge about databases and queries is required.

  11. EST data - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...st.zip File URL: ftp://ftp.biosciencedbc.jp/archive/red/LATEST/red_est.zip File size: 629 KB Simple search U...ase Database Description Download License Update History of This Database Site Policy | Contact Us EST data - RED | LSDB Archive ...

  12. Signaling pathways in a Citrus EST database

    Directory of Open Access Journals (Sweden)

    Angela Mehta

    2007-01-01

    Full Text Available Citrus spp. are economically important crops, which in Brazil are grown mainly in the State of São Paulo. Citrus cultures are attacked by several pathogens, causing severe yield losses. To better understand this culture, the Millenium Project (IAC Cordeirópolis) was launched to sequence Citrus ESTs (expressed sequence tags) from different tissues, including leaf, bark, fruit, root and flower. Plants were submitted to biotic and abiotic stresses and investigated at different developmental stages (adult vs. juvenile). Several cDNA libraries were constructed and the sequences obtained formed the Citrus EST database with almost 200,000 sequences. Searches were performed in the Citrus database to investigate the presence of different signaling pathway components. Several of the genes involved in the signaling of sugar, calcium, cytokinin, plant hormones, inositol phosphate, MAPKinase and COP9 were found in the citrus genome and are discussed in this paper. The results obtained may indicate that mechanisms similar to those described in other plants, such as Arabidopsis, occur in citrus. Further experimental studies must be conducted in order to understand the different signaling pathways present.

  13. Geologic Field Database

    Directory of Open Access Journals (Sweden)

    Katarina Hribernik

    2002-12-01

    Full Text Available The purpose of the paper is to present the relational database of field data, which was compiled from data gathered during thirty years of fieldwork on the Basic Geologic Map of Slovenia in scale 1:100.000. The database was created using MS Access software. The MS Access environment ensures its stability and effective operation despite changing, searching, and updating the data. It also enables faster, easier and more user-friendly access to the field data. Last but not least, in the long term, with the data transferred into the GIS environment, it will provide the basis for a sound geologic information system that will satisfy a broad spectrum of geologists’ needs.

  14. Database on wind characteristics

    Energy Technology Data Exchange (ETDEWEB)

    Hansen, K.S. [The Technical Univ. of Denmark (Denmark); Courtney, M.S. [Risoe National Lab., (Denmark)

    1999-08-01

    The organisations that participated in the project consist of five research organisations: MIUU (Sweden), ECN (The Netherlands), CRES (Greece), DTU (Denmark), Risoe (Denmark) and one wind turbine manufacturer: Vestas Wind System A/S (Denmark). The overall goal was to build a database consisting of a large number of wind speed time series and to create tools for efficiently searching through the data to select interesting data. The project resulted in a database located at DTU, Denmark with online access through the Internet. The database contains more than 50.000 hours of wind speed measurements. A wide range of wind climates and terrain types are represented with significant amounts of time series. Data have been chosen selectively with a deliberate over-representation of high wind and complex terrain cases. This makes the database ideal for wind turbine design needs but completely unsuitable for resource studies. Diversity has also been an important aim and this is realised with data from a large range of terrain types; everything from offshore to mountain, from Norway to Greece. (EHS)

  15. Intermittent search strategies

    Science.gov (United States)

    Bénichou, O.; Loverdo, C.; Moreau, M.; Voituriez, R.

    2011-01-01

    This review examines intermittent target search strategies, which combine phases of slow motion, allowing the searcher to detect the target, and phases of fast motion during which targets cannot be detected. It is first shown that intermittent search strategies are actually widely observed at various scales. At the macroscopic scale, this is, for example, the case of animals looking for food; at the microscopic scale, intermittent transport patterns are involved in a reaction pathway of DNA-binding proteins as well as in intracellular transport. Second, generic stochastic models are introduced, which show that intermittent strategies are efficient strategies that enable the minimization of search time. This suggests that the intrinsic efficiency of intermittent search strategies could justify their frequent observation in nature. Last, beyond these modeling aspects, it is proposed that intermittent strategies could also be used in a broader context to design and accelerate search processes.

  16. Nuclear database management systems

    International Nuclear Information System (INIS)

    Stone, C.; Sutton, R.

    1996-01-01

    The authors are developing software tools for accessing and visualizing nuclear data. MacNuclide was the first software application produced by their group. This application incorporates novel database management and visualization tools into an intuitive interface. The nuclide chart is used to access properties and to display results of searches. Selecting a nuclide in the chart displays a level scheme with tables of basic, radioactive decay, and other properties. All level schemes are interactive, allowing the user to modify the display, move between nuclides, and display entire daughter decay chains

  17. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    Energy Technology Data Exchange (ETDEWEB)

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is as diverse as the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  18. Update History of This Database - DGBY | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us DGBY Update History of This Database Date Update contents 2014/10/20 The URL of the portal s...aro.affrc.go.jp/yakudachi/yeast/index.html ) is opened. About This Database Database Description Download License Update Hi...story of This Database Site Policy | Contact Us Update History of This Database - DGBY | LSDB Archive ... ... Expression of attribution in License is updated. 2012/03/08 DGBY English archive site is opened. 2006/10/02

  19. Update History of This Database - Q-TARO | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us Q-TARO Update History of This Database Date Update contents 2014/10/20 The URL of the portal...ption Download License Update History of This Database Site Policy | Contact Us Update History of This Database - Q-TARO | LSDB Archive ... ... site is changed. 2013/12/17 The URL of the portal site is changed. 2013/12/13 Q-TARO English archive site i...s opened. 2009/11/15 Q-TARO ( http://qtaro.abr.affrc.go.jp/ ) is opened. About This Database Database Descri

  20. Update History of This Database - TogoTV | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us TogoTV Update History of This Database Date Update contents 2017/05/12 TogoTV English archiv...ription Download License Update History of This Database Site Policy | Contact Us Update History of This Database - TogoTV | LSDB Archive ... ...e site is opened. 2007/07/20 TogoTV ( http://togotv.dbcls.jp/ ) is opened. About This Database Database Desc

  1. Update History of This Database - ConfC | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us ConfC Update History of This Database Date Update contents 2016/09/20 ConfC English archive ...tion Download License Update History of This Database Site Policy | Contact Us Update History of This Database - ConfC | LSDB Archive ... ...site is opened. 2005/05/01 ConfC (http://mbs.cbrc.jp/ConfC/) is opened. About This Database Database Descrip

  2. Update History of This Database - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us TP Atlas Update History of This Database Date Update contents 2013/12/16 The email address i...s ( http://www.tanpaku.org/tpatlas/ ) is opened. About This Database Database Description Download License Update History of Thi...s Database Site Policy | Contact Us Update History of This Database - TP Atlas | LSDB Archive ... ...n the contact information is corrected. 2013/11/19 TP Atlas English archive site is opened. 2008/4/1 TP Atla

  3. Update History of This Database - fRNAdb | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us fRNAdb Update History of This Database Date Update contents 2016/03/29 fRNAdb English archiv...on Download License Update History of This Database Site Policy | Contact Us Update History of This Database - fRNAdb | LSDB Archive ... ...e site is opened. 2006/12 fRNAdb ( http://www.ncrna.org/ ) is opened. About This Database Database Descripti

  4. Update History of This Database - AcEST | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us AcEST Update History of This Database Date Update contents 2013/01/10 Errors found on AcEST ...s Database Database Description Download License Update History of This Data...base Site Policy | Contact Us Update History of This Database - AcEST | LSDB Archive ... ...Conting data have been correceted. For details, please refer to the following page. Data correction 2010/03/29 AcEST English archi

  5. DPTEdb, an integrative database of transposable elements in dioecious plants.

    Science.gov (United States)

    Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gu, Lian-Feng; Gao, Wu-Jun

    2016-01-01

    Dioecious plants usually harbor 'young' sex chromosomes, providing an opportunity to study the early stages of sex chromosome evolution. Transposable elements (TEs) are mobile DNA elements frequently found in plants and are suggested to play important roles in plant sex chromosome evolution. The genomes of several dioecious plants have been sequenced, offering an opportunity to annotate and mine the TE data. However, comprehensive and unified annotation of TEs in these dioecious plants is still lacking. In this study, we constructed a dioecious plant transposable element database (DPTEdb). DPTEdb is a specific, comprehensive and unified relational database and web interface. We used a combination of de novo, structure-based and homology-based approaches to identify TEs from the genome assemblies of previously published data, as well as our own. The database currently integrates eight dioecious plant species and a total of 31 340 TEs along with classification information. DPTEdb provides user-friendly web interfaces to browse, search and download the TE sequences in the database. Users can also use tools, including BLAST, GetORF, HMMER, Cut sequence and JBrowse, to analyze TE data. Given the role of TEs in plant sex chromosome evolution, the database will contribute to the investigation of TEs in structural, functional and evolutionary dynamics of the genome of dioecious plants. In addition, the database will supplement the research of sex diversification and sex chromosome evolution of dioecious plants. Database URL: http://genedenovoweb.ticp.net:81/DPTEdb/index.php. © The Author(s) 2016. Published by Oxford University Press.

  6. Classical databases and knowledge organization

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2015-01-01

    This paper considers classical bibliographic databases based on the Boolean retrieval model (such as MEDLINE and PsycInfo). This model is challenged by modern search engines and information retrieval (IR) researchers, who often consider Boolean retrieval a less efficient approach. The paper...

  7. Ancient DNA

    DEFF Research Database (Denmark)

    Willerslev, Eske; Cooper, Alan

    2004-01-01

    ancient DNA, palaeontology, palaeoecology, archaeology, population genetics, DNA damage and repair

  8. Analysis of newly established EST databases reveals similarities between heart regeneration in newt and fish

    Directory of Open Access Journals (Sweden)

    Weis Patrick

    2010-01-01

    Full Text Available Abstract Background The newt Notophthalmus viridescens possesses the remarkable ability to respond to cardiac damage by formation of new myocardial tissue. Surprisingly little is known about changes in gene activities that occur during the course of regeneration. To begin to decipher the molecular processes that underlie restoration of functional cardiac tissue, we generated an EST database from regenerating newt hearts and compared the transcriptional profile of selected candidates with genes deregulated during zebrafish heart regeneration. Results A cDNA library of 100,000 cDNA clones was generated from newt hearts 14 days after ventricular injury. Sequencing of 11520 cDNA clones resulted in 2894 assembled contigs. BLAST searches revealed 1695 sequences with potential homology to sequences from the NCBI database. BLAST searches to TrEMBL and Swiss-Prot databases assigned 1116 proteins to Gene Ontology terms. We also identified a relatively large set of 174 ORFs, which are likely to be unique for urodele amphibians. Expression analysis of newt-zebrafish homologues confirmed the deregulation of selected genes during heart regeneration. Sequences, BLAST results and GO annotations were visualized in a relational web based database followed by grouping of identified proteins into clusters of GO Terms. Comparison of data from regenerating zebrafish hearts identified biological processes, which were uniformly overrepresented during cardiac regeneration in newt and zebrafish. Conclusion We concluded that heart regeneration in newts and zebrafish led to the activation of similar sets of genes, which suggests that heart regeneration in both species might follow similar principles. The design of the newly established newt EST database allows identification of molecular pathways important for heart regeneration.

  9. HOLLYWOOD: a comparative relational database of alternative splicing.

    Science.gov (United States)

    Holste, Dirk; Huo, George; Tung, Vivian; Burge, Christopher B

    2006-01-01

    RNA splicing is an essential step in gene expression, and is often variable, giving rise to multiple alternatively spliced mRNA and protein isoforms from a single gene locus. The design of effective databases to support experimental and computational investigations of alternative splicing (AS) is a significant challenge. In an effort to integrate accurate exon and splice site annotation with current knowledge about splicing regulatory elements and predicted AS events, and to link information about the splicing of orthologous genes in different species, we have developed the Hollywood system. This database was built upon genomic annotation of splicing patterns of known genes derived from spliced alignment of complementary DNAs (cDNAs) and expressed sequence tags, and links features such as splice site sequence and strength, exonic splicing enhancers and silencers, conserved and non-conserved patterns of splicing, and cDNA library information for inferred alternative exons. Hollywood was implemented as a relational database and currently contains comprehensive information for human and mouse. It is accompanied by a web query tool that allows searches for sets of exons with specific splicing characteristics or splicing regulatory element composition, or gives a graphical or sequence-level summary of splicing patterns for a specific gene. A streamlined graphical representation of gene splicing patterns is provided, and these patterns can alternatively be layered onto existing information in the UCSC Genome Browser. The database is accessible at http://hollywood.mit.edu.

  10. DNA barcoding in the media: does coverage of cool science reflect its social context?

    Science.gov (United States)

    Geary, Janis; Camicioli, Emma; Bubela, Tania

    2016-09-01

    Paul Hebert and colleagues first described DNA barcoding in 2003, which led to international efforts to promote and coordinate its use. Since its inception, DNA barcoding has generated considerable media coverage. We analysed whether this coverage reflected both the scientific and social mandates of international barcoding organizations. We searched newspaper databases to identify 900 English-language articles from 2003 to 2013. Coverage of the science of DNA barcoding was highly positive but lacked context for key topics. Coverage omissions pose challenges for public understanding of the science and applications of DNA barcoding; these included coverage of governance structures and issues related to the sharing of genetic resources across national borders. Our analysis provided insight into how barcoding communication efforts have translated into media coverage; more targeted communication efforts may focus media attention on previously omitted, but important topics. Our analysis is timely as the DNA barcoding community works to establish the International Society for the Barcode of Life.

  11. DNA and Law Enforcement in the European Union: Tools and Human Rights Protection

    Directory of Open Access Journals (Sweden)

    Helena Soleto Muñoz

    2014-01-01

    Full Text Available Since its first successful use in criminal investigations in the 1980s, DNA has become a widely used and valuable tool to identify offenders and to acquit innocent persons. For a more beneficial use of the DNA-related data possessed, the Council of the European Union adopted Council Decisions 2008/615 and 2008/616 establishing a mechanism for a direct automated search in national EU Member States’ DNA databases. The article reveals the complications associated with the regulation on the use of DNA for criminal investigations as it is regulated by both EU and national legislation which results in a great deal of variations. It also analyses possible violations of and limitations to human rights when collecting DNA samples, as well as their analysis, use and storage.

  12. Grantees Guide to Research Databases at IDRC

    International Development Research Centre (IDRC) Digital Library (Canada)

    7. Creating search alerts. 8. IDRC Digital Library (IDL). 9. Key contacts. Commercial databases conditions of use. These resources are governed by license agreements which restrict use to IDRC employees and grantees taking ...

  13. Heuristic Search Theory and Applications

    CERN Document Server

    Edelkamp, Stefan

    2011-01-01

    Search has been vital to artificial intelligence from the very beginning as a core technique in problem solving. The authors present a thorough overview of heuristic search with a balance of discussion between theoretical analysis and efficient implementation and application to real-world problems. Current developments in search such as pattern databases and search with efficient use of external memory and parallel processing units on main boards and graphics cards are detailed. Heuristic search as a problem solving tool is demonstrated in applications for puzzle solving, game playing, constra

  14. Random searching

    International Nuclear Information System (INIS)

    Shlesinger, Michael F

    2009-01-01

    There are a wide variety of searching problems from molecules seeking receptor sites to predators seeking prey. The optimal search strategy can depend on constraints on time, energy, supplies or other variables. We discuss a number of cases and especially remark on the usefulness of Levy walk search patterns when the targets of the search are scarce.

  15. Online Searching at 9600 Baud.

    Science.gov (United States)

    Scott, Ralph Lee; Scott, Nancy Sue Schell

    1991-01-01

    Discusses online searching with the new 9600 baud rate and describes test searches on the BRS, Data-Star, LEXIS, and Dow Jones databases through three packet-switching networks: US Sprint, MEADNET, and TYMNET. Hardware and software are described, data communications equipment standards are discussed, and the impact of higher baud rates on pricing…

  16. Search Patterns

    CERN Document Server

    Morville, Peter

    2010-01-01

    What people are saying about Search Patterns "Search Patterns is a delight to read -- very thoughtful and thought provoking. It's the most comprehensive survey of designing effective search experiences I've seen." --Irene Au, Director of User Experience, Google "I love this book! Thanks to Peter and Jeffery, I now know that search (yes, boring old yucky who cares search) is one of the coolest ways around of looking at the world." --Dan Roam, author, The Back of the Napkin (Portfolio Hardcover) "Search Patterns is a playful guide to the practical concerns of search interface design. It cont

  17. In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq

    Directory of Open Access Journals (Sweden)

    Olivieri Anna

    2011-10-01

    Full Text Available Abstract Background For millennia, the southern part of Mesopotamia has been a wetland region generated by the Tigris and Euphrates rivers before flowing into the Gulf. This area has been occupied by human communities since ancient times and the present-day inhabitants, the Marsh Arabs, are considered the population with the strongest link to ancient Sumerians. Popular tradition, however, considers the Marsh Arabs as a foreign group, of unknown origin, which arrived in the marshlands when the rearing of water buffalo was introduced to the region. Results To shed some light on the paternal and maternal origin of this population, Y chromosome and mitochondrial DNA (mtDNA) variation was surveyed in 143 Marsh Arabs and in a large sample of Iraqi controls. Analyses of the haplogroups and sub-haplogroups observed in the Marsh Arabs revealed a prevalent autochthonous Middle Eastern component for both male and female gene pools, with weak South-West Asian and African contributions, more evident in mtDNA. A higher male than female homogeneity is characteristic of the Marsh Arab gene pool, likely due to a strong male genetic drift determined by socio-cultural factors (patrilocality, polygamy, unequal male and female migration rates). Conclusions Evidence of genetic stratification ascribable to the Sumerian development was provided by the Y-chromosome data where the J1-Page08 branch reveals a local expansion, almost contemporary with the Sumerian City State period that characterized Southern Mesopotamia. On the other hand, a more ancient background shared with Northern Mesopotamia is revealed by the less represented Y-chromosome lineage J1-M267*. Overall our results indicate that the introduction of water buffalo breeding and rice farming, most likely from the Indian sub-continent, only marginally affected the gene pool of autochthonous people of the region. Furthermore, a prevalent Middle Eastern ancestry of the modern population of the marshes of

  18. Stackfile Database

    Science.gov (United States)

    deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

    2013-01-01

    This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., Gravity Recovery And Climate Experiment -- GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

  19. Download - DGBY | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  20. License - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  1. Download - RPSD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  2. Download - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  3. License - DMPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  4. Literature searches on Ayurveda: An update.

    Science.gov (United States)

    Aggithaya, Madhur G; Narahari, Saravu R

    2015-01-01

    The journals that publish on Ayurveda are increasingly indexed by popular medical databases in recent years. However, many Eastern journals are not indexed in biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, authors identified 46 databases that can be used for systematic search of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. To update on Ayurveda literature search and strategy to retrieve maximum publications. The author used psoriasis as an example to search previously listed databases and identify new ones. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of previous databases were assessed. Eight search strategies were developed. One hundred and five journals, both biomedical and Ayurveda, which publish on Ayurveda, were identified. Variability in databases was explored to identify bias in journal citation. Five among 46 databases are now relevant - AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows complex search strategy. "The Researches in Ayurveda" and "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and four other databases because all five carry citations from different groups of journals. The hand

  5. Searching for a stock structure in Sardina pilchardus from the Adriatic and Ionian seas using a microsatellite DNA-based approach

    Directory of Open Access Journals (Sweden)

    Paolo Ruggeri

    2013-10-01

    Full Text Available In the present study the genetic variability of European sardine from Adriatic and Ionian seas was investigated in order to detect the occurrence of genetic structure within and between these basins. In several samples the analysis of genetic variability at eight microsatellite loci showed a number of homozygote individuals higher than expected at Hardy-Weinberg equilibrium. The inter-population differentiation level estimated by AMOVA, θST and ρRST and Bayesian descriptors detected no signs of population differentiation between the samples analysed. These results are consistent with previous studies based on allozymes and several mitochondrial DNA markers and add further evidence contradicting the early identification, based on morphological and reproductive data, of two sub-populations in the Adriatic Sea.
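
    The comparison of observed and expected homozygosity mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis: the function names and the toy genotypes are hypothetical, and it only assumes the textbook Hardy-Weinberg expectation that the expected homozygote fraction at a locus equals the sum of squared allele frequencies.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]  # two allele sizes at one microsatellite locus (hypothetical encoding)


def allele_frequencies(genotypes: List[Genotype]) -> Dict[int, float]:
    """Estimate allele frequencies by counting both alleles of every individual."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_homozygosity(genotypes: List[Genotype]) -> float:
    """Hardy-Weinberg expectation: sum of squared allele frequencies."""
    return sum(p * p for p in allele_frequencies(genotypes).values())


def observed_homozygosity(genotypes: List[Genotype]) -> float:
    """Fraction of individuals carrying two identical alleles."""
    return sum(a == b for a, b in genotypes) / len(genotypes)


# Toy data (invented for illustration): every individual is a homozygote.
sample = [(100, 100), (100, 100), (102, 102), (102, 102)]
print(observed_homozygosity(sample), expected_homozygosity(sample))
```

    An observed value well above the expected one, as in the toy sample, is the kind of homozygote excess the abstract reports; a real study would also test whether the difference is statistically significant.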

  6. License - Plabrain DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available License: ... Alike 2.1 Japan. If you use data from this database, please be sure to attribute this database as follows: Plabrain... of this database (http://dbarchive.lifesciencedb.jp/english/en/plabrain-db/desc.html) in the article or pape...

  7. Download - eSOL | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  8. SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

    Science.gov (United States)

    Couvin, David; Zozio, Thierry; Rastogi, Nalin

    2017-07-01

    Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98,049 clustered strains plus 5,807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
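
    To make the idea of comparing 43-spacer presence/absence patterns concrete, here is a small Python sketch. It is an illustration only and not the SpolSimilaritySearch algorithm, which the abstract does not specify: the Hamming-distance ranking, the function names, and the toy patterns are all assumptions.

```python
from typing import Dict, List, Tuple

N_SPACERS = 43  # spacer count stated in the abstract


def parse_pattern(pattern: str) -> Tuple[bool, ...]:
    """Parse a 43-character string of '1' (spacer present) and '0' (spacer absent)."""
    if len(pattern) != N_SPACERS or set(pattern) - {"0", "1"}:
        raise ValueError("expected a 43-character string of 0s and 1s")
    return tuple(ch == "1" for ch in pattern)


def hamming(a: Tuple[bool, ...], b: Tuple[bool, ...]) -> int:
    """Number of spacer positions at which two patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def longest_deleted_block(pattern: Tuple[bool, ...]) -> int:
    """Length of the longest run of consecutive absent spacers."""
    longest = run = 0
    for present in pattern:
        run = 0 if present else run + 1
        longest = max(longest, run)
    return longest


def most_similar(query: str, database: Dict[str, str],
                 max_distance: int = 2) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs within max_distance, closest first."""
    q = parse_pattern(query)
    hits = [(name, hamming(q, parse_pattern(pat))) for name, pat in database.items()]
    return sorted((h for h in hits if h[1] <= max_distance),
                  key=lambda h: (h[1], h[0]))


# Toy database with invented pattern names.
toy_db = {"A": "1" * 43, "B": "1" * 40 + "000", "C": "0" * 43, "D": "1" * 42 + "0"}
print(most_similar("1" * 43, toy_db, max_distance=3))
```

    Hamming distance treats every spacer independently; a tool that models deletions of contiguous blocks as single events would rank patterns differently, which is why the block-length helper is kept separate.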

  9. Usability of some databases for information services in Czechoslovak nuclear programme

    International Nuclear Information System (INIS)

    Kakos, A.

    1988-01-01

    The contents were compared of the databases Chemical Abstracts Search, World Patent Index, Excerpta Medica, Inspec and Compendex with INIS, with regard to possible completing of INIS searches with searches in these other databases. On the basis of the results of test searches made in all said databases on selected topics falling under the INIS scope, concrete cases were determined when INIS searches should be completed with data in some of the other databases. The contents analysis method is described with regard to the concrete search topics and areas are given of the overlapping of the databases with INIS. Numerical results are given. (J.B.). 2 tabs

  10. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA.

  11. Developing an Inhouse Database from Online Sources.

    Science.gov (United States)

    Smith-Cohen, Deborah

    1993-01-01

    Describes the development of an in-house bibliographic database by the U.S. Army Corps of Engineers Cold Regions Research and Engineering Laboratory on arctic wetlands research. Topics discussed include planning; identifying relevant search terms and commercial online databases; downloading citations; criteria for software selection; management…

  12. Database Resources of the BIG Data Center in 2018.

    Science.gov (United States)

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Extending Database Integration Technology

    National Research Council Canada - National Science Library

    Buneman, Peter

    1999-01-01

    Formal approaches to the semantics of databases and database languages can have immediate and practical consequences in extending database integration technologies to include a vastly greater range...

  14. The GLIMS Glacier Database

    Science.gov (United States)

    Raup, B. H.; Khalsa, S. S.; Armstrong, R.

    2007-12-01

    The Global Land Ice Measurements from Space (GLIMS) project has built a geospatial and temporal database of glacier data, composed of glacier outlines and various scalar attributes. These data are being derived primarily from satellite imagery, such as from ASTER and Landsat. Each "snapshot" of a glacier is from a specific time, and the database is designed to store multiple snapshots representative of different times. We have implemented two web-based interfaces to the database; one enables exploration of the data via interactive maps (web map server), while the other allows searches based on text-field constraints. The web map server is an Open Geospatial Consortium (OGC) compliant Web Map Server (WMS) and Web Feature Server (WFS). This means that other web sites can display glacier layers from our site over the Internet, or retrieve glacier features in vector format. All components of the system are implemented using Open Source software: Linux, PostgreSQL, PostGIS (geospatial extensions to the database), MapServer (WMS and WFS), and several supporting components such as Proj.4 (a geographic projection library) and PHP. These tools are robust and provide a flexible and powerful framework for web mapping applications. As a service to the GLIMS community, the database contains metadata on all ASTER imagery acquired over glacierized terrain. Reduced-resolution versions of the images (browse imagery) can be viewed either as a layer in the MapServer application, or overlaid on the virtual globe within Google Earth. The interactive map application allows the user to constrain by time what data appear on the map. For example, ASTER or glacier outlines from 2002 only, or from Autumn in any year, can be displayed. The system allows users to download their selected glacier data in a choice of formats. The results of a query based on spatial selection (using a mouse) or text-field constraints can be downloaded in any of these formats: ESRI shapefiles, KML (Google Earth), Map

  15. Update History of This Database - GenLibi | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update History of This Database. 2014/03/25: GenLibi English archive site is opened. 2007/03/01: GenLibi (http://gene.biosciencedbc.jp/) is opened.

  16. Update History of This Database - dbQSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Update History of This Database. 2017/02/16: dbQSNP English archive site is opened. 2002/10/23: dbQSNP (http://qsnp.gen.kyushu-u.ac.jp/) is opened.

  17. TOMATOMICS: A Web Database for Integrated Omics Information in Tomato

    KAUST Repository

    Kudo, Toru; Kobayashi, Masaaki; Terashima, Shin; Katayama, Minami; Ozaki, Soichi; Kanno, Maasa; Saito, Misa; Yokoyama, Koji; Ohyanagi, Hajime; Aoki, Koh; Kubo, Yasutaka; Yano, Kentaro

    2016-01-01

    Solanum lycopersicum (tomato) is an important agronomic crop and a major model fruit-producing plant. To facilitate basic and applied research, comprehensive experimental resources and omics information on tomato are available following their development. Mutant lines and cDNA clones from a dwarf cultivar, Micro-Tom, are two of these genetic resources. Large-scale sequencing data for ESTs and full-length cDNAs from Micro-Tom continue to be gathered. In conjunction with information on the reference genome sequence of another cultivar, Heinz 1706, the Micro-Tom experimental resources have facilitated comprehensive functional analyses. To enhance the efficiency of acquiring omics information for tomato biology, we have integrated the information on the Micro-Tom experimental resources and the Heinz 1706 genome sequence. We have also inferred gene structure by comparison of sequences between the genome of Heinz 1706 and the transcriptome, which are comprised of Micro-Tom full-length cDNAs and Heinz 1706 RNA-seq data stored in the KaFTom and Sequence Read Archive databases. In order to provide large-scale omics information with streamlined connectivity we have developed and maintain a web database TOMATOMICS (http://bioinf.mind.meiji.ac.jp/tomatomics/). In TOMATOMICS, access to the information on the cDNA clone resources, full-length mRNA sequences, gene structures, expression profiles and functional annotations of genes is available through search functions and the genome browser, which has an intuitive graphical interface.

  18. TOMATOMICS: A Web Database for Integrated Omics Information in Tomato

    KAUST Repository

    Kudo, Toru

    2016-11-29

    Solanum lycopersicum (tomato) is an important agronomic crop and a major model fruit-producing plant. To facilitate basic and applied research, comprehensive experimental resources and omics information on tomato are available following their development. Mutant lines and cDNA clones from a dwarf cultivar, Micro-Tom, are two of these genetic resources. Large-scale sequencing data for ESTs and full-length cDNAs from Micro-Tom continue to be gathered. In conjunction with information on the reference genome sequence of another cultivar, Heinz 1706, the Micro-Tom experimental resources have facilitated comprehensive functional analyses. To enhance the efficiency of acquiring omics information for tomato biology, we have integrated the information on the Micro-Tom experimental resources and the Heinz 1706 genome sequence. We have also inferred gene structure by comparison of sequences between the genome of Heinz 1706 and the transcriptome, which are comprised of Micro-Tom full-length cDNAs and Heinz 1706 RNA-seq data stored in the KaFTom and Sequence Read Archive databases. In order to provide large-scale omics information with streamlined connectivity we have developed and maintain a web database TOMATOMICS (http://bioinf.mind.meiji.ac.jp/tomatomics/). In TOMATOMICS, access to the information on the cDNA clone resources, full-length mRNA sequences, gene structures, expression profiles and functional annotations of genes is available through search functions and the genome browser, which has an intuitive graphical interface.

  19. The National DNA Data Bank of Canada: a Quebecer perspective.

    Science.gov (United States)

    Milot, Emmanuel; Lecomte, Marie M J; Germain, Hugo; Crispino, Frank

    2013-11-20

    The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification.

  20. The National DNA Data Bank of Canada: A Quebecer perspective

    Directory of Open Access Journals (Sweden)

    Emmanuel Milot

    2013-11-01

    Full Text Available The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification.

  1. Contributions to Logical Database Design

    Directory of Open Access Journals (Sweden)

    Vitalie COTELEA

    2012-01-01

    Full Text Available This paper treats the problems arising at the stage of logical database design. It synthesizes the most common inference models of functional dependencies, deals with the problems of building covers for sets of functional dependencies, synthesizes the normal forms, presents trends in normalization algorithms and gives their time complexity. In addition, it summarizes the best-known key search algorithms and deals with the analysis and testing of relational schemes. It also summarizes and compares the different features used to recognize acyclic database schemas.
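
    The key search algorithms mentioned in this record typically rest on attribute closure under a set of functional dependencies. The following is a minimal sketch of that textbook procedure, not the paper's own algorithms; the schema R(A, B, C, D), the dependencies in FDS, and the function names are hypothetical and chosen only for illustration.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, the right side is too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """A set of attributes is a superkey if its closure covers the schema."""
    return closure(attrs, fds) == set(all_attrs)


# Hypothetical schema R(A, B, C, D) with A -> B and B -> C.
ALL_ATTRS = {"A", "B", "C", "D"}
FDS = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(sorted(closure({"A"}, FDS)))
print(is_superkey({"A", "D"}, ALL_ATTRS, FDS))
```

    Under these assumed dependencies, {A, D} determines every attribute while {A} alone does not, which is the kind of test a key search repeats over candidate attribute sets.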

  2. HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene.

    Science.gov (United States)

    Kalmár, Lajos; Hegedüs, Tamás; Farkas, Henriette; Nagy, Melinda; Tordai, Attila

    2005-01-01

    Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu) was created to contribute to the following expectations: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function, and the password-protected data deposition function. Mutations of the C1-INH gene are divided in two parts: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.

  3. Multilingual access to full text databases

    International Nuclear Information System (INIS)

    Fluhr, C.; Radwan, K.

    1990-05-01

    Many full text databases are available in only one language; others may contain documents in several languages. Even if the user is able to understand the language of the documents in the database, it could be easier for him to express his need in his own language. For databases containing documents in different languages, it is simpler to formulate the query in one language only and to retrieve documents in different languages. This paper presents the developments and the first experiments of multilingual search, applied to the French-English pair, for text data in the nuclear field, based on the SPIRIT system. After recalling the general problems of searching full text databases with queries formulated in natural language, we present the methods used to reformulate the queries and show how they can be expanded for multilingual search. The first results on data in the nuclear field are presented (AFCEN norms and INIS abstracts). 4 refs

  4. Multilingual Federated Searching Across Heterogeneous Collections.

    Science.gov (United States)

    Powell, James; Fox, Edward A.

    1998-01-01

    Describes a scalable system for searching heterogeneous multilingual collections on the World Wide Web. Details Searchable Database Markup Language (SearchDB-ML) for describing the characteristics of a search engine and its interface, and a protocol for requesting word translations between languages. (Author)

  5. Database Description - fRNAdb | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Affiliation: National Institute of Advanced Industrial Science and Technology (AIST) Journal Search: Creato...D89-92 External Links: Original website information Database maintenance site National Institute of Industrial Science and Technology

  6. The MAJORANA Parts Tracking Database

    Energy Technology Data Exchange (ETDEWEB)

    Abgrall, N. [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Aguayo, E. [Pacific Northwest National Laboratory, Richland, WA (United States); Avignone, F.T. [Department of Physics and Astronomy, University of South Carolina, Columbia, SC (United States); Oak Ridge National Laboratory, Oak Ridge, TN (United States); Barabash, A.S. [Institute for Theoretical and Experimental Physics, Moscow (Russian Federation); Bertrand, F.E. [Oak Ridge National Laboratory, Oak Ridge, TN (United States); Brudanin, V. [Joint Institute for Nuclear Research, Dubna (Russian Federation); Busch, M. [Department of Physics, Duke University, Durham, NC (United States); Triangle Universities Nuclear Laboratory, Durham, NC (United States); Byram, D. [Department of Physics, University of South Dakota, Vermillion, SD (United States); Caldwell, A.S. [South Dakota School of Mines and Technology, Rapid City, SD (United States); Chan, Y-D. [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Christofferson, C.D. [South Dakota School of Mines and Technology, Rapid City, SD (United States); Combs, D.C. [Department of Physics, North Carolina State University, Raleigh, NC (United States); Triangle Universities Nuclear Laboratory, Durham, NC (United States); Cuesta, C.; Detwiler, J.A.; Doe, P.J. [Center for Experimental Nuclear Physics and Astrophysics, and Department of Physics, University of Washington, Seattle, WA (United States); Efremenko, Yu. [Department of Physics and Astronomy, University of Tennessee, Knoxville, TN (United States); Egorov, V. [Joint Institute for Nuclear Research, Dubna (Russian Federation); Ejiri, H. [Research Center for Nuclear Physics and Department of Physics, Osaka University, Ibaraki, Osaka (Japan); Elliott, S.R. [Los Alamos National Laboratory, Los Alamos, NM (United States); and others

    2015-04-11

    The MAJORANA DEMONSTRATOR is an ultra-low background physics experiment searching for the neutrinoless double beta decay of ⁷⁶Ge. The MAJORANA Parts Tracking Database is used to record the history of components used in the construction of the DEMONSTRATOR. The tracking implementation takes a novel approach based on the schema-free database technology CouchDB. Transportation, storage, and processes undergone by parts such as machining or cleaning are linked to part records. Tracking parts provides a great logistics benefit and an important quality assurance reference during construction. In addition, the location history of parts provides an estimate of their exposure to cosmic radiation. A web application for data entry and a radiation exposure calculator have been developed as tools for achieving the extreme radio-purity required for this rare decay search.

  7. The MAJORANA Parts Tracking Database

    Science.gov (United States)

    Abgrall, N.; Aguayo, E.; Avignone, F. T.; Barabash, A. S.; Bertrand, F. E.; Brudanin, V.; Busch, M.; Byram, D.; Caldwell, A. S.; Chan, Y.-D.; Christofferson, C. D.; Combs, D. C.; Cuesta, C.; Detwiler, J. A.; Doe, P. J.; Efremenko, Yu.; Egorov, V.; Ejiri, H.; Elliott, S. R.; Esterline, J.; Fast, J. E.; Finnerty, P.; Fraenkle, F. M.; Galindo-Uribarri, A.; Giovanetti, G. K.; Goett, J.; Green, M. P.; Gruszko, J.; Guiseppe, V. E.; Gusev, K.; Hallin, A. L.; Hazama, R.; Hegai, A.; Henning, R.; Hoppe, E. W.; Howard, S.; Howe, M. A.; Keeter, K. J.; Kidd, M. F.; Kochetov, O.; Konovalov, S. I.; Kouzes, R. T.; LaFerriere, B. D.; Leon, J. Diaz; Leviner, L. E.; Loach, J. C.; MacMullin, J.; Martin, R. D.; Meijer, S. J.; Mertens, S.; Miller, M. L.; Mizouni, L.; Nomachi, M.; Orrell, J. L.; O'Shaughnessy, C.; Overman, N. R.; Petersburg, R.; Phillips, D. G.; Poon, A. W. P.; Pushkin, K.; Radford, D. C.; Rager, J.; Rielage, K.; Robertson, R. G. H.; Romero-Romero, E.; Ronquest, M. C.; Shanks, B.; Shima, T.; Shirchenko, M.; Snavely, K. J.; Snyder, N.; Soin, A.; Suriano, A. M.; Tedeschi, D.; Thompson, J.; Timkin, V.; Tornow, W.; Trimble, J. E.; Varner, R. L.; Vasilyev, S.; Vetter, K.; Vorren, K.; White, B. R.; Wilkerson, J. F.; Wiseman, C.; Xu, W.; Yakushev, E.; Young, A. R.; Yu, C.-H.; Yumatov, V.; Zhitnikov, I.

    2015-04-01

    The MAJORANA DEMONSTRATOR is an ultra-low background physics experiment searching for the neutrinoless double beta decay of ⁷⁶Ge. The MAJORANA Parts Tracking Database is used to record the history of components used in the construction of the DEMONSTRATOR. The tracking implementation takes a novel approach based on the schema-free database technology CouchDB. Transportation, storage, and processes undergone by parts such as machining or cleaning are linked to part records. Tracking parts provides a great logistics benefit and an important quality assurance reference during construction. In addition, the location history of parts provides an estimate of their exposure to cosmic radiation. A web application for data entry and a radiation exposure calculator have been developed as tools for achieving the extreme radio-purity required for this rare decay search.
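
    To illustrate how a location history might feed an exposure estimate, here is a minimal sketch that sums the days a part spent above ground. The abstract does not describe the actual calculator, so the record layout (the "location", "underground", "start" and "end" fields), the sample history and the function name are all hypothetical, and a real estimate would presumably weight each location rather than simply count days.

```python
from datetime import date


def surface_exposure_days(history):
    """Sum the days a part spent at locations not flagged as underground.

    `history` is a list of dicts with hypothetical keys:
    "location", "underground" (bool), "start" and "end" (datetime.date).
    """
    total = 0
    for entry in history:
        if not entry["underground"]:
            total += (entry["end"] - entry["start"]).days
    return total


# Hypothetical location history for a single part record.
HISTORY = [
    {"location": "machine shop", "underground": False,
     "start": date(2013, 1, 1), "end": date(2013, 1, 11)},
    {"location": "underground storage", "underground": True,
     "start": date(2013, 1, 11), "end": date(2013, 3, 1)},
    {"location": "surface transport", "underground": False,
     "start": date(2013, 3, 1), "end": date(2013, 3, 3)},
]

print(surface_exposure_days(HISTORY))
```

    In a schema-free store, each history entry could simply be a nested document attached to the part record, which is why the sketch uses plain dictionaries.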

  8. Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites.

    Directory of Open Access Journals (Sweden)

    Devendra Kumar Biswal

    Full Text Available Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php.

  9. Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites.

    Science.gov (United States)

    Biswal, Devendra Kumar; Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

    2016-01-01

    Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php.

  10. Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites

    Science.gov (United States)

    Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

    2016-01-01

    Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php. PMID: 27285615

  11. Geminivirus data warehouse: a database enriched with machine learning approaches.

    Science.gov (United States)

    Silva, Jose Cleydson F; Carvalho, Thales F M; Basso, Marcos F; Deguchi, Michihito; Pereira, Welison A; Sobrinho, Roberto R; Vidigal, Pedro M P; Brustolini, Otávio J B; Silva, Fabyano F; Dal-Bianco, Maximiller; Fontes, Renildes L F; Santos, Anésia A; Zerbini, Francisco Murilo; Cerqueira, Fabio R; Fontes, Elizabeth P B

    2017-05-05

    The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.

  12. A tuberculosis biomarker database: the key to novel TB diagnostics

    Directory of Open Access Journals (Sweden)

    Seda Yerlikaya

    2017-03-01

    Full Text Available New diagnostic innovations for tuberculosis (TB), including point-of-care solutions, are critical to reach the goals of the End TB Strategy. However, despite decades of research, numerous reports on new biomarker candidates, and significant investment, no well-performing, simple and rapid TB diagnostic test is yet available on the market, and the search for accurate, non-DNA biomarkers remains a priority. To help overcome this ‘biomarker pipeline problem’, FIND and partners are working on the development of a well-curated and user-friendly TB biomarker database. The web-based database will enable the dynamic tracking of evidence surrounding biomarker candidates in relation to target product profiles (TPPs) for needed TB diagnostics. It will be able to accommodate raw datasets and facilitate the verification of promising biomarker candidates and the identification of novel biomarker combinations. As such, the database will simplify data and knowledge sharing, empower collaboration, help in the coordination of efforts and allocation of resources, streamline the verification and validation of biomarker candidates, and ultimately lead to an accelerated translation into clinically useful tools.

  13. The Hawaiian Freshwater Algal Database (HfwADB): a laboratory LIMS and online biodiversity resource

    Directory of Open Access Journals (Sweden)

    Sherwood Alison R

    2012-10-01

    Full Text Available Abstract Background Biodiversity databases serve the important role of highlighting species-level diversity from defined geographical regions. Databases that are specially designed to accommodate the types of data gathered during regional surveys are valuable in allowing full data access and display to researchers not directly involved with the project, while serving as a Laboratory Information Management System (LIMS). The Hawaiian Freshwater Algal Database, or HfwADB, was modified from the Hawaiian Algal Database to showcase non-marine algal specimens collected from the Hawaiian Archipelago by accommodating the additional level of organization required for samples including multiple species. Description The Hawaiian Freshwater Algal Database is a comprehensive and searchable database containing photographs and micrographs of samples and collection sites, geo-referenced collecting information, taxonomic data and standardized DNA sequence data. All data for individual samples are linked through unique 10-digit accession numbers (“Isolate Accession”), the first five of which correspond to the collection site (“Environmental Accession”). Users can search online for sample information by accession number, various levels of taxonomy, habitat or collection site. HfwADB is hosted at the University of Hawaii, and was made publicly accessible in October 2011. At the present time the database houses data for over 2,825 samples of non-marine algae from 1,786 collection sites from the Hawaiian Archipelago. These samples include cyanobacteria, red and green algae and diatoms, as well as lesser representation from some other algal lineages. Conclusions HfwADB is a digital repository that acts as a Laboratory Information Management System for Hawaiian non-marine algal data. Users can interact with the repository through the web to view relevant habitat data (including geo-referenced collection locations and download images of collection sites, specimen
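
    The accession scheme described in this record (a 10-digit Isolate Accession whose first five digits give the Environmental Accession of the collection site) can be sketched as a small parsing helper. This is an illustration only: storing the accession as a zero-padded string, the sample value, and the function name are assumptions, and the abstract does not say what the last five digits encode.

```python
def split_isolate_accession(accession):
    """Split a 10-digit Isolate Accession into (environmental, remainder).

    The first five digits identify the collection site; the meaning of the
    remaining five digits is not given in the abstract, so they are returned
    unchanged.
    """
    if len(accession) != 10 or not (accession.isascii() and accession.isdigit()):
        raise ValueError("expected a 10-digit accession string")
    return accession[:5], accession[5:]


# Hypothetical accession value, kept as a string to preserve leading zeros.
site, remainder = split_isolate_accession("0012300045")
print(site, remainder)
```

    Keeping the site prefix inside the sample identifier means all isolates from one collection site can be grouped with a simple prefix match.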

  14. The STRING database in 2011

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Franceschini, Andrea; Kuhn, Michael

    2011-01-01

    present an update on the online database resource Search Tool for the Retrieval of Interacting Genes (STRING); it provides uniquely comprehensive coverage and ease of access to both experimental as well as predicted interaction information. Interactions in STRING are provided with a confidence score...... models, extensive data updates and strongly improved connectivity and integration with third-party resources. Version 9.0 of STRING covers more than 1100 completely sequenced organisms; the resource can be reached at http://string-db.org....

  15. Update History of This Database - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ...11/08/25 License is updated. 2010/03/29 tRNADB-CE English archive site is opened. 2008/7/1 tRNADB-CE( http:/...

  16. Main - AT Atlas | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ... name: at_atlas_en.zip File URL: ftp://ftp.biosciencedbc.jp/archive/at_atlas/LATE...

  17. Main - KOME | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ...ntents List of datasets Data file File name: kome_main.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kome...

  18. PREIMS - AT Atlas | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ...Targeted Proteins Research Program (TPRP). Data file File name: at_atlas_preims.zip File URL: ftp://ftp.biosciencedbc.jp/archiv...

  19. Subject Retrieval from Full-Text Databases in the Humanities

    Science.gov (United States)

    East, John W.

    2007-01-01

    This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…

  20. Home | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ple Search Original Site Database Center for Life Science Kousaku Okubo organ human The dictionary-type data...-SA Detail Taxonomy Icon Taxonomy Icon Download | Simple Search Original Site National Bioscience Database Center Kousaku Okubo...enter for Life Science Kousaku Okubo Dictionary 9 species (human, mouse, rat, zeb

  1. Simrank: Rapid and sensitive general-purpose k-mer search tool

    Energy Technology Data Exchange (ETDEWEB)

    DeSantis, T.Z.; Keller, K.; Karaoz, U.; Alekseyenko, A.V; Singh, N.N.S.; Brodie, E.L; Pei, Z.; Andersen, G.L; Larsen, N.

    2011-04-01

    Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project (http://nihroadmap.nih.gov/hmp). Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Here we present a stand-alone utility, Simrank, which allows users to rapidly identify the database strings most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-language strings found Simrank to be 10X to 928X faster, depending on the dataset. Simrank provides molecular ecologists with a high-throughput, open-source choice for comparing large sequence sets to find similarity.
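
    As a rough illustration of k-mer matching in general, the sketch below ranks database strings by the fraction of the query's distinct k-mers they contain. It is not Simrank's implementation or scoring; the function names, the choice of k = 3 and the toy sequences are assumptions made for the example.

```python
def kmers(s, k):
    """Return the set of distinct substrings of length k in s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def rank_by_shared_kmers(query, database, k=3):
    """Rank database entries by the fraction of query k-mers they share.

    `database` maps a name to a string. Ties are broken by name so the
    ordering is deterministic.
    """
    query_kmers = kmers(query, k)
    scores = []
    for name, seq in database.items():
        shared = len(query_kmers & kmers(seq, k))
        score = shared / len(query_kmers) if query_kmers else 0.0
        scores.append((name, score))
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Toy database with hypothetical sequence names.
DATABASE = {
    "s1": "ACGTACGG",
    "s2": "TTTTACGA",
    "s3": "GGGGGG",
}

for name, score in rank_by_shared_kmers("ACGTAC", DATABASE, k=3):
    print(name, score)
```

    Because the comparison works on plain substrings, the same sketch applies unchanged to RNA, protein or natural-language strings.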

  2. Database development and management

    CERN Document Server

    Chao, Lee

    2006-01-01

    Introduction to Database Systems; Functions of a Database; Database Management System; Database Components; Database Development Process; Conceptual Design and Data Modeling; Introduction to Database Design Process; Understanding Business Process; Entity-Relationship Data Model; Representing Business Process with Entity-Relationship Model; Table Structure and Normalization; Introduction to Tables; Table Normalization; Transforming Data Models to Relational Databases; DBMS Selection; Transforming Data Models to Relational Databases; Enforcing Constraints; Creating Database for Business Process; Physical Design and Database

  3. Integrating Variances into an Analytical Database

    Science.gov (United States)

    Sanchez, Carlos

    2010-01-01

    For this project, I enrolled in numerous SATERN courses that taught the basics of database programming. These include: Basic Access 2007 Forms, Introduction to Database Systems, Overview of Database Design, and others. My main job was to create an analytical database that can handle many stored forms and make it easy to interpret and organize. Additionally, I helped improve an existing database and populate it with information. These databases were designed to be used with data from Safety Variances and DCR forms. The research consisted of analyzing the database and comparing the data to find out which entries were repeated the most. If an entry happened to be repeated several times in the database, that would mean that the rule or requirement targeted by that variance has been bypassed many times already and so the requirement may not really be needed, but rather should be changed to allow the variance's conditions permanently. This project did not only restrict itself to the design and development of the database system, but also worked on exporting the data from the database to a different format (e.g. Excel or Word) so it could be analyzed in a simpler fashion. Thanks to the change in format, the data was organized in a spreadsheet that made it possible to sort the data by categories or types and helped speed up searches. Once my work with the database was done, the records of variances could be arranged so that they were displayed in numerical order, or one could search for a specific document targeted by the variances and restrict the search to only include variances that modified a specific requirement. A great part that contributed to my learning was SATERN, NASA's resource for education. Thanks to the SATERN online courses I took over the summer, I was able to learn many new things about computers and databases and also go more in depth into topics I already knew about.

  4. UNITE: a database providing web-based methods for the molecular identification of ectomycorrhizal fungi.

    Science.gov (United States)

    Kõljalg, Urmas; Larsson, Karl-Henrik; Abarenkov, Kessy; Nilsson, R Henrik; Alexander, Ian J; Eberhardt, Ursula; Erland, Susanne; Høiland, Klaus; Kjøller, Rasmus; Larsson, Ellen; Pennanen, Taina; Sen, Robin; Taylor, Andy F S; Tedersoo, Leho; Vrålstad, Trude; Ursing, Björn M

    2005-06-01

    Identification of ectomycorrhizal (ECM) fungi is often achieved through comparisons of ribosomal DNA internal transcribed spacer (ITS) sequences with accessioned sequences deposited in public databases. A major problem encountered is that annotation of the sequences in these databases is not always complete or trustworthy. In order to overcome this deficiency, we report on UNITE, an open-access database. UNITE comprises well annotated fungal ITS sequences from well defined herbarium specimens that include full herbarium reference identification data, collector/source and ecological data. At present UNITE contains 758 ITS sequences from 455 species and 67 genera of ECM fungi. UNITE can be searched by taxon name, via sequence similarity using blastn, and via phylogenetic sequence identification using galaxie. Following implementation, galaxie performs a phylogenetic analysis of the query sequence after alignment either to pre-existing generic alignments, or to matches retrieved from a blast search on the UNITE data. It should be noted that the current version of UNITE is dedicated to the reliable identification of ECM fungi. The UNITE database is accessible through the URL http://unite.zbi.ee

  5. DEPOT database: Reference manual and user's guide

    International Nuclear Information System (INIS)

    Clancey, P.; Logg, C.

    1991-03-01

    DEPOT has been developed to provide tracking for the Stanford Linear Collider (SLC) control system equipment. For each piece of equipment entered into the database, complete location, service, maintenance, modification, certification, and radiation exposure histories can be maintained. To facilitate data entry accuracy, efficiency, and consistency, barcoding technology has been used extensively. DEPOT has been an important tool in improving the reliability of the microsystems controlling SLC. This document describes the components of the DEPOT database, the elements in the database records, and the use of the supporting programs for entering data, searching the database, and producing reports from the information.

  6. Construction of Database for Pulsating Variable Stars

    Science.gov (United States)

    Chen, B. Q.; Yang, M.; Jiang, B. W.

    2011-07-01

    A database for the pulsating variable stars is constructed for Chinese astronomers to study the variable stars conveniently. The database includes about 230000 variable stars in the Galactic bulge, LMC and SMC observed by the MACHO (MAssive Compact Halo Objects) and OGLE (Optical Gravitational Lensing Experiment) projects at present. The software used for the construction is LAMP, i.e., Linux+Apache+MySQL+PHP. A web page is provided to search the photometric data and the light curve in the database through the right ascension and declination of the object. More data will be incorporated into the database.
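
    A position-based lookup of the kind described here could look roughly like the sketch below. It is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for the MySQL back end, the table layout and star entries are invented, and the simple box search ignores wrap-around at right ascension 0°/360°.

```python
import math
import sqlite3

# In-memory stand-in for the catalogue; the table layout and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stars (name TEXT, ra_deg REAL, dec_deg REAL, period_days REAL)"
)
conn.executemany(
    "INSERT INTO stars VALUES (?, ?, ?, ?)",
    [
        ("star_a", 80.90, -69.75, 3.2),
        ("star_b", 80.95, -69.76, 12.7),
        ("star_c", 13.20, -72.80, 0.6),
    ],
)

def box_search(conn, ra, dec, half_width_deg):
    """Return names of stars inside a box centred on (ra, dec), in degrees.

    The RA half-width is widened by 1/cos(dec) so the box covers roughly the
    same angle on the sky in both directions.
    """
    d_ra = half_width_deg / max(math.cos(math.radians(dec)), 1e-6)
    rows = conn.execute(
        "SELECT name FROM stars "
        "WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ? "
        "ORDER BY name",
        (ra - d_ra, ra + d_ra, dec - half_width_deg, dec + half_width_deg),
    )
    return [row[0] for row in rows]

hits = box_search(conn, 80.92, -69.75, 0.05)
```

    A production service would more likely follow the box filter with an exact angular-distance test; the box alone is kept here because it maps directly onto indexed range conditions.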

  7. Search Advertising

    OpenAIRE

    Cornière (de), Alexandre

    2016-01-01

    Search engines enable advertisers to target consumers based on the query they have entered. In a framework with horizontal product differentiation, imperfect product information and in which consumers incur search costs, I study a game in which advertisers have to choose a price and a set of relevant keywords. The targeting mechanism brings about three kinds of efficiency gains, namely lower search costs, better matching, and more intense product market price-competition. A monopolistic searc...

  8. Using the TIGR gene index databases for biological discovery.

    Science.gov (United States)

    Lee, Yuandan; Quackenbush, John

    2003-11-01

    The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.

  9. Database automation of accelerator operation

    International Nuclear Information System (INIS)

    Casstevens, B.J.; Ludemann, C.A.

    1983-01-01

    Database management techniques are applied to automating the setup of operating parameters of a heavy-ion accelerator used in nuclear physics experiments. Data files consist of ion-beam attributes, the interconnection assignments of the numerous power supplies and magnetic elements that steer the ions' path through the system, the data values that represent the electrical currents supplied by the power supplies, as well as the positions of motors and status of mechanical actuators. The database is relational and permits searching on ranges of any subset of the ion-beam attributes. A file selected from the database is used by the control software to replicate the ion-beam conditions by adjusting the physical elements in a continuous manner.
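
    Searching on ranges of any subset of attributes maps naturally onto a dynamically built `WHERE` clause. The following is a minimal sketch under stated assumptions: the attribute names (`ion_mass`, `charge_state`, `energy_mev`), the sample rows, and the use of `sqlite3` are illustrative choices, not details of the 1983 system.

```python
import sqlite3

# Hypothetical ion-beam attributes; the original attribute set is not listed.
COLUMNS = ("ion_mass", "charge_state", "energy_mev")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE setups "
    "(file_id INTEGER, ion_mass REAL, charge_state INTEGER, energy_mev REAL)"
)
conn.executemany(
    "INSERT INTO setups VALUES (?, ?, ?, ?)",
    [
        (1, 16.0, 5, 100.0),
        (2, 16.0, 6, 140.0),
        (3, 58.0, 10, 300.0),
    ],
)

def find_setups(conn, ranges):
    """Return file ids whose attributes fall inside every given (low, high) range.

    `ranges` maps a column name to an inclusive (low, high) pair; attributes
    that are not mentioned are left unconstrained.
    """
    clauses, params = [], []
    for column, (low, high) in ranges.items():
        if column not in COLUMNS:  # whitelist, since names are interpolated into SQL
            raise ValueError(f"unknown attribute: {column}")
        clauses.append(f"{column} BETWEEN ? AND ?")
        params += [low, high]
    where = " AND ".join(clauses) if clauses else "1"
    rows = conn.execute(
        f"SELECT file_id FROM setups WHERE {where} ORDER BY file_id", params
    )
    return [row[0] for row in rows]
```

    For example, `find_setups(conn, {"ion_mass": (15, 17), "energy_mev": (120, 200)})` constrains two attributes and leaves the charge state free.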

  10. Faceted Search

    CERN Document Server

    Tunkelang, Daniel

    2009-01-01

    We live in an information age that requires us, more than ever, to represent, access, and use information. Over the last several decades, we have developed a modern science and technology for information retrieval, relentlessly pursuing the vision of a "memex" that Vannevar Bush proposed in his seminal article, "As We May Think." Faceted search plays a key role in this program. Faceted search addresses weaknesses of conventional search approaches and has emerged as a foundation for interactive information retrieval. User studies demonstrate that faceted search provides more

  11. Refined repetitive sequence searches utilizing a fast hash function and cross species information retrievals

    Directory of Open Access Journals (Sweden)

    Reneker Jeff

    2005-05-01

    Full Text Available Abstract Background Searching for small tandem/disperse repetitive DNA sequences streamlines many biomedical research processes. For instance, whole genomic array analysis in yeast has revealed 22 PHO-regulated genes. The promoter regions of all but one of them contain at least one of the two core Pho4p binding sites, CACGTG and CACGTT. In humans, microsatellites play a role in a number of rare neurodegenerative diseases such as spinocerebellar ataxia type 1 (SCA1). SCA1 is a hereditary neurodegenerative disease caused by an expanded CAG repeat in the coding sequence of the gene. In bacterial pathogens, microsatellites are proposed to regulate expression of some virulence factors. For example, bacteria commonly generate intra-strain diversity through phase variation which is strongly associated with virulence determinants. A recent analysis of the complete sequences of the Helicobacter pylori strains 26695 and J99 has identified 46 putative phase-variable genes among the two genomes through their association with homopolymeric tracts and dinucleotide repeats. Life scientists are increasingly interested in studying the function of small sequences of DNA. However, current search algorithms often generate thousands of matches – most of which are irrelevant to the researcher. Results We present our hash function as well as our search algorithm to locate small sequences of DNA within multiple genomes. Our system applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. We discuss our incorporation of the Gene Ontology (GO database into these algorithms. We conduct an exhaustive time analysis of our system for various repetitive sequence lengths. For instance, a search for eight bases of sequence within 3.224 GBases on 49 different chromosomes takes 1.147 seconds on average. To illustrate the relevance of the search results, we conduct a search with and without added annotation terms for the
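
    The abstract does not specify the authors' hash function, so the sketch below swaps in a common textbook technique instead: a 2-bit-per-base encoding that gives every k-mer a unique integer, plus an index from that integer to sequence positions. The function names and the toy sequence are hypothetical; the two query motifs are the Pho4p core sites mentioned in the abstract.

```python
from collections import defaultdict

# 2-bit code per base; any other character (e.g. N) breaks the current window.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def kmer_hash(kmer):
    """Pack a k-mer into an integer, two bits per base (collision-free for fixed k)."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value

def build_index(sequence, k):
    """Map each k-mer hash to the list of 0-based start positions where it occurs."""
    index = defaultdict(list)
    mask = (1 << (2 * k)) - 1  # keeps only the most recent k bases
    value = 0
    valid = 0  # consecutive unambiguous bases ending at the current position
    for i, base in enumerate(sequence):
        if base not in CODE:
            value, valid = 0, 0
            continue
        value = ((value << 2) | CODE[base]) & mask
        valid += 1
        if valid >= k:
            index[value].append(i - k + 1)
    return index

def find(index, query):
    """Return start positions of query; its length must equal the index's k."""
    return index.get(kmer_hash(query), [])

toy_sequence = "TTCACGTGAACACGTTCACGTG"  # invented example, not real promoter data
toy_index = build_index(toy_sequence, 6)
```

    Because the index is built once and each lookup is a single dictionary access, repeated queries for short motifs stay cheap; whether this resembles the published system's performance characteristics is not something the abstract allows one to say.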

  12. PSCID List - PSCDB | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ...t.zip File URL: ftp://ftp.biosciencedbc.jp/archive/pscdb/LATEST/pscdb_pscid_list.zip File size: 24.4 KB Simp...nd-binding sites ...

  13. Mathematics for Databases

    NARCIS (Netherlands)

    ir. Sander van Laar

    2007-01-01

    A formal description of a database consists of the description of the relations (tables) of the database together with the constraints that must hold on the database. Furthermore the contents of a database can be retrieved using queries. These constraints and queries for databases can very well be

  14. Databases and their application

    NARCIS (Netherlands)

    Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.

    2013-01-01

    During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The

  15. Citation searches are more sensitive than keyword searches to identify studies using specific measurement instruments.

    Science.gov (United States)

    Linder, Suzanne K; Kamath, Geetanjali R; Pratt, Gregory F; Saraykar, Smita S; Volk, Robert J

    2015-04-01

    To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a health care decision-making instrument commonly used in clinical settings. We searched the literature using two methods: (1) keyword searching using variations of "Control Preferences Scale" and (2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, and Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Keyword searches in bibliographic databases yielded high average precision (90%) but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45-54%), but precision ranged from 35% to 75% with Scopus being the most precise. Cited reference searches were more sensitive than keyword searches, making them a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time, and resources should dictate the combination of which methods and databases are used. Copyright © 2015 Elsevier Inc. All rights reserved.
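
    For readers unfamiliar with the two measures, precision is the share of retrieved records that are relevant, and sensitivity (recall) is the share of relevant records that were retrieved. A minimal sketch follows; the record ids are invented toy data, not the study's results.

```python
def precision_and_sensitivity(retrieved, relevant):
    """Return (precision, sensitivity) for a search result.

    precision   = relevant records retrieved / all records retrieved
    sensitivity = relevant records retrieved / all relevant records
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    sensitivity = true_positives / len(relevant) if relevant else 0.0
    return precision, sensitivity

# Toy data: 50 relevant studies exist; a search returns 10 records, 8 of them relevant.
relevant_ids = range(1, 51)
retrieved_ids = [1, 2, 3, 4, 5, 6, 7, 8, 101, 102]
precision, sensitivity = precision_and_sensitivity(retrieved_ids, relevant_ids)
```

    In this toy case the search is fairly precise (8 of 10 hits are relevant) but finds only 8 of the 50 relevant studies, the same qualitative pattern the abstract reports for keyword searches.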

  16. Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...
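
    The matching rules in these tips can be illustrated with a small sketch. It is not AJOL's implementation: the stop-word list is invented, parentheses are not handled, and it assumes that `OR` binds more loosely than the implied `AND`, which the tips do not state.

```python
# Hypothetical list of ignored "common words"; the real list is not given.
STOP_WORDS = {"the", "a", "an", "of", "and", "in"}

def matches(query, text):
    """Return True if text satisfies query under the simplified rules.

    Terms are case-insensitive, common words are dropped, terms within one
    alternative are ANDed, and alternatives separated by " OR " are ORed.
    """
    words = set(text.lower().split())
    for alternative in query.split(" OR "):
        terms = [t.lower() for t in alternative.split() if t.lower() not in STOP_WORDS]
        if terms and all(term in words for term in terms):
            return True
    return False
```

    With these rules, `matches("education OR research", "Research methods in Africa")` is true, while `matches("education policy", "Education in Kenya")` is false because both terms are required.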

  17. Content-Based Information Retrieval from Forensic Databases

    NARCIS (Netherlands)

    Geradts, Z.J.M.H.

    2002-01-01

    In forensic science, the number of image databases is growing rapidly. For this reason, it is necessary to have a proper procedure for searching in these image databases based on content. The use of image databases results in more solved crimes; furthermore, statistical information can be obtained

  18. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  19. Dietary Supplement Ingredient Database

    Science.gov (United States)

    ... and US Department of Agriculture Dietary Supplement Ingredient Database ... values can be saved to build a small database or add to an existing database for national, ...

  20. Energy Consumption Database

    Science.gov (United States)

    Consumption Database The California Energy Commission has created this on-line database for informal reporting ) classifications. The database also provides easy downloading of energy consumption data into Microsoft Excel (XLSX