databases reveals similarities: Topics by WorldWideScience.org

Sample records for databases reveals similarities

Analysis of newly established EST databases reveals similarities between heart regeneration in newt and fish

Directory of Open Access Journals (Sweden)

Weis Patrick

2010-01-01

Abstract. Background: The newt Notophthalmus viridescens possesses the remarkable ability to respond to cardiac damage by formation of new myocardial tissue. Surprisingly little is known about changes in gene activities that occur during the course of regeneration. To begin to decipher the molecular processes that underlie restoration of functional cardiac tissue, we generated an EST database from regenerating newt hearts and compared the transcriptional profile of selected candidates with genes deregulated during zebrafish heart regeneration. Results: A cDNA library of 100,000 cDNA clones was generated from newt hearts 14 days after ventricular injury. Sequencing of 11520 cDNA clones resulted in 2894 assembled contigs. BLAST searches revealed 1695 sequences with potential homology to sequences from the NCBI database. BLAST searches to TrEMBL and Swiss-Prot databases assigned 1116 proteins to Gene Ontology terms. We also identified a relatively large set of 174 ORFs, which are likely to be unique for urodele amphibians. Expression analysis of newt-zebrafish homologues confirmed the deregulation of selected genes during heart regeneration. Sequences, BLAST results and GO annotations were visualized in a relational web-based database followed by grouping of identified proteins into clusters of GO Terms. Comparison of data from regenerating zebrafish hearts identified biological processes that were uniformly overrepresented during cardiac regeneration in newt and zebrafish. Conclusion: We concluded that heart regeneration in newts and zebrafish led to the activation of similar sets of genes, which suggests that heart regeneration in both species might follow similar principles. The design of the newly established newt EST database allows identification of molecular pathways important for heart regeneration.
Similarity joins in relational database systems

CERN Document Server

Augsten, Nikolaus

2013-01-01

State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems. We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance comput
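As an illustration of the edit distance named in this record, the following is a minimal sketch of the classic dynamic-programming (Levenshtein) computation for strings with unit costs. The function name `edit_distance` is chosen here for illustration and is not taken from the book.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete and substitute."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]
```

The two nested loops make the running time proportional to the product of the string lengths, which is one way to see why the abstract describes the edit distance as computationally expensive.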
Using SQL Databases for Sequence Similarity Searching and Analysis.

Science.gov (United States)

Pearson, William R; Mackey, Aaron J

2017-09-13

Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. Copyright © 2017 John Wiley & Sons, Inc.
Density-based retrieval from high-similarity image databases

DEFF Research Database (Denmark)

Hansen, Michael Edberg; Carstensen, Jens Michael

2004-01-01

Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce a me...
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

Science.gov (United States)

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
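The idea of loading similarity search results into a relational database and querying them by taxonomic subset can be sketched as follows. This is a toy example using Python's built-in `sqlite3` module; the tables `protein` and `hit`, their columns, the accession strings and the E-values are all hypothetical and are not the actual seqdb_demo schema or data.

```python
import sqlite3

def build_demo(conn):
    """Create two hypothetical tables and load a few made-up rows."""
    conn.executescript("""
        CREATE TABLE protein (acc TEXT PRIMARY KEY, taxon TEXT);
        CREATE TABLE hit (query_acc TEXT, subject_acc TEXT, evalue REAL);
    """)
    conn.executemany("INSERT INTO protein VALUES (?, ?)", [
        ("Q1", "E. coli"), ("Q2", "E. coli"),
        ("S1", "H. sapiens"), ("S2", "S. cerevisiae"), ("S3", "H. sapiens"),
    ])
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
        ("Q1", "S1", 1e-30), ("Q1", "S2", 1e-5), ("Q2", "S3", 0.5),
    ])

def queries_with_homologs(conn, taxon, max_evalue=1e-3):
    """Return query accessions with at least one hit in `taxon` at or below `max_evalue`."""
    rows = conn.execute("""
        SELECT DISTINCT h.query_acc
        FROM hit AS h JOIN protein AS p ON p.acc = h.subject_acc
        WHERE p.taxon = ? AND h.evalue <= ?
        ORDER BY h.query_acc
    """, (taxon, max_evalue))
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_demo(conn)
```

With the results stored this way, a question such as "which query proteins have a significant match in a given taxon" becomes a single join rather than a pass over raw search output.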
Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases

DEFF Research Database (Denmark)

Assent, Ira; Wichterich, Marc; Meisen, Tobias

2008-01-01

Multimedia similarity search in large databases requires efficient query processing. The Earth mover's distance, introduced in computer vision, is successfully used as a similarity model in a number of small-scale applications. Its computational complexity hindered its adoption in large multimedia databases. We enable directly indexing the Earth mover's distance in structures such as the R-tree and the VA-file by providing the accurate 'MinDist' function to any bounding rectangle in the index. We exploit the computational structure of the new MinDist to derive a new lower bound for the EMD Min...
Color-Based Image Retrieval from High-Similarity Image Databases

DEFF Research Database (Denmark)

Hansen, Michael Adsetts Edberg; Carstensen, Jens Michael

2003-01-01

Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce a method for HSID retrieval using a similarity measure based on a linear combination of Jeffreys-Matusita (JM) distances between distributions of color (and color derivatives) estimated from a set of automatically extracted image regions. The weight coefficients are estimated based on optimal retrieval performance. Experimental results on the difficult task of visually identifying clones of fungal colonies grown in a petri dish and categorization of pelts show a high retrieval accuracy of the method when combined with standardized sample preparation and image acquisition.
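A weighted combination of Jeffreys-Matusita distances can be sketched as below. The sketch assumes discrete, normalized histograms and one common definition of the JM distance, the Euclidean distance between the element-wise square roots of the two distributions. The function names and the weights are hypothetical; the record does not state how the authors estimate the distributions or the weights beyond what is quoted above.

```python
import math

def jm_distance(p, q):
    """JM distance between two discrete distributions (each summing to 1)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def combined_distance(features_a, features_b, weights):
    """Linear combination of per-feature JM distances.

    features_a and features_b are lists of histograms, one per feature
    (for example one per color channel); weights has one entry per feature.
    """
    return sum(w * jm_distance(p, q)
               for w, p, q in zip(weights, features_a, features_b))
```

Under this definition the distance is 0 for identical histograms and reaches its maximum of sqrt(2) when the two histograms do not overlap at all.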
Fuzzy Relational Databases: Representational Issues and Reduction Using Similarity Measures.

Science.gov (United States)

Prade, Henri; Testemale, Claudette

1987-01-01

Compares and expands upon two approaches to dealing with fuzzy relational databases. The proposed similarity measure is based on a fuzzy Hausdorff distance and estimates the mismatch between two possibility distributions using a reduction process. The consequences of the reduction process on query evaluation are studied. (Author/EM)
Data mining technique for fast retrieval of similar waveforms in Fusion massive databases

International Nuclear Information System (INIS)

Vega, J.; Pereira, A.; Portas, A.; Dormido-Canto, S.; Farias, G.; Dormido, R.; Sanchez, J.; Duro, N.; Santos, M.; Sanchez, E.; Pajares, G.

2008-01-01

Fusion measurement systems generate similar waveforms for reproducible behavior. A major difficulty related to data analysis is the identification, in a rapid and automated way, of a set of discharges with comparable behaviour, i.e. discharges with 'similar' waveforms. Here we introduce a new technique for rapid searching and retrieval of 'similar' signals. The approach consists of building a classification system that avoids traversing the whole database looking for similarities. The classification system diminishes the problem dimensionality (by means of waveform feature extraction) and reduces the searching space to just the most probable 'similar' waveforms (clustering techniques). In the searching procedure, the input waveform is classified in any of the existing clusters. Then, a similarity measure is computed between the input signal and all cluster elements in order to identify the most similar waveforms. The inner product of normalized vectors is used as the similarity measure as it allows the searching process to be independent of signal gain and polarity. This development has been applied recently to TJ-II stellarator databases and has been integrated into its remote participation system
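The similarity measure described in this record, the inner product of normalized vectors, can be sketched as follows. Normalizing both vectors removes the dependence on signal gain. Taking the absolute value is an assumption made here to obtain polarity independence, since the record does not say how polarity is handled. The function names and the dictionary-based cluster representation are likewise hypothetical.

```python
import math

def similarity(x, y):
    """Absolute inner product of the two vectors after normalizing each to unit length."""
    nx = math.sqrt(sum(v * v for v in x))
    ny = math.sqrt(sum(v * v for v in y))
    if nx == 0 or ny == 0:
        return 0.0  # a zero vector has no direction to compare
    dot = sum(a * b for a, b in zip(x, y))
    return abs(dot) / (nx * ny)  # abs() is an assumption made here for polarity independence

def most_similar(query, cluster, k=1):
    """Rank the members of one cluster (name -> feature vector) against the query."""
    ranked = sorted(cluster, key=lambda name: similarity(query, cluster[name]),
                    reverse=True)
    return ranked[:k]
```

In the scheme the abstract outlines, `most_similar` would be applied only to the cluster into which the input waveform was classified, so the whole database is never traversed.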
Data mining technique for fast retrieval of similar waveforms in Fusion massive databases

Energy Technology Data Exchange (ETDEWEB)

Vega, J. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain)], E-mail: jesus.vega@ciemat.es; Pereira, A.; Portas, A. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain); Dormido-Canto, S.; Farias, G.; Dormido, R.; Sanchez, J.; Duro, N. [Departamento de Informatica y Automatica, UNED, Madrid (Spain); Santos, M. [Departamento de Arquitectura de Computadores y Automatica, UCM, Madrid (Spain); Sanchez, E. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain); Pajares, G. [Departamento de Arquitectura de Computadores y Automatica, UCM, Madrid (Spain)

2008-01-15

Fusion measurement systems generate similar waveforms for reproducible behavior. A major difficulty related to data analysis is the identification, in a rapid and automated way, of a set of discharges with comparable behaviour, i.e. discharges with 'similar' waveforms. Here we introduce a new technique for rapid searching and retrieval of 'similar' signals. The approach consists of building a classification system that avoids traversing the whole database looking for similarities. The classification system diminishes the problem dimensionality (by means of waveform feature extraction) and reduces the searching space to just the most probable 'similar' waveforms (clustering techniques). In the searching procedure, the input waveform is classified in any of the existing clusters. Then, a similarity measure is computed between the input signal and all cluster elements in order to identify the most similar waveforms. The inner product of normalized vectors is used as the similarity measure as it allows the searching process to be independent of signal gain and polarity. This development has been applied recently to TJ-II stellarator databases and has been integrated into its remote participation system.
Searching the protein structure database for ligand-binding site similarities using CPASS v.2

Directory of Open Access Journals (Sweden)

Caprez Adam

2011-01-01

Abstract. Background: A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS v.2) database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR) protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings: We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG) to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions: CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores
Deja vu: a database of highly similar citations in the scientific literature.

Science.gov (United States)

Errami, Mounir; Sun, Zhaohui; Long, Tara C; George, Angela C; Garner, Harold R

2009-01-01

In the scientific research community, plagiarism and covert multiple publications of the same data are considered unacceptable because they undermine the public confidence in the scientific integrity. Yet, little has been done to help authors and editors to identify highly similar citations, which sometimes may represent cases of unethical duplication. For this reason, we have made available Déjà vu, a publicly available database of highly similar Medline citations identified by the text similarity search engine eTBLAST. Following manual verification, highly similar citation pairs are classified into various categories ranging from duplicates with different authors to sanctioned duplicates. Déjà vu records also contain user-provided commentary and supporting information to substantiate each document's categorization. Déjà vu and eTBLAST are available to authors, editors, reviewers, ethicists and sociologists to study, intercept, annotate and deter questionable publication practices. These tools are part of a sustained effort to enhance the quality of Medline as 'the' biomedical corpus. The Déjà vu database is freely accessible at http://spore.swmed.edu/dejavu. The tool eTBLAST is also freely available at http://etblast.org.
Nuclear markers reveal that inter-lake cichlids' similar morphologies do not reflect similar genealogy.

Science.gov (United States)

Kassam, Daud; Seki, Shingo; Horic, Michio; Yamaoka, Kosaku

2006-08-01

The apparent inter-lake morphological similarity among East African Great Lakes' cichlid species/genera has left evolutionary biologists asking whether such similarity is due to sharing of a common ancestor or to mere convergent evolution. To answer this question, we first used Geometric Morphometrics (GM) to quantify morphological similarity and then used Amplified Fragment Length Polymorphism (AFLP) to determine whether similar morphologies imply shared ancestry or convergent evolution. GM revealed that not all presumed morphologically similar pairs were indeed similar, and the dendrogram generated from AFLP data indicated distinct clusters corresponding to each lake and not to inter-lake morphologically similar pairs. Such results imply that the morphological similarity is due to convergent evolution and not shared ancestry. The congruency of GM- and AFLP-generated dendrograms implies that GM is capable of picking up phylogenetic signal, and thus GM can be a potential tool in phylogenetic systematics.
Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

National Research Council Canada - National Science Library

Ortega-Binderberger, Michael

2002-01-01

... as a critical area of research. This thesis explores how to enhance database systems with content based search over arbitrary abstract data types in a similarity based framework with query refinement...
A little similarity goes a long way: the effects of peripheral but self-revealing similarities on improving and sustaining interracial relationships.

Science.gov (United States)

West, Tessa V; Magee, Joe C; Gordon, Sarah H; Gullett, Lindy

2014-07-01

Integrating theory on close relationships and intergroup relations, we construct a manipulation of similarity that we demonstrate can improve interracial interactions across different settings. We find that manipulating perceptions of similarity on self-revealing attributes that are peripheral to the interaction improves interactions in cross-race dyads and racially diverse task groups. In a getting-acquainted context, we demonstrate that the belief that one's different-race partner is similar to oneself on self-revealing, peripheral attributes leads to less anticipatory anxiety than the belief that one's partner is similar on peripheral, nonself-revealing attributes. In another dyadic context, we explore the range of benefits that perceptions of peripheral, self-revealing similarity can bring to different-race interaction partners and find (a) less anxiety during interaction, (b) greater interest in sustained contact with one's partner, and (c) stronger accuracy in perceptions of one's partners' relationship intentions. By contrast, participants in same-race interactions were largely unaffected by these manipulations of perceived similarity. Our final experiment shows that among small task groups composed of racially diverse individuals, those whose members perceive peripheral, self-revealing similarity perform superior to those who perceive dissimilarity. Implications for using this approach to improve interracial interactions across different goal-driven contexts are discussed.
Comparative mapping reveals similar linkage of functional genes to ...

Indian Academy of Sciences (India)

genes between O. sativa and B. napus may have consistent function and control similar traits, which may be ..... acea chromosomes reveals islands of conserved organization. ... 1998 Conserved structure and function of the Arabidopsis flow-.
Combined semantic and similarity search in medical image databases

Science.gov (United States)

Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin

2011-03-01

The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist deeply relies on reference literature or second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases combining ontology-based semantic and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.
An Efficient Similarity Digests Database Lookup - A Logarithmic Divide & Conquer Approach

Directory of Open Access Journals (Sweden)

Frank Breitinger

2014-09-01

Investigating seized devices within digital forensics represents a challenging task due to the increasing amount of data. Common procedures utilize automated file identification, which reduces the amount of data an investigator has to examine manually. In the past years the research field of approximate matching has arisen to detect similar data. However, if n denotes the number of similarity digests in a database, then the lookup for a single similarity digest is of complexity O(n). This paper presents a concept to extend existing approximate matching algorithms, which reduces the lookup complexity from O(n) to O(log(n)). Our proposed approach is based on the well-known divide and conquer paradigm and builds a Bloom filter-based tree data structure in order to enable an efficient lookup of similarity digests. Further, it is demonstrated that the presented technique is highly scalable operating a trade-off between storage requirements and computational efficiency. We perform a theoretical assessment based on recently published results and reasonable magnitudes of input data, and show that the complexity reduction achieved by the proposed technique yields a 220-fold acceleration of look-up costs.
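The divide-and-conquer idea of a tree of Bloom filters can be sketched as follows. This is a simplified illustration, not the authors' data structure: plain feature strings stand in for similarity digests, matching at the leaves is exact rather than approximate, and the class names, filter size and hash construction are all assumptions made for this sketch.

```python
import hashlib

class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit array."""
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

class Node:
    """Binary tree node whose filter covers every feature stored beneath it."""
    def __init__(self, items):
        # items: list of (file_id, feature) pairs
        self.filter = BloomFilter()
        for _, feature in items:
            self.filter.add(feature)
        if len(items) <= 1:
            self.items, self.left, self.right = items, None, None
        else:
            mid = len(items) // 2
            self.items = None
            self.left, self.right = Node(items[:mid]), Node(items[mid:])

    def lookup(self, feature):
        if feature not in self.filter:
            return []  # the whole subtree is skipped
        if self.items is not None:
            # Leaf: compare exactly, which also discards Bloom false positives.
            return [fid for fid, f in self.items if f == feature]
        return self.left.lookup(feature) + self.right.lookup(feature)
```

When false positives are rare, a lookup for a feature stored once follows roughly one root-to-leaf path, so the number of filter checks grows with the depth of the tree, about log2(n), rather than with the number of stored entries n.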
MM-MDS: a multidimensional scaling database with similarity ratings for 240 object categories from the Massive Memory picture database.

Directory of Open Access Journals (Sweden)

Michael C Hout

Cognitive theories in visual attention and perception, categorization, and memory often critically rely on concepts of similarity among objects, and empirically require measures of "sameness" among their stimuli. For instance, a researcher may require similarity estimates among multiple exemplars of a target category in visual search, or targets and lures in recognition memory. Quantifying similarity, however, is challenging when everyday items are the desired stimulus set, particularly when researchers require several different pictures from the same category. In this article, we document a new multidimensional scaling database with similarity ratings for 240 categories, each containing color photographs of 16-17 exemplar objects. We collected similarity ratings using the spatial arrangement method. Reports include: the multidimensional scaling solutions for each category, up to five dimensions, stress and fit measures, coordinate locations for each stimulus, and two new classifications. For each picture, we categorized the item's prototypicality, indexed by its proximity to other items in the space. We also classified pairs of images along a continuum of similarity, by assessing the overall arrangement of each MDS space. These similarity ratings will be useful to any researcher that wishes to control the similarity of experimental stimuli according to an objective quantification of "sameness."
MM-MDS: a multidimensional scaling database with similarity ratings for 240 object categories from the Massive Memory picture database.

Science.gov (United States)

Hout, Michael C; Goldinger, Stephen D; Brady, Kyle J

2014-01-01

Cognitive theories in visual attention and perception, categorization, and memory often critically rely on concepts of similarity among objects, and empirically require measures of "sameness" among their stimuli. For instance, a researcher may require similarity estimates among multiple exemplars of a target category in visual search, or targets and lures in recognition memory. Quantifying similarity, however, is challenging when everyday items are the desired stimulus set, particularly when researchers require several different pictures from the same category. In this article, we document a new multidimensional scaling database with similarity ratings for 240 categories, each containing color photographs of 16-17 exemplar objects. We collected similarity ratings using the spatial arrangement method. Reports include: the multidimensional scaling solutions for each category, up to five dimensions, stress and fit measures, coordinate locations for each stimulus, and two new classifications. For each picture, we categorized the item's prototypicality, indexed by its proximity to other items in the space. We also classified pairs of images along a continuum of similarity, by assessing the overall arrangement of each MDS space. These similarity ratings will be useful to any researcher that wishes to control the similarity of experimental stimuli according to an objective quantification of "sameness."

MetalS(3), a database-mining tool for the identification of structurally similar metal sites.

Science.gov (United States)

Valasatava, Yana; Rosato, Antonio; Cavallaro, Gabriele; Andreini, Claudia

2014-08-01

We have developed a database search tool to identify metal sites having structural similarity to a query metal site structure within the MetalPDB database of minimal functional sites (MFSs) contained in metal-binding biological macromolecules. MFSs describe the local environment around the metal(s) independently of the larger context of the macromolecular structure. Such a local environment has a determinant role in tuning the chemical reactivity of the metal, ultimately contributing to the functional properties of the whole system. The database search tool, which we called MetalS(3) (Metal Sites Similarity Search), can be accessed through a Web interface at http://metalweb.cerm.unifi.it/tools/metals3/ . MetalS(3) uses a suitably adapted version of an algorithm that we previously developed to systematically compare the structure of the query metal site with each MFS in MetalPDB. For each MFS, the best superposition is kept. All these superpositions are then ranked according to the MetalS(3) scoring function and are presented to the user in tabular form. The user can interact with the output Web page to visualize the structural alignment or the sequence alignment derived from it. Options to filter the results are available. Test calculations show that the MetalS(3) output correlates well with expectations from protein homology considerations. Furthermore, we describe some usage scenarios that highlight the usefulness of MetalS(3) to obtain mechanistic and functional hints regardless of homology.
SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

Energy Technology Data Exchange (ETDEWEB)

Ginzinger, Simon W. [Center of Applied Molecular Engineering, University of Salzburg, Department of Molecular Biology, Division of Bioinformatics (Austria)], E-mail: simon@came.sbg.ac.at; Coles, Murray [Max-Planck-Institute for Developmental Biology, Department of Protein Evolution (Germany)], E-mail: Murray.Coles@tuebingen.mpg.de

2009-03-15

We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.
SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

International Nuclear Information System (INIS)

Ginzinger, Simon W.; Coles, Murray

2009-01-01

We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods
Computer-aided beam arrangement based on similar cases in radiation treatment-planning databases for stereotactic lung radiation therapy

International Nuclear Information System (INIS)

Magome, Taiki; Shioyama, Yoshiyuki; Arimura, Hidetaka

2013-01-01

The purpose of this study was to develop a computer-aided method for determination of beam arrangements based on similar cases in a radiotherapy treatment-planning database for stereotactic lung radiation therapy. Similar-case-based beam arrangements were automatically determined based on the following two steps. First, the five most similar cases were searched, based on geometrical features related to the location, size and shape of the planning target volume, lung and spinal cord. Second, five beam arrangements of an objective case were automatically determined by registering five similar cases with the objective case, with respect to lung regions, by means of a linear registration technique. For evaluation of the beam arrangements five treatment plans were manually created by applying the beam arrangements determined in the second step to the objective case. The most usable beam arrangement was selected by sorting the five treatment plans based on eight plan evaluation indices, including the D95, mean lung dose and spinal cord maximum dose. We applied the proposed method to 10 test cases, by using an RTP database of 81 cases with lung cancer, and compared the eight plan evaluation indices between the original treatment plan and the corresponding most usable similar-case-based treatment plan. As a result, the proposed method may provide usable beam arrangements, which have no statistically significant differences from the original beam arrangements (P>0.05) in terms of the eight plan evaluation indices. Therefore, the proposed method could be employed as an educational tool for less experienced treatment planners. (author)
Efficient Similarity Retrieval in Music Databases

DEFF Research Database (Denmark)

Ruxanda, Maria Magdalena; Jensen, Christian Søndergaard

2006-01-01

Audio music is increasingly becoming available in digital form, and the digital music collections of individuals continue to grow. Addressing the need for effective means of retrieving music from such collections, this paper proposes new techniques for content-based similarity search. Each music...
Concept similarity in publications precedes cross-disciplinary collaboration.

Science.gov (United States)

Post, Andrew R; Harrison, James H

2008-11-06

Innovative science frequently occurs as a result of cross-disciplinary collaboration, the importance of which is reflected by recent NIH funding initiatives that promote communication and collaboration. If shared research interests between collaborators are important for the formation of collaborations, methods for identifying these shared interests across scientific domains could potentially reveal new and useful collaboration opportunities. MEDLINE represents a comprehensive database of collaborations and research interests, as reflected by article co-authors and concept content. We analyzed six years of citations using information retrieval based methods to compute articles' conceptual similarity, and found that articles by basic and clinical scientists who later collaborated had significantly higher average similarity than articles by similar scientists who did not collaborate. Refinement of these methods and characterization of found conceptual overlaps could allow automated discovery of collaboration opportunities that are currently missed.
NREL: U.S. Life Cycle Inventory Database - About the LCI Database Project

Science.gov (United States)

About the LCI Database Project The U.S. Life Cycle Inventory (LCI) Database is a publicly available database that allows users to objectively review and compare analysis results that are based on similar source of critically reviewed LCI data through its LCI Database Project. NREL's High-Performance
Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

Science.gov (United States)

Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

2003-01-01

Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p [...] [Accession number lists for Sarcocystis neurona, Eimeria tenella, and Neospora caninum not recoverable.] PMID:12618375
Genetic similarity of soybean genotypes revealed by seed protein

Directory of Open Access Journals (Sweden)

Nikolić Ana

2005-01-01

Full Text Available More accurate and complete descriptions of genotypes could help determine future breeding strategies and facilitate introgression of new genotypes into the current soybean gene pool. The objective of this study was to characterize 20 soybean genotypes from the Maize Research Institute "Zemun Polje" collection, which have good agronomic performance, high yield, lodging and drought resistance, and low shattering, using seed proteins as biochemical markers. Seed proteins were isolated and separated by PAA electrophoresis. On the basis of the presence/absence of protein fractions, coefficients of similarity between pairs of genotypes were calculated as the Dice and the Rogers and Tanimoto coefficients. The similarity matrix was submitted to hierarchical cluster analysis by the unweighted pair group method using arithmetic averages (UPGMA), and the necessary computations were performed using the NTSYS-pc program. Seed protein analysis confirmed a low level of genetic diversity in soybean. The highest genetic similarity was between genotypes P9272 and Kador. According to the results obtained, soybean genotypes were assigned to two larger groups, and both coefficients of similarity showed similar results. Because of the lack of pedigree data for the analyzed genotypes, correspondence with marker data could not be determined. In plants with a narrow genetic base in their gene pool, such as soybean, protein markers may not be sufficient for characterization and study of genetic diversity.
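The two coefficients named in this abstract can be illustrated with a minimal Python sketch. It assumes each genotype is scored as a list of 0/1 values (band absent/present); the two profiles below are invented for illustration and are not data from the study. With a = bands present in both, b and c = bands present in only one, and d = bands absent in both, Dice is 2a / (2a + b + c) and Rogers and Tanimoto is (a + d) / (a + d + 2(b + c)).

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length presence/absence profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must score the same protein fractions")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # present in both
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # only in x
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # only in y
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # absent in both
    return a, b, c, d


def dice(x, y):
    """Dice coefficient: ignores joint absences."""
    a, b, c, _ = match_counts(x, y)
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0


def rogers_tanimoto(x, y):
    """Rogers and Tanimoto coefficient: counts joint absences, double-weights mismatches."""
    a, b, c, d = match_counts(x, y)
    denom = a + d + 2 * (b + c)
    return (a + d) / denom if denom else 0.0


# Hypothetical band profiles (1 = fraction present, 0 = absent).
genotype_1 = [1, 1, 0, 1, 0, 1, 1, 0]
genotype_2 = [1, 0, 0, 1, 0, 1, 1, 1]

print(dice(genotype_1, genotype_2))             # 0.8
print(rogers_tanimoto(genotype_1, genotype_2))  # 0.6
```

Because Dice ignores shared absences while Rogers and Tanimoto rewards them, the two can rank pairs differently; the abstract reports that in this data set they gave similar groupings.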
Does filler database size influence identification accuracy?

Science.gov (United States)

Bergold, Amanda N; Heaton, Paul

2018-06-01

Police departments increasingly use large photo databases to select lineup fillers using facial recognition software, but this technological shift's implications have been largely unexplored in eyewitness research. Database use, particularly if coupled with facial matching software, could enable lineup constructors to increase filler-suspect similarity and thus enhance eyewitness accuracy (Fitzgerald, Oriet, Price, & Charman, 2013). However, with a large pool of potential fillers, such technologies might theoretically produce lineup fillers too similar to the suspect (Fitzgerald, Oriet, & Price, 2015; Luus & Wells, 1991; Wells, Rydell, & Seelau, 1993). This research proposes a new factor, filler database size, as a lineup feature affecting eyewitness accuracy. In a facial recognition experiment, we select lineup fillers in a legally realistic manner using facial matching software applied to filler databases of 5,000, 25,000, and 125,000 photos, and find that larger databases are associated with a higher objective similarity rating between suspects and fillers and lower overall identification accuracy. In target-present lineups, witnesses viewing lineups created from the larger databases were less likely to make correct identifications and more likely to select known innocent fillers. When the target was absent, database size was associated with a lower rate of correct rejections and a higher rate of filler identifications. Higher algorithmic similarity ratings were also associated with decreases in eyewitness identification accuracy. The results suggest that using facial matching software to select fillers from large photograph databases may reduce identification accuracy, and provide support for filler database size as a meaningful system variable. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Large revealing similarity in multihadron production in nuclear and particle collisions

International Nuclear Information System (INIS)

Mishra, Aditya Nath; Sahoo, Raghunath; Sarkisyan, Edward K.G.; Sakharov, Alexander S.

2016-01-01

The dependencies of charged particle pseudorapidity density and transverse energy pseudorapidity density at midrapidity, as well as of charged particle total multiplicity, on the collision energy and on the number of nucleon participants, or centrality, measured in nucleus-nucleus collisions are studied in the energy range spanning a few GeV to a few TeV per nucleon. The model in which the multiparticle production is driven by the dissipating effective energy of participants is considered. The model extends the earlier proposed approach, combining the constituent quark picture together with Landau relativistic hydrodynamics, shown to interrelate the measurements from different types of collisions. Within this model, the dependencies of the charged particle pseudorapidity density and transverse energy pseudorapidity density at midrapidity on the number of participants in heavy-ion collisions are found to be well described in terms of the effective energy, defined as a centrality-dependent fraction of the collision energy. For both variables the effective energy approach reveals a similarity in the energy dependence obtained for the most central collisions and centrality data in the entire available energy range.
Categorical database generalization in GIS

NARCIS (Netherlands)

Liu, Y.

2002-01-01

Key words: Categorical database, categorical database generalization, Formal data structure, constraints, transformation unit, classification hierarchy, aggregation hierarchy, semantic similarity, data model,

Pediatric burns: Kids' Inpatient Database vs the National Burn Repository.

Soleimani, Tahereh; Evans, Tyler A; Sood, Rajiv; Hartman, Brett C; Hadad, Ivan; Tholpady, Sunil S

2016-04-01

Burn injuries are one of the leading causes of morbidity and mortality in young children. The Kids' Inpatient Database (KID) and National Burn Repository (NBR) are two large national databases that can be used to evaluate outcomes and help quality improvement in burn care. Differences in the design of the KID and NBR could lead to differing results affecting resultant conclusions and quality improvement programs. This study was designed to validate the use of KID for burn epidemiologic studies, as an adjunct to the NBR. Using the KID (2003, 2006, and 2009), a total of 17,300 nonelective burn patients younger than 20 y old were identified. Data from 13,828 similar patients were collected from the NBR. Outcome variables were compared between the two databases. Comparisons revealed similar patient distribution by gender, race, and burn size. Inhalation injury was more common among the NBR patients and was associated with increased mortality. The rates of respiratory failure, wound infection, cellulitis, sepsis, and urinary tract infection were higher in the KID. Multiple regression analysis adjusting for potential confounders demonstrated similar mortality rate but significantly longer length of stay for patients in the NBR. Despite differences in the design and sampling of the KID and NBR, the overall demographic and mortality results are similar. The differences in complication rate and length of stay should be explored by further studies to clarify underlying causes. Investigations into these differences should also better inform strategies to improve burn prevention and treatment. Copyright © 2016 Elsevier Inc. All rights reserved.

Density-based similarity measures for content based search

Energy Technology Data Exchange (ETDEWEB)

Hush, Don R [Los Alamos National Laboratory; Porter, Reid B [Los Alamos National Laboratory; Ruggiero, Christy E [Los Alamos National Laboratory

2009-01-01

We consider the query by multiple example problem where the goal is to identify database samples whose content is similar to a collection of query samples. To assess the similarity we use a relative content density which quantifies the relative concentration of the query distribution to the database distribution. If the database distribution is a mixture of the query distribution and a background distribution then it can be shown that database samples whose relative content density is greater than a particular threshold ρ are more likely to have been generated by the query distribution than the background distribution. We describe an algorithm for predicting samples with relative content density greater than ρ that is computationally efficient and possesses strong performance guarantees. We also show empirical results for applications in computer network monitoring and image segmentation.

BLAST and FASTA similarity searching for multiple sequence alignment.

Science.gov (United States)

Pearson, William R

2014-01-01

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
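The point about smaller databases improving sensitivity follows from how expectation values scale. A minimal Python sketch, assuming the standard bit-score relation E = m · n · 2^(−S′), where m is the query length, n the total database length in residues, and S′ the bit score: the lengths and the score below are illustrative values, not figures from the article, and real BLAST and FASTA runs apply effective-length corrections that this sketch omits.

```python
def expect_value(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score.

    Uses E = m * n * 2**(-S'), ignoring the effective-length
    corrections that production search programs apply.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Illustrative values only: one alignment, two database sizes.
bit_score = 50.0
query_len = 300                    # residues in the query protein
large_db = 1_000_000_000           # residues in a very large protein database
small_db = 10_000_000              # residues in a single-organism protein set

e_large = expect_value(bit_score, query_len, large_db)
e_small = expect_value(bit_score, query_len, small_db)

print(f"E-value against large database: {e_large:.2e}")
print(f"E-value against small database: {e_small:.2e}")
print(f"Improvement factor: {e_large / e_small:.0f}x")
```

The same alignment score is 100 times more significant against the database that is 100 times smaller, which is why a borderline hit can become a confident one when the search space is restricted.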

The mitochondrial genomes of Amphiascoides atopus and Schizopera knabeni (Harpacticoida: Miraciidae) reveal similarities between the copepod orders Harpacticoida and Poecilostomatoida.

Science.gov (United States)

Easton, Erin E; Darrow, Emily M; Spears, Trisha; Thistle, David

2014-03-15

Members of subclass Copepoda are abundant, diverse, and, as a result of their variety of ecological roles in marine and freshwater environments, important, but their phylogenetic interrelationships are unclear. Recent studies of arthropods have used gene arrangements in the mitochondrial (mt) genome to infer phylogenies, but for copepods, only seven complete mt genomes have been published. These data revealed several within-order and few among-order similarities. To increase the data available for comparisons, we sequenced the complete mt genome (13,831 base pairs) of Amphiascoides atopus and 10,649 base pairs of the mt genome of Schizopera knabeni (both in the family Miraciidae of the order Harpacticoida). Comparison of our data to those for Tigriopus japonicus (family Harpacticidae, order Harpacticoida) revealed similarities in gene arrangement among these three species that were consistent with those found within and among families of other copepod orders. Comparison of the mt genomes of our species with those known from other copepod orders revealed the arrangement of mt genes of our Harpacticoida species to be more similar to that of Sinergasilus polycolpus (order Poecilostomatoida) than to that of T. japonicus. The similarities between S. polycolpus and our species are the first to be noted across the boundaries of copepod orders and support the possibility that mt-gene arrangement might be used to infer copepod phylogenies. We also found that our two species had extremely truncated transfer RNAs and that gene overlaps occurred much more frequently than has been reported for other copepod mt genomes. Published by Elsevier B.V.

The CAPEC Database

DEFF Research Database (Denmark)

Nielsen, Thomas Lund; Abildskov, Jens; Harper, Peter Mathias

2001-01-01

The Computer-Aided Process Engineering Center (CAPEC) database of measured data was established with the aim to promote greater data exchange in the chemical engineering community. The target properties are pure component properties, mixture properties, and special drug solubility data. The database divides pure component properties into primary, secondary, and functional properties. Mixture properties are categorized in terms of the number of components in the mixture and the number of phases present. The compounds in the database have been classified on the basis of the functional groups in the compound. This classification makes the CAPEC database a very useful tool, for example, in the development of new property models, since properties of chemically similar compounds are easily obtained. A program with efficient search and retrieval functions of properties has been developed.

Fast Structural Search in Phylogenetic Databases

Directory of Open Access Journals (Sweden)

William H. Piel

2005-01-01

Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree D in D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.

Nuclear materials thermo-physical property database and property analysis using the database

International Nuclear Information System (INIS)

Jeong, Yeong Seok

2002-02-01

Knowledge of the thermo-physical properties of nuclear materials is necessary for evaluating and analyzing the steady and accident states of commercial and research reactors. In this study, a thermo-physical property database for nuclear materials and an accompanying home page were developed. As an application of this database, the thermal conductivity, heat capacity, enthalpy, and linear thermal expansion of fuel and cladding materials were analyzed, and the thermo-physical property models used in nuclear fuel performance evaluation codes were compared with the experimental data in the database. A comparison of the thermo-physical property models for UO2 fuel and cladding in the major performance evaluation codes showed that the models are similar.

The Danish Nonmelanoma Skin Cancer Dermatology Database

DEFF Research Database (Denmark)

Lamberg, Anna Lei; Sølvsten, Henrik; Lei, Ulrikke

2016-01-01

AIM OF DATABASE: The Danish Nonmelanoma Skin Cancer Dermatology Database was established in 2008. The aim of this database was to collect data on nonmelanoma skin cancer (NMSC) treatment and improve its treatment in Denmark. NMSC is the most common malignancy in the western countries and represents...... treatment. The database has revealed that overall, the quality of care of NMSC in Danish dermatological clinics is high, and the database provides the necessary data for continuous quality assurance....

Negative Effects of Learning Spreadsheet Management on Learning Database Management

Science.gov (United States)

Vágner, Anikó; Zsakó, László

2015-01-01

A lot of students learn spreadsheet management before database management. Their similarities can cause a lot of negative effects when learning database management. In this article, we consider these similarities and explain what can cause problems. First, we analyse the basic concepts such as table, database, row, cell, reference, etc. Then, we…
SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

Science.gov (United States)

Couvin, David; Zozio, Thierry; Rastogi, Nalin

2017-07-01

Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98,049 clustered strains plus 5,807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
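A similarity search over 43-spacer presence/absence patterns can be sketched in a few lines of Python. The abstract does not state which similarity measure the tool uses, so this sketch assumes a plain Hamming distance (the number of spacers that differ); the pattern names and the `max_diff` parameter are hypothetical, and the toy patterns are not taken from SITVIT2.

```python
N_SPACERS = 43  # spoligotypes record presence (1) or absence (0) of 43 spacers


def hamming(pattern_a, pattern_b):
    """Number of spacer positions at which two spoligotype patterns differ."""
    if len(pattern_a) != N_SPACERS or len(pattern_b) != N_SPACERS:
        raise ValueError("each pattern must describe exactly 43 spacers")
    return sum(a != b for a, b in zip(pattern_a, pattern_b))


def similar_patterns(query, database, max_diff=2):
    """Return (distance, name) pairs within max_diff spacers, closest first."""
    hits = ((hamming(query, pattern), name) for name, pattern in database.items())
    return sorted(hit for hit in hits if hit[0] <= max_diff)


# Toy database with invented names; '1' = spacer present, '0' = spacer absent.
toy_database = {
    "toy_A": "1" * 43,                         # no deletions
    "toy_B": "1" * 33 + "0" * 3 + "1" * 7,     # three spacers deleted
    "toy_C": "1" * 20 + "0" * 15 + "1" * 8,    # large deleted block
}

query = "1" * 33 + "0" * 2 + "1" * 8           # two spacers deleted

print(similar_patterns(query, toy_database))
```

A pattern with a large deleted block, such as `toy_C` here, falls outside a small distance threshold even though it overlaps the query's deletion, which is one reason a tool may treat block deletions separately from single-spacer differences.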
Protein structure similarity from principle component correlation analysis

Directory of Open Access Journals (Sweden)

Chou James

2006-01-01

Full Text Available Abstract Background Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum
Marriage Matters: Spousal Similarity in Life Satisfaction

OpenAIRE

Ulrich Schimmack; Richard Lucas

2006-01-01

Examined the concurrent and cross-lagged spousal similarity in life satisfaction over a 21-year period. Analyses were based on married couples (N = 847) in the German Socio-Economic Panel (SOEP). Concurrent spousal similarity was considerably higher than one-year retest similarity, revealing spousal similarity in the variable component of life satisfaction. Spousal similarity systematically decreased with length of retest interval, revealing similarity in the changing component of life sati...
CANDID: Comparison algorithm for navigating digital image databases

Energy Technology Data Exchange (ETDEWEB)

Kelly, P.M.; Cannon, T.M.

1994-02-21

In this paper, we propose a method for calculating the similarity between two digital images. A global signature describing the texture, shape, or color content is first computed for every image stored in a database, and a normalized distance between probability density functions of feature vectors is used to match signatures. This method can be used to retrieve images from a database that are similar to an example target image. This algorithm is applied to the problem of search and retrieval for a database containing pulmonary CT imagery, and experimental results are provided.
Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

Science.gov (United States)

2002-01-01

to the OODBMS approach. The ORDBMS approach produced such research prototypes as Postgres [155], and Starburst [67] and commercial products such as...Kemnitz. The POSTGRES Next-Generation Database Management System. Communications of the ACM, 34(10):78–92, 1991. [156] Michael Stonebraker and Dorothy
A performance evaluation of in-memory databases

Directory of Open Access Journals (Sweden)

Abdullah Talha Kabakus

2017-10-01

Full Text Available The popularity of NoSQL databases has increased due to the need for (1) processing vast amounts of data faster than relational database management systems by taking advantage of a highly scalable architecture, (2) a flexible (schema-free) data structure, and (3) low latency and high performance. Although memory usage is not a major criterion for evaluating the performance of algorithms, these databases serve their data from memory, so their memory usage is also measured in the paper, alongside the time taken to complete each operation, to reveal which one uses memory most efficiently. Currently there exist over 225 NoSQL databases that provide different features and characteristics, so it is necessary to reveal which one provides better performance for different data operations. In this paper, we benchmark the widely used in-memory databases to measure their performance in terms of (1) the time taken to complete operations, and (2) how efficiently they use memory during operations. As per the results reported in this paper, there is no database that provides the best performance for all data operations. It is also shown that even though an RDBMS stores its data in memory, its overall performance is worse than that of NoSQL databases.
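The two measurements described here, time per operation and memory used during it, can be sketched with the Python standard library. This is a minimal illustration of the measurement approach, not the paper's benchmark: it assumes an in-memory SQLite table and a plain dict as stand-ins for a relational and a key-value store, and the `measure` helper, row count, and table layout are invented for the example.

```python
import sqlite3
import time
import tracemalloc


def measure(operation):
    """Run operation() once and return (elapsed seconds, peak traced bytes).

    tracemalloc only sees memory allocated through Python's allocator, so
    memory that SQLite allocates inside its C library is not counted.
    """
    tracemalloc.start()
    start = time.perf_counter()
    operation()
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


N_ROWS = 10_000  # illustrative workload size
rows = [(i, f"value-{i}") for i in range(N_ROWS)]


def write_dict():
    """Key-value stand-in: insert every row into a dict."""
    store = {}
    for key, value in rows:
        store[key] = value
    return store


def write_sqlite():
    """Relational stand-in: insert every row into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
    conn.executemany("INSERT INTO kv VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


results = {
    name: measure(operation)
    for name, operation in [("dict", write_dict), ("sqlite", write_sqlite)]
}

for name, (seconds, peak_bytes) in results.items():
    print(f"{name:>6}: {seconds * 1000:.2f} ms, peak {peak_bytes / 1024:.0f} KiB traced")
```

A fair comparison would repeat each operation many times, cover reads, updates, and deletes as well as writes, and measure process-level memory rather than only Python-side allocations.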
The Matti Kuusi International Database of Proverbs

Directory of Open Access Journals (Sweden)

Outi Lauhakangas

2013-10-01

Full Text Available Based on Matti Kuusi’s library of proverb collections, the Matti Kuusi International Database of Proverbs is designed for cultural researchers, translators, and proverb enthusiasts. Kuusi compiled a card-index with tens of thousands of references to synonymic proverbs. The database is a tool for studying global similarities and national specificities of proverbs. This essay offers a practical introduction to the database, its classification system, and describes future plans for improvement of the database.
Protein profiling reveals inter-individual protein homogeneity of arachnoid cyst fluid and high qualitative similarity to cerebrospinal fluid

Directory of Open Access Journals (Sweden)

Berle Magnus

2011-05-01

Full Text Available Abstract Background The mechanisms behind formation and filling of intracranial arachnoid cysts (AC) are poorly understood. The aim of this study was to evaluate AC fluid by proteomics to gain further knowledge about ACs. Two goals were set: 1) Comparison of AC fluid from individual patients to determine whether or not temporal AC is a homogenous condition; and 2) Evaluation of the protein content of a pool of AC fluid from several patients and qualitative comparison of this with published protein lists of cerebrospinal fluid (CSF) and plasma. Methods AC fluid from 15 patients with temporal AC was included in this study. In the AC protein comparison experiment, AC fluid from 14 patients was digested, analyzed by LC-MS/MS using a semi-quantitative label-free approach and the data were compared by principal component analysis (PCA) to gain knowledge of protein homogeneity of AC. In the AC proteome evaluation experiment, AC fluid from 11 patients was pooled, digested, and fractionated by SCX chromatography prior to analysis by LC-MS/MS. Proteins identified were compared to published databases of proteins identified from CSF and plasma. AC fluid proteins not found in these two databases were experimentally searched for in lumbar CSF taken from neurologically-normal patients, by a targeted protein identification approach called MIDAS (Multiple Reaction Monitoring (MRM) initiated detection and sequence analysis). Results We did not identify systematic trends or grouping of data in the AC protein comparison experiment, implying low variability between individual proteomic profiles of AC. In the AC proteome evaluation experiment, we identified 199 proteins. When compared to previously published lists of proteins identified from CSF and plasma, 15 of the AC proteins had not been reported in either of these datasets. By a targeted protein identification approach, we identified 11 of these 15 proteins in pooled CSF from neurologically-normal patients, demonstrating that
The impact of bereaved parents' perceived grief similarity on relationship satisfaction.

Science.gov (United States)

Buyukcan-Tetik, Asuman; Finkenauer, Catrin; Schut, Henk; Stroebe, Margaret; Stroebe, Wolfgang

2017-06-01

The present research focused on bereaved parents' perceived grief similarity, and aimed to investigate the concurrent and longitudinal effects of the perceptions that the partner has less, equal, or more grief intensity than oneself on relationship satisfaction. Participants of our longitudinal study were 229 heterosexual bereaved Dutch couples who completed questionnaires 6, 13, and 20 months after the loss of their child. Average age of participants was 40.7 (SD = 9.5). Across 3 study waves, participants' perceived grief similarity and relationship satisfaction were assessed. To control for their effects, own grief level, child's gender, expectedness of loss, parent's age, parent's gender, and time were also included in the analyses. Consistent with the hypotheses, cross-sectional results revealed that bereaved parents who perceived dissimilar levels of grief (less or more grief) had lower relationship satisfaction than bereaved parents who perceived similar levels of grief. This effect remained significant controlling for the effects of possible confounding variables and actual similarity in grief between partners. We also found that perceived grief similarity at the first study wave was related to the highest level of relationship satisfaction at the second study wave. Moreover, results showed that perceived grief similarity was associated with a higher level in partner's relationship satisfaction. Results are discussed considering the comparison and similarity in grief across bereaved partners after child loss. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Calculating the knowledge-based similarity of functional groups using crystallographic data

Science.gov (United States)

Watson, Paul; Willett, Peter; Gillet, Valerie J.; Verdonk, Marcel L.

2001-09-01

A knowledge-based method for calculating the similarity of functional groups is described and validated. The method is based on experimental information derived from small molecule crystal structures. These data are used in the form of scatterplots that show the likelihood of a non-bonded interaction being formed between functional group A (the 'central group') and functional group B (the 'contact group' or 'probe'). The scatterplots are converted into three-dimensional maps that show the propensity of the probe at different positions around the central group. Here we describe how to calculate the similarity of a pair of central groups based on these maps. The similarity method is validated using bioisosteric functional group pairs identified in the Bioster database and Relibase. The Bioster database is a critical compilation of thousands of bioisosteric molecule pairs, including drugs, enzyme inhibitors and agrochemicals. Relibase is an object-oriented database containing structural data about protein-ligand interactions. The distributions of the similarities of the bioisosteric functional group pairs are compared with similarities for all the possible pairs in IsoStar, and are found to be significantly different. Enrichment factors are also calculated showing the similarity method is statistically significantly better than random in predicting bioisosteric functional group pairs.
Predicting the performance of fingerprint similarity searching.

Science.gov (United States)

Vogt, Martin; Bajorath, Jürgen

2011-01-01

Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
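The idea of comparing feature distributions with the Kullback-Leibler divergence can be sketched in Python. The abstract does not give the exact formulation, so this sketch makes simplifying assumptions: each fingerprint bit is treated as an independent Bernoulli variable, bit frequencies are smoothed with a pseudocount to avoid log(0), and the divergence is summed over bits. The tiny 4-bit fingerprints are invented for illustration.

```python
import math


def bit_frequencies(fingerprints, pseudocount=1.0):
    """Smoothed probability that each bit is set across a set of fingerprints."""
    n = len(fingerprints)
    n_bits = len(fingerprints[0])
    return [
        (sum(fp[i] for fp in fingerprints) + pseudocount) / (n + 2 * pseudocount)
        for i in range(n_bits)
    ]


def kl_divergence(p, q):
    """Sum of per-bit Bernoulli KL divergences D(p || q), in nats.

    Assumes bits are independent, which real fingerprints only approximate.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        total += p_i * math.log(p_i / q_i)
        total += (1.0 - p_i) * math.log((1.0 - p_i) / (1.0 - q_i))
    return total


# Invented 4-bit fingerprints: a screening database and two reference sets.
screening_db = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
distinct_actives = [[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 0, 0]]
background_like_actives = [[1, 0, 1, 0], [0, 1, 0, 1]]

db_freq = bit_frequencies(screening_db)
kl_distinct = kl_divergence(bit_frequencies(distinct_actives), db_freq)
kl_background = kl_divergence(bit_frequencies(background_like_actives), db_freq)

print(f"distinct reference set:        {kl_distinct:.3f} nats")
print(f"background-like reference set: {kl_background:.3f} nats")
```

A reference set whose bit frequencies match the database yields a divergence near zero, the situation the abstract describes as actives disappearing in the background; a larger divergence suggests the fingerprint can separate that compound class from the database.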
In silico identification of anti-cancer compounds and plants from traditional Chinese medicine database

Science.gov (United States)

Dai, Shao-Xing; Li, Wen-Xing; Han, Fei-Fei; Guo, Yi-Cheng; Zheng, Jun-Juan; Liu, Jia-Qian; Wang, Qian; Gao, Yue-Dong; Li, Gong-Hua; Huang, Jing-Fei

2016-05-01

There is a constant demand to develop new, effective, and affordable anti-cancer drugs. Traditional Chinese medicine (TCM) is a valuable alternative resource for identifying novel anti-cancer agents. In this study, we aim to identify anti-cancer compounds and plants from the TCM database by using cheminformatics. We first predicted 5278 anti-cancer compounds from the TCM database. The top 346 compounds were highly potent in the 60 cell lines test. Similarity analysis revealed that 75% of the 5278 compounds are highly similar to approved anti-cancer drugs. Based on the predicted anti-cancer compounds, we identified 57 anti-cancer plants by activity enrichment. The identified plants are widely distributed in 46 genera and 28 families, which broadens the scope of anti-cancer drug screening. Finally, we constructed a network of predicted anti-cancer plants and approved drugs based on the above results. The network highlighted the supportive role of the predicted plants in the development of anti-cancer drugs and suggested different molecular anti-cancer mechanisms of the plants. Our study suggests that the predicted compounds and plants from the TCM database offer an attractive starting point and a broader scope to mine for potential anti-cancer agents.
Prediction of Associations between OMIM Diseases and MicroRNAs by Random Walk on OMIM Disease Similarity Network

Directory of Open Access Journals (Sweden)

Hailin Chen

2013-01-01

Full Text Available Increasing evidence has revealed that microRNAs (miRNAs) play important roles in the development and progression of human diseases. However, efforts made to uncover OMIM disease-miRNA associations are lacking and the majority of diseases in the OMIM database are not associated with any miRNA. Therefore, there is a strong incentive to develop computational methods to detect potential OMIM disease-miRNA associations. In this paper, a random walk on the OMIM disease similarity network is applied to predict potential OMIM disease-miRNA associations under the assumption that functionally related miRNAs are often associated with phenotypically similar diseases. Our method makes full use of global disease similarity values. We tested our method on 1226 known OMIM disease-miRNA associations in the framework of leave-one-out cross-validation and achieved an area under the ROC curve of 71.42%. Excellent performance enables us to predict a number of new potential OMIM disease-miRNA associations, and the newly predicted associations are publicly released to facilitate future studies. Some predicted associations with high ranks were manually checked and confirmed from publicly available databases, which is strong evidence for the practical relevance of our method.
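The idea of a random walk on a similarity network can be sketched with the common random-walk-with-restart formulation. The abstract does not give the exact update rule, so the restart probability, the column normalization, and the toy similarity matrix below are all assumptions made for illustration.

```python
def random_walk_with_restart(sim, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seeds.

    sim   -- symmetric similarity matrix (list of lists)
    seeds -- set of node indices, e.g. diseases already linked to one miRNA
    """
    n = len(sim)
    col_sums = [sum(sim[i][j] for i in range(n)) for j in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into node i from all neighbours j.
            walk = sum(
                sim[i][j] / col_sums[j] * p[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            nxt.append((1.0 - restart) * walk + restart * p0[i])
        converged = sum(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p

# Hypothetical similarity network of four diseases; disease 0 is the seed.
sim = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]
scores = random_walk_with_restart(sim, seeds={0})
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking)
```

Non-seed diseases with the highest scores would be the top candidates for a new association with the miRNA in question; because the scores depend on the whole network, they reflect global rather than only pairwise similarity.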
Neighborhood Structural Similarity Mapping for the Classification of Masses in Mammograms.

Science.gov (United States)

Rabidas, Rinku; Midya, Abhishek; Chakraborty, Jayasree

2018-05-01

In this paper, two novel feature extraction methods, using neighborhood structural similarity (NSS), are proposed for the characterization of mammographic masses as benign or malignant. Since gray-level distribution of pixels is different in benign and malignant masses, more regular and homogeneous patterns are visible in benign masses compared to malignant masses; the proposed method exploits the similarity between neighboring regions of masses by designing two new features, namely, NSS-I and NSS-II, which capture global similarity at different scales. Complementary to these global features, uniform local binary patterns are computed to enhance the classification efficiency by combining with the proposed features. The performance of the features is evaluated using images from the mini-mammographic image analysis society (mini-MIAS) and digital database for screening mammography (DDSM) databases, where a tenfold cross-validation technique is incorporated with Fisher linear discriminant analysis, after selecting the optimal set of features using a stepwise logistic regression method. The best area under the receiver operating characteristic curve of 0.98 with an accuracy of is achieved with the mini-MIAS database, while the same for the DDSM database is 0.93 with accuracy .
Construction of crystal structure prototype database: methods and applications.

Science.gov (United States)

Su, Chuanxun; Lv, Jian; Li, Quan; Wang, Hui; Zhang, Lijun; Wang, Yanchao; Ma, Yanming

2017-04-26

Crystal structure prototype data have become a useful source of information for materials discovery in the fields of crystallography, chemistry, physics, and materials science. This work reports the development of a robust and efficient method for assessing the similarity of structures on the basis of their interatomic distances. Using this method, we proposed a simple and unambiguous definition of crystal structure prototype based on hierarchical clustering theory, and constructed the crystal structure prototype database (CSPD) by filtering the known crystallographic structures in a database. With similar method, a program structure prototype analysis package (SPAP) was developed to remove similar structures in CALYPSO prediction results and extract predicted low energy structures for a separate theoretical structure database. A series of statistics describing the distribution of crystal structure prototypes in the CSPD was compiled to provide an important insight for structure prediction and high-throughput calculations. Illustrative examples of the application of the proposed database are given, including the generation of initial structures for structure prediction and determination of the prototype structure in databases. These examples demonstrate the CSPD to be a generally applicable and useful tool for materials discovery.
Construction of crystal structure prototype database: methods and applications

International Nuclear Information System (INIS)

Su, Chuanxun; Lv, Jian; Wang, Hui; Wang, Yanchao; Ma, Yanming; Li, Quan; Zhang, Lijun

2017-01-01

Crystal structure prototype data have become a useful source of information for materials discovery in the fields of crystallography, chemistry, physics, and materials science. This work reports the development of a robust and efficient method for assessing the similarity of structures on the basis of their interatomic distances. Using this method, we proposed a simple and unambiguous definition of crystal structure prototype based on hierarchical clustering theory, and constructed the crystal structure prototype database (CSPD) by filtering the known crystallographic structures in a database. With similar method, a program structure prototype analysis package (SPAP) was developed to remove similar structures in CALYPSO prediction results and extract predicted low energy structures for a separate theoretical structure database. A series of statistics describing the distribution of crystal structure prototypes in the CSPD was compiled to provide an important insight for structure prediction and high-throughput calculations. Illustrative examples of the application of the proposed database are given, including the generation of initial structures for structure prediction and determination of the prototype structure in databases. These examples demonstrate the CSPD to be a generally applicable and useful tool for materials discovery. (paper)
Query-dependent banding (QDB) for faster RNA similarity searches.

Directory of Open Access Journals (Sweden)

Eric P Nawrocki

2007-03-01

Full Text Available When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average case time complexity of CM alignment from LN^2.4 to LN^1.3 for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.
Similarity Measure of Graphs

Directory of Open Access Journals (Sweden)

Amine Labriji

2017-07-01

Full Text Available Identifying the similarity of graphs is considered an important research topic in the Semantic Web, artificial intelligence, shape recognition and information retrieval. One of the fundamental problems of graph databases is finding graphs similar to a query graph. Existing approaches to this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not identified as contributing to the similarity of two graphs that share no concepts when the measure is based on the union of the two graphs, on the notion of maximum common sub-graph (SCM), or on graph edit distance. This is inadequate in the context of information retrieval. To overcome this problem, we suggest a new measure of similarity between graphs, based on the similarity measure of Wu and Palmer. We show that this new measure satisfies the properties of a similarity measure and apply it to examples. The results show that our measure reduces run time compared with existing approaches. In addition, a comparison of the relevance of the similarity values obtained suggests that this new graph measure is advantageous and contributes to solving the problem mentioned above.
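For orientation, the concept-level Wu and Palmer measure that this abstract builds on can be sketched as below. The sketch covers only the similarity of two concepts in a taxonomy, 2 * depth(LCS) / (depth(a) + depth(b)), with the root at depth 1; how the paper lifts this to whole graphs is not stated in the abstract and is not attempted here. The taxonomy and function names are hypothetical.

```python
# Hypothetical taxonomy given as child -> parent links; "entity" is the root.
PARENT = {
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

def path_to_root(node, parent):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def depth(node, parent):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node, parent))

def wu_palmer(a, b, parent):
    """Wu-Palmer similarity of two concepts in a tree-shaped taxonomy."""
    ancestors_b = set(path_to_root(b, parent))
    # First ancestor of a that is also an ancestor of b: the least common subsumer.
    lcs = next(n for n in path_to_root(a, parent) if n in ancestors_b)
    return 2.0 * depth(lcs, parent) / (depth(a, parent) + depth(b, parent))

print(wu_palmer("dog", "cat", PARENT))  # share the parent "animal"
print(wu_palmer("dog", "car", PARENT))  # share only the root
```

Two concepts with no node in common still receive a non-zero score through their shared ancestor, which is the kind of "parental semantic link" the abstract says node-and-arc measures ignore.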
Smokers and non smokers with rheumatoid arthritis have similar clinical status: data from the multinational QUEST-RA database.

Science.gov (United States)

Naranjo, A; Toloza, S; Guimaraes da Silveira, I; Lazovskis, J; Hetland, M L; Hamoud, H; Peets, T; Mäkinen, H; Gossec, L; Herborn, G; Skopouli, F N; Rojkovich, B; Aggarwal, A; Minnock, P; Cazzato, M; Yamanaka, H; Oyoo, O; Rexhepi, S; Andersone, D; Baranauskaite, A; Hajjaj-Hassouni, N; Jacobs, J W G; Haugeberg, G; Sierakowski, S; Ionescu, R; Karateew, D; Dimic, A; Henrohn, D; Gogus, F; Badsha, H; Choy, E; Bergman, M; Sokka, T

2010-01-01

To analyse clinical severity/activity of rheumatoid arthritis (RA) according to smoking status. The QUEST-RA multinational database reviews patients for Core Data Set measures including 28 swollen and tender joint count, physician global estimate, erythrocyte sedimentation rate (ESR), HAQ-function, pain, and patient global estimate; DAS28, rheumatoid factor (RF), nodules, erosions and number of DMARDs were also recorded. Smoking status was assessed by self-report as 'never smoked', 'currently smoking' and 'former smokers'. Patient groups with different smoking status were compared for demographic and RA measures. Among the 7,307 patients with smoking data available, status as 'never smoked,' 'current smoker' and 'former smoker' was reported by 65%, 15% and 20%, respectively. Ever smokers were more likely to be RF-positive (OR 1.32; 1.17-1.48, p<0.001). Rheumatoid nodules were more frequent in ever smokers (OR 1.41; 1.24-1.59, p<0.001). The percentage of patients with erosive arthritis and extra-articular disease was similar in all smoking categories. Mean DAS28 was 4.4 (SD 1.6) in non-smokers vs. 4.0 (SD 1.6) in those who had ever smoked. However, when adjusted by age, sex, disease duration, and country gross domestic product, only ESR remained significantly different among Core Data Set measures (mean 31.7 mm in non-smokers vs. 26.8 mm in the ever smoked category). RA patients who had ever smoked were more likely to have RF and nodules, but values for other clinical status measures were similar in all smoking categories (never smoked, current smokers and former smokers).

Protein structural similarity search by Ramachandran codes

Directory of Open Access Journals (Sweden)

Chang Chih-Hung

2007-08-01

Full Text Available Abstract Background Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools are expected to be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
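The linear-encoding step can be illustrated with a small sketch that maps each residue's backbone torsion angles (phi, psi) to the letter of the nearest region of the Ramachandran map. The three region centres and their letters below are illustrative assumptions (approximate textbook positions of right-handed helix, beta strand, and left-handed helix), not the SARST code table, which the abstract does not list.

```python
# Hypothetical cluster centres on the Ramachandran map, in degrees.
CENTERS = {
    "H": (-60.0, -45.0),   # right-handed alpha-helix region
    "E": (-120.0, 130.0),  # beta-strand region
    "L": (60.0, 45.0),     # left-handed helix region
}

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def encode(torsions):
    """Turn a list of (phi, psi) pairs into a one-dimensional text string."""
    letters = []
    for phi, psi in torsions:
        nearest = min(
            CENTERS,
            key=lambda k: angular_diff(phi, CENTERS[k][0]) ** 2
            + angular_diff(psi, CENTERS[k][1]) ** 2,
        )
        letters.append(nearest)
    return "".join(letters)

# Toy backbone: two helical residues, two strand residues, one left-handed turn.
backbone = [(-57, -47), (-63, -41), (-119, 113), (-139, 135), (57, 47)]
print(encode(backbone))
```

Once structures are strings, any sequence search tool could in principle be applied to them, provided a substitution matrix suited to the structural alphabet is available.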
A database of new zeolite-like materials.

Science.gov (United States)

Pophale, Ramdas; Cheeseman, Phillip A; Deem, Michael W

2011-07-21

We here describe a database of computationally predicted zeolite-like materials. These crystals were discovered by a Monte Carlo search for zeolite-like materials. Positions of Si atoms as well as unit cell, space group, density, and number of crystallographically unique atoms were explored in the construction of this database. The database contains over 2.6 M unique structures. Roughly 15% of these are within +30 kJ mol(-1) Si of α-quartz, the band in which most of the known zeolites lie. These structures have topological, geometrical, and diffraction characteristics that are similar to those of known zeolites. The database is the result of refinement by two interatomic potentials that both satisfy the Pauli exclusion principle. The database has been deposited in the publicly available PCOD database and in www.hypotheticalzeolites.net/database/deem/.
Discovering new information in bibliographic databases

Directory of Open Access Journals (Sweden)

Emil Hudomalj

2005-01-01

Full Text Available Databases contain information that usually cannot be revealed by standard query systems. For that purpose, methods for knowledge discovery from databases can be applied, which enable the user to browse aggregated data, discover trends, produce online reports, explore possible new associations within the data, etc. Such methods are successfully employed in various fields, such as banking, insurance and telecommunications, while they are seldom used in libraries. The article reviews the development of query systems for bibliographic databases, including some early attempts to apply modern knowledge discovery methods. Analytical databases are described in more detail, since they usually serve as the basis for knowledge discovery. Data mining approaches are presented, since they are a central step in the knowledge discovery process. The key role that librarians can play in developing systems for finding new information in existing bibliographic databases is stressed.
Next-generation sequencing can reveal in vitro-generated PCR crossover products: some artifactual sequences correspond to HLA alleles in the IMGT/HLA database.

Science.gov (United States)

Holcomb, C L; Rastrou, M; Williams, T C; Goodridge, D; Lazaro, A M; Tilanus, M; Erlich, H A

2014-01-01

The high-resolution human leukocyte antigen (HLA) genotyping assay that we developed using 454 sequencing and Conexio software uses generic polymerase chain reaction (PCR) primers for DRB exon 2. Occasionally, we observed low abundance DRB amplicon sequences that resulted from in vitro PCR 'crossing over' between DRB1 and DRB3/4/5. These hybrid sequences, revealed by the clonal sequencing property of the 454 system, were generally observed at a read depth of 5%-10% of the true alleles. They usually contained at least one mismatch with the IMGT/HLA database, and consequently, were easily recognizable and did not cause a problem for HLA genotyping. Sometimes, however, these artifactual sequences matched a rare allele and the automatic genotype assignment was incorrect. These observations raised two issues: (1) could PCR conditions be modified to reduce such artifacts? and (2) could some of the rare alleles listed in the IMGT/HLA database be artifacts rather than true alleles? Because PCR crossing over occurs during late cycles of PCR, we compared DRB genotypes resulting from 28 and (our standard) 35 cycles of PCR. For all 21 cell line DNAs amplified for 35 cycles, crossover products were detected. In 33% of the cases, these hybrid sequences corresponded to named alleles. With amplification for only 28 cycles, these artifactual sequences were not detectable. To investigate whether some rare alleles in the IMGT/HLA database might be due to PCR artifacts, we analyzed four samples obtained from the investigators who submitted the sequences. In three cases, the sequences were generated from true alleles. In one case, our 454 sequencing revealed an error in the previously submitted sequence.
Investigation of psychophysical similarity measures for selection of similar images in the diagnosis of clustered microcalcifications on mammograms

International Nuclear Information System (INIS)

Muramatsu, Chisako; Li Qiang; Schmidt, Robert; Shiraishi, Junji; Doi, Kunio

2008-01-01

The presentation of images with lesions of known pathology that are similar to an unknown lesion may be helpful to radiologists in the diagnosis of challenging cases for improving the diagnostic accuracy and also for reducing variation among different radiologists. The authors have been developing a computerized scheme for automatically selecting similar images with clustered microcalcifications on mammograms from a large database. For similar images to be useful, they must be similar from the point of view of the diagnosing radiologists. In order to select such images, subjective similarity ratings were obtained for a number of pairs of clustered microcalcifications by breast radiologists for establishment of a ''gold standard'' of image similarity, and the gold standard was employed for determination and evaluation of the selection of similar images. The images used in this study were obtained from the Digital Database for Screening Mammography developed by the University of South Florida. The subjective similarity ratings for 300 pairs of images with clustered microcalcifications were determined by ten breast radiologists. The authors determined a number of image features which represent the characteristics of clustered microcalcifications that radiologists would use in their diagnosis. For determination of objective similarity measures, an artificial neural network (ANN) was employed. The ANN was trained with the average subjective similarity ratings as teacher and selected image features as input data. The ANN was trained to learn the relationship between the image features and the radiologists' similarity ratings; therefore, once the training was completed, the ANN was able to determine the similarity, called a psychophysical similarity measure, which was expected to be close to radiologists' impressions, for an unknown pair of clustered microcalcifications. By use of a leave-one-out test method, the best combination of features was selected. The correlation
Investigation of psychophysical similarity measures for selection of similar images in the diagnosis of clustered microcalcifications on mammograms

Energy Technology Data Exchange (ETDEWEB)

Muramatsu, Chisako; Li Qiang; Schmidt, Robert; Shiraishi, Junji; Doi, Kunio [Department of Radiology, University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637 (United States) and Department of Intelligent Image Information, Gifu University, 1-1 Yanagido, Gifu (Japan); Department of Radiology, Duke Advanced Imaging Labs, Duke University, 2424 Erwin Road, Suite 302, Durham, North Carolina 27705 (United States); Department of Radiology, University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637 (United States)

2008-12-15

The presentation of images with lesions of known pathology that are similar to an unknown lesion may be helpful to radiologists in the diagnosis of challenging cases for improving the diagnostic accuracy and also for reducing variation among different radiologists. The authors have been developing a computerized scheme for automatically selecting similar images with clustered microcalcifications on mammograms from a large database. For similar images to be useful, they must be similar from the point of view of the diagnosing radiologists. In order to select such images, subjective similarity ratings were obtained for a number of pairs of clustered microcalcifications by breast radiologists for establishment of a ''gold standard'' of image similarity, and the gold standard was employed for determination and evaluation of the selection of similar images. The images used in this study were obtained from the Digital Database for Screening Mammography developed by the University of South Florida. The subjective similarity ratings for 300 pairs of images with clustered microcalcifications were determined by ten breast radiologists. The authors determined a number of image features which represent the characteristics of clustered microcalcifications that radiologists would use in their diagnosis. For determination of objective similarity measures, an artificial neural network (ANN) was employed. The ANN was trained with the average subjective similarity ratings as teacher and selected image features as input data. The ANN was trained to learn the relationship between the image features and the radiologists' similarity ratings; therefore, once the training was completed, the ANN was able to determine the similarity, called a psychophysical similarity measure, which was expected to be close to radiologists' impressions, for an unknown pair of clustered microcalcifications. By use of a leave-one-out test method, the best combination of features
Disruption Warning Database Development and Exploratory Machine Learning Studies on Alcator C-Mod

Science.gov (United States)

Montes, Kevin; Rea, Cristina; Granetz, Robert

2017-10-01

A database of about 1800 shots from the 2015 campaign on the Alcator C-Mod tokamak is assembled, including disruptive and non-disruptive discharges. The database consists of 40 relevant plasma parameters with data taken from 160k time slices. In order to investigate the possibility of developing a robust disruption prediction algorithm that is tokamak-independent, we focused machine learning studies on a subset of dimensionless parameters such as βp, n/nG, etc. The Random Forests machine learning algorithm provides insight on the available data set by ranking the relative importance of the input features. Its application on the C-Mod database, however, reveals that virtually no one parameter has more importance than any other, and that its classification algorithm has a low rate of successfully predicted samples, as well as poor false positive and false negative rates. Comparing the analysis of this algorithm on the C-Mod database with its application to a similar database on DIII-D, we conclude that disruption prediction may not be feasible on C-Mod. This conclusion is supported by empirical observations that most C-Mod disruptions are caused by radiative collapse due to molybdenum from the first wall, which happens on just a 1-2 ms timescale. Supported by the US Dept. of Energy under DE-FC02-99ER54512 and DE-FC02-04ER54698.
Efficient data retrieval method for similar plasma waveforms in EAST

Energy Technology Data Exchange (ETDEWEB)

Liu, Ying, E-mail: liuying-ipp@szu.edu.cn [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Huang, Jianjun; Zhou, Huasheng; Wang, Fan [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Wang, Feng [Institute of Plasma Physics Chinese Academy of Sciences, Hefei 230031 (China)

2016-11-15

Highlights: • The proposed method is carried out by means of a bounding envelope and angle distance. • It allows retrieval of whole similar waveforms of any time length. • The proposed method can also retrieve subsequences. - Abstract: Fusion research relies heavily on data analysis because of its massive databases. In the present work, we propose an efficient method for searching and retrieving similar plasma waveforms in the Experimental Advanced Superconducting Tokamak (EAST). Based on Piecewise Linear Aggregate Approximation (PLAA) for extracting feature values, the searching process is accomplished in two steps. The first is coarse searching to narrow down the search space, which is carried out by means of a bounding envelope. The second is fine searching to retrieve similar waveforms, which is implemented using the angle distance. The proposed method was tested on EAST databases and showed good performance in retrieving similar waveforms.
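The two-step search described above can be sketched roughly as follows. Several simplifications are assumed: plain piecewise aggregate approximation (segment means) stands in for the paper's PLAA features, the bounding envelope is a fixed margin around the query features, and the shot names, margin, and data are hypothetical.

```python
import math

def paa(signal, n_segments):
    """Segment means; assumes len(signal) is a multiple of n_segments."""
    seg = len(signal) // n_segments
    return [sum(signal[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]

def inside_envelope(features, query_features, margin):
    """Coarse step: keep waveforms whose features stay within the envelope."""
    return all(abs(f - q) <= margin for f, q in zip(features, query_features))

def angle_distance(u, v):
    """Fine step: angle (radians) between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return math.pi / 2.0
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

def search(query, database, n_segments=4, margin=0.5, top_k=1):
    """Return the names of the top_k waveforms most similar to the query."""
    q = paa(query, n_segments)
    features = {name: paa(w, n_segments) for name, w in database.items()}
    survivors = [n for n, f in features.items() if inside_envelope(f, q, margin)]
    survivors.sort(key=lambda n: angle_distance(features[n], q))
    return survivors[:top_k]

# Hypothetical waveforms.
query = [0, 0, 1, 1, 2, 2, 3, 3]
database = {
    "shot_a": [0, 0.2, 1, 1.2, 2, 2.2, 3, 3.2],   # nearly the same ramp
    "shot_b": [3, 3, 2, 2, 1, 1, 0, 0],           # reversed ramp, rejected early
    "shot_c": [0.4, 0.4, 1, 1, 2, 2, 2.6, 2.6],   # flatter ramp
}
print(search(query, database, top_k=2))
```

The coarse filter is cheap and discards most of the database, so the more expensive angle computation runs only on the few waveforms that survive it.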
Textual and chemical information processing: different domains but similar algorithms

Directory of Open Access Journals (Sweden)

Peter Willett

2000-01-01

Full Text Available This paper discusses the extent to which algorithms developed for the processing of textual databases are also applicable to the processing of chemical structure databases, and vice versa. Applications discussed include: an algorithm for distribution sorting that has been applied to the design of screening systems for rapid chemical substructure searching; the use of measures of inter-molecular structural similarity for the analysis of hypertext graphs; a genetic algorithm for calculating term weights for relevance feedback searching for determining whether a molecule is likely to exhibit biological activity; and the use of data fusion to combine the results of different chemical similarity searches.
Experience with CANDID: Comparison algorithm for navigating digital image databases

Energy Technology Data Exchange (ETDEWEB)

Kelly, P.; Cannon, M.

1994-10-01

This paper presents results from the authors' experience with CANDID (Comparison Algorithm for Navigating Digital Image Databases), which was designed to facilitate image retrieval by content using a query-by-example methodology. A global signature describing the texture, shape, or color content is first computed for every image stored in a database, and a normalized similarity measure between probability density functions of feature vectors is used to match signatures. This method can be used to retrieve images from a database that are similar to a user-provided example image. Results for three test applications are included.
BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

Directory of Open Access Journals (Sweden)

Jiang Hualiang

2010-01-01

Full Text Available Abstract Background Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database searching, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of hit list from database searching.
Developments in diffraction databases

International Nuclear Information System (INIS)

Jenkins, R.

1999-01-01

Full text: There are a number of databases available to the diffraction community. Two of the more important of these are the Powder Diffraction File (PDF) maintained by the International Centre for Diffraction Data (ICDD), and the Inorganic Crystal Structure Database (ICSD) maintained by Fachinformationszentrum (FIZ, Karlsruhe). In application, the PDF has been used as an indispensable tool in phase identification and identification of unknowns. The ICSD database has extensive and explicit reference to the structures of compounds: atomic coordinates, space group and even thermal vibration parameters. A similar database, but for organic compounds, is maintained by the Cambridge Crystallographic Data Centre. These databases are often used as independent sources of information. However, little thought has been given to how to exploit the combined properties of structural database tools. A recently completed agreement between ICDD and FIZ, plus ICDD and Cambridge, provides a first step in complementary use of the PDF and the ICSD databases. The focus of this paper (as indicated below) is to examine ways of exploiting the combined properties of both databases. In 1996, there were approximately 76,000 entries in the PDF and approximately 43,000 entries in the ICSD database. The ICSD database has now been used to calculate entries in the PDF. Thus, to derive d-spacing and peak intensity data requires the synthesis of full diffraction patterns, i.e., we use the structural data in the ICSD database and then add instrumental resolution information. The combined data from PDF and ICSD can be effectively used in many ways. For example, we can calculate PDF data for an ideally random crystal distribution and also in the absence of preferred orientation. Again, we can use systematic studies of intermediate members in solid solutions series to help produce reliable quantitative phase analyses. In some cases, we can study how solid solution properties vary with composition and
Efficient blind search for similar-waveform earthquakes in years of continuous seismic data

Science.gov (United States)

Yoon, C. E.; Bergen, K.; Rong, K.; Elezabi, H.; Bailis, P.; Levis, P.; Beroza, G. C.

2017-12-01

Cross-correlating an earthquake waveform template with continuous seismic data has proven to be a sensitive, discriminating detector of small events missing from earthquake catalogs, but a key limitation of this approach is that it requires advance knowledge of the earthquake signals we wish to detect. To overcome this limitation, we can perform a blind search for events with similar waveforms, comparing waveforms from all possible times within the continuous data (Brown et al., 2008). However, the runtime for naive blind search scales quadratically with the duration of continuous data, making it impractical to process years of continuous data. The Fingerprint And Similarity Thresholding (FAST) detection method (Yoon et al., 2015) enables a comprehensive blind search for similar-waveform earthquakes in a fast, scalable manner by adapting data-mining techniques originally developed for audio and image search within massive databases. FAST converts seismic waveforms into compact "fingerprints", which are efficiently organized and searched within a database. In this way, FAST avoids the unnecessary comparison of dissimilar waveforms. To date, the longest duration of continuous data used for event detection with FAST was 3 months at a single station near Guy-Greenbrier, Arkansas, which revealed microearthquakes closely correlated with stages of hydraulic fracturing (Yoon et al., 2017). In this presentation we introduce an optimized, parallel version of the FAST software with improvements to the fingerprinting algorithm and the ability to detect events using continuous data from a network of stations (Bergen et al., 2016). We demonstrate its ability to detect low-magnitude earthquakes within several years of continuous data at locations of interest in California.
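The idea of replacing an all-against-all waveform comparison with a fingerprint lookup can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy `fingerprint` (thresholding samples against the window median) and the bit-sampling hash tables in `candidate_pairs` are simple stand-ins, not the FAST feature extraction or its database layout. The point is only that windows are compared when they land in the same hash bucket, so most dissimilar pairs are never examined.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def fingerprint(window, n_bits=32):
    """Toy binary fingerprint (hypothetical): 1 where a sample exceeds the median."""
    step = len(window) / n_bits
    samples = [window[int(i * step)] for i in range(n_bits)]
    median = sorted(samples)[n_bits // 2]
    return tuple(1 if s > median else 0 for s in samples)


def candidate_pairs(fingerprints, n_tables=8, bits_per_key=8, seed=0):
    """Return index pairs whose fingerprints share a bucket in at least one table."""
    rng = random.Random(seed)
    n_bits = len(fingerprints[0])
    pairs = set()
    for _ in range(n_tables):
        # Each table hashes on a different random subset of fingerprint bits.
        positions = rng.sample(range(n_bits), bits_per_key)
        buckets = defaultdict(list)
        for idx, fp in enumerate(fingerprints):
            buckets[tuple(fp[p] for p in positions)].append(idx)
        # Only windows that collide in a bucket become candidates.
        for members in buckets.values():
            pairs.update(combinations(members, 2))
    return pairs


if __name__ == "__main__":
    rng = random.Random(1)
    # A synthetic decaying oscillation stands in for a repeating event.
    event = [math.sin(0.3 * i) * math.exp(-0.02 * i) for i in range(128)]
    windows = [event] + [[rng.gauss(0, 1) for _ in range(128)] for _ in range(2)]
    windows.append(list(event))  # the same waveform recurring later
    found = candidate_pairs([fingerprint(w) for w in windows])
    print(sorted(found))
```

Identical fingerprints collide in every table, so a repeated waveform is always proposed as a candidate, while the work per table grows with the number of windows rather than with the number of window pairs.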
SHOP: scaffold hopping by GRID-based similarity searches

DEFF Research Database (Denmark)

Bergmann, Rikke; Linusson, Anna; Zamora, Ismael

2007-01-01

A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known...... scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures...
Databases for BaBar Datastream Calibrations and Prompt Reconstruction Processes

International Nuclear Information System (INIS)

Bartelt, John E

1998-01-01

We describe the design of databases used for performing datastream calibrations in the BABAR experiment, involving data accumulated on multiple processors and possibly over several blocks of events ("ConsBlocks"). The database for tracking the history and status of the ConsBlocks is also described, along with similar databases needed by "Prompt Reconstruction".
COMPACT STARBURSTS IN z ~ 3-6 SUBMILLIMETER GALAXIES REVEALED BY ALMA

NARCIS (Netherlands)

Ikarashi, Soh; Ivison, R. J.; Caputi, Karina I.; Aretxaga, Itziar; Dunlop, James S.; Hatsukade, Bunyo; Hughes, David H.; Iono, Daisuke; Izumi, Takuma; Kawabe, Ryohei; Kohno, Kotaro; Lagos, Claudia D. P.; Motohara, Kentaro; Nakanishi, Kouichiro; Ohta, Kouji; Tamura, Yoichi; Umehata, Hideki; Wilson, Grant W.; Yabe, Kiyoto; Yun, Min S.

2015-01-01

We report the source size distribution, as measured by ALMA millimetric continuum imaging, of a sample of 13 AzTEC-selected submillimeter galaxies (SMGs) at z_phot ~ 3-6. Their infrared luminosities and star formation rates (SFRs) are L_IR ~ 2-6 x 10^12 L_sun and similar
Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Trypanosomes Database Database Description General information of database Database name Trypanosomes Database...stitute of Genetics Research Organization of Information and Systems Yata 1111, Mishima, Shizuoka 411-8540, JAPAN E mail: Database...y Name: Trypanosoma Taxonomy ID: 5690 Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description The... Article title: Author name(s): Journal: External Links: Original website information Database maintenance s...DB (Protein Data Bank) KEGG PATHWAY Database DrugPort Entry list Available Query search Available Web servic
A trending database for human performance events

International Nuclear Information System (INIS)

Harrison, D.

1993-01-01

An effective Operations Experience program includes a standardized methodology for the investigation of unplanned events and a tool capable of retaining investigation data for the purpose of trending analysis. A database used in conjunction with a formalized investigation procedure for the purpose of trending unplanned event data is described. The database follows the structure of INPO's Human Performance Enhancement System (HPES) for investigations. The database screens duplicate, on-line, the HPES evaluation forms. All information pertaining to investigations is collected, retained and entered into the database using these forms. The database will be used for trending analysis to determine if any significant patterns exist, for tracking progress over time both within AECL and against industry standards, and for evaluating the success of corrective actions. Trending information will be used to help prevent similar occurrences.
Similarity analysis of spectra obtained via reflectance spectrometry in legal medicine.

Science.gov (United States)

Belenki, Liudmila; Sterzik, Vera; Bohnert, Michael

2014-02-01

In the present study, a series of reflectance spectra of postmortem lividity, pallor, and putrefaction-affected skin was collected for 195 investigated cases in the course of cooling down the corpse. The reflectance spectrometric measurements were stored together with their respective metadata in a MySQL database, which was managed via a scientific information repository. We propose similarity measures and a criterion of similarity that capture similar spectra recorded on corpse skin. We systematically clustered reflectance spectra from the database as well as their metadata, such as case number, age, sex, skin temperature, duration of cooling, and postmortem time, with respect to the given criterion of similarity. Altogether, more than 500 reflectance spectra were compared pairwise. The measures used to compare a pair of reflectance curve samples include the Euclidean distance between curves and the Euclidean distance between derivatives of the functions represented by the reflectance curves at the same wavelengths in the spectral range of visible light between 380 and 750 nm. For each case, using the recorded reflectance curves and the similarity criterion, the postmortem time interval during which a characteristic change in the shape of the reflectance spectrum takes place is estimated. The estimation is carried out via a software package composed of Java, Python, and MATLAB scripts that query the MySQL database. We show that in legal medicine, matching and clustering of reflectance curves obtained by means of reflectance spectrometry with respect to a given criterion of similarity can be used to estimate the postmortem interval.
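The two distance measures named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the 10 nm sampling grid, the forward-difference derivative, and the function names are choices made here, and the study's actual similarity criterion (for example, any threshold on these distances) is not given in the abstract and is not reproduced.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally sampled sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def derivative(wavelengths, values):
    """Forward finite-difference derivative of a sampled reflectance curve."""
    return [
        (values[i + 1] - values[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(values) - 1)
    ]


def spectral_distances(wavelengths, r1, r2):
    """Return (distance between curves, distance between their derivatives)."""
    d_curve = euclidean(r1, r2)
    d_slope = euclidean(derivative(wavelengths, r1), derivative(wavelengths, r2))
    return d_curve, d_slope


if __name__ == "__main__":
    # Hypothetical sampling of the visible range, 380 to 750 nm in 10 nm steps.
    wl = list(range(380, 751, 10))
    curve = [0.3 + 0.1 * math.sin((w - 380) / 60.0) for w in wl]
    shifted = [v + 0.05 for v in curve]  # same shape, constant offset
    print(spectral_distances(wl, curve, shifted))
```

Comparing a curve with a vertically shifted copy of itself gives a nonzero curve distance but an essentially zero derivative distance, which suggests why a derivative-based measure is useful when the shape of the spectrum matters more than its absolute level.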
Overview of intelligent data retrieval methods for waveforms and images in massive fusion databases

Energy Technology Data Exchange (ETDEWEB)

Vega, J. [JET-EFDA, Culham Science Center, OX14 3DB Abingdon (United Kingdom); Asociacion EURATOM/CIEMAT para Fusion, Avda. Complutense 22, 28040 Madrid (Spain)], E-mail: jesus.vega@ciemat.es; Murari, A. [JET-EFDA, Culham Science Center, OX14 3DB Abingdon (United Kingdom); Consorzio RFX-Associazione EURATOM ENEA per la Fusione, I-35127 Padua (Italy); Pereira, A.; Portas, A.; Ratta, G.A.; Castro, R. [JET-EFDA, Culham Science Center, OX14 3DB Abingdon (United Kingdom); Asociacion EURATOM/CIEMAT para Fusion, Avda. Complutense 22, 28040 Madrid (Spain)

2009-06-15

The JET database contains more than 42 Tbytes of data (waveforms and images) and doubles in size about every 2 years. The ITER database is expected to be orders of magnitude larger. Therefore, data access in such huge databases can no longer be efficiently based on shot number or temporal interval. Taking into account that diagnostics generate reproducible signal patterns (structural shapes) for similar physical behaviour, high-level data access systems can be developed. In these systems, the input parameter is a pattern and the outputs are the shot numbers and the temporal locations where similar patterns appear inside the database. These pattern-oriented techniques can be used for first data screening of any type of morphological aspect of waveforms and images. The article shows a new technique to look for similar images in huge databases in a fast and efficient way. Also, previous techniques to search for similar waveforms and to retrieve time-series data or images containing any kind of patterns are reviewed.

Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us SKIP Stemcell Database Database Description General information of database Database name SKIP Stemcell Database...rsity Journal Search: Contact address http://www.skip.med.keio.ac.jp/en/contact/ Database classification Human Genes and Diseases Dat...abase classification Stemcell Article Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database...ks: Original website information Database maintenance site Center for Medical Genetics, School of medicine, ...lable Web services Not available URL of Web services - Need for user registration Not available About This Database Database
Database Description - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Arabidopsis Phenome Database Database Description General information of database Database n... BioResource Center Hiroshi Masuya Database classification Plant databases - Arabidopsis thaliana Organism T...axonomy Name: Arabidopsis thaliana Taxonomy ID: 3702 Database description The Arabidopsis thaliana phenome i...heir effective application. We developed the new Arabidopsis Phenome Database integrating two novel database...seful materials for their experimental research. The other, the “Database of Curated Plant Phenome” focusing
Mobile object retrieval in server-based image databases

Science.gov (United States)

Manger, D.; Pagel, F.; Widak, H.

2013-05-01

The increasing number of mobile phones equipped with powerful cameras leads to huge collections of user-generated images. To utilize the information of the images on site, image retrieval systems are becoming more and more popular for searching for similar objects in one's own image database. As the computational performance and the memory capacity of mobile devices are constantly increasing, this search can often be performed on the device itself. This is feasible, for example, if the images are represented with global image features or if the search is done using EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing image database or if powerful content-based image retrieval methods with local features are required, a server-based image retrieval backend is needed. In this work, we present a content-based image retrieval system with a client-server architecture working with local features. On the server side, the scalability to large image databases is addressed with the popular bag-of-words model with state-of-the-art extensions. The client end of the system focuses on a lightweight user interface presenting the most similar images of the database and highlighting the visual information that is common with the query image. Additionally, new images can be added to the database, making it a powerful and interactive tool for mobile content-based image retrieval.
GIS: a comprehensive source for protein structure similarities.

Science.gov (United States)

Guerler, Aysam; Knapp, Ernst-Walter

2010-07-01

A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains, yielding about 55 x 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow browsing of the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure, using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.
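The size of the all-against-all comparison can be checked with a short calculation. The sketch below assumes that each alignment corresponds to one unordered pair of distinct domains, which the abstract does not state explicitly; the function names are illustrative only.

```python
import math


def n_pairs(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def items_for_pairs(pairs):
    """Invert n * (n - 1) / 2 = pairs for n (positive root of the quadratic)."""
    return (1 + math.sqrt(1 + 8 * pairs)) / 2


if __name__ == "__main__":
    print(n_pairs(10_000))           # pairs among exactly 10,000 domains
    print(items_for_pairs(55e6))     # domain count implied by 55 million pairs
```

Under that assumption, roughly 55 million alignments correspond to a little over 10,000 domains, which is consistent with the figure quoted for ASTRAL40.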
Self-Similar Traffic In Wireless Networks

OpenAIRE

Jerjomins, R.; Petersons, E.

2005-01-01

Many studies have shown that traffic in Ethernet and other wired networks is self-similar. This paper reveals that wireless network traffic is also self-similar and long-range dependent by analyzing a large amount of data captured from a wireless router.
The Experience Elicited by Hallucinogens Presents the Highest Similarity to Dreaming within a Large Database of Psychoactive Substance Reports

Science.gov (United States)

Sanz, Camila; Zamberlan, Federico; Erowid, Earth; Erowid, Fire; Tagliazucchi, Enzo

2018-01-01

Ever since the modern rediscovery of psychedelic substances by Western society, several authors have independently proposed that their effects bear a high resemblance to the dreams and dreamlike experiences occurring naturally during the sleep-wake cycle. Recent studies in humans have provided neurophysiological evidence supporting this hypothesis. However, a rigorous comparative analysis of the phenomenology (“what it feels like” to experience these states) is currently lacking. We investigated the semantic similarity between a large number of subjective reports of psychoactive substances and reports of high/low lucidity dreams, and found that the highest-ranking substance in terms of the similarity to high lucidity dreams was the serotonergic psychedelic lysergic acid diethylamide (LSD), whereas the highest-ranking in terms of the similarity to dreams of low lucidity were plants of the Datura genus, rich in deliriant tropane alkaloids. Conversely, sedatives, stimulants, antipsychotics, and antidepressants comprised most of the lowest-ranking substances. An analysis of the most frequent words in the subjective reports of dreams and hallucinogens revealed that terms associated with perception (“see,” “visual,” “face,” “reality,” “color”), emotion (“fear”), setting (“outside,” “inside,” “street,” “front,” “behind”) and relatives (“mom,” “dad,” “brother,” “parent,” “family”) were the most prevalent across both experiences. In summary, we applied novel quantitative analyses to a large volume of empirical data to confirm the hypothesis that, among all psychoactive substances, hallucinogen drugs elicit experiences with the highest semantic similarity to those of dreams. Our results and the associated methodological developments open the way to study the comparative phenomenology of different altered states of consciousness and its relationship with non-invasive measurements of brain physiology. PMID
The Experience Elicited by Hallucinogens Presents the Highest Similarity to Dreaming within a Large Database of Psychoactive Substance Reports

Directory of Open Access Journals (Sweden)

Camila Sanz

2018-01-01

Full Text Available Ever since the modern rediscovery of psychedelic substances by Western society, several authors have independently proposed that their effects bear a high resemblance to the dreams and dreamlike experiences occurring naturally during the sleep-wake cycle. Recent studies in humans have provided neurophysiological evidence supporting this hypothesis. However, a rigorous comparative analysis of the phenomenology (“what it feels like” to experience these states) is currently lacking. We investigated the semantic similarity between a large number of subjective reports of psychoactive substances and reports of high/low lucidity dreams, and found that the highest-ranking substance in terms of the similarity to high lucidity dreams was the serotonergic psychedelic lysergic acid diethylamide (LSD), whereas the highest-ranking in terms of the similarity to dreams of low lucidity were plants of the Datura genus, rich in deliriant tropane alkaloids. Conversely, sedatives, stimulants, antipsychotics, and antidepressants comprised most of the lowest-ranking substances. An analysis of the most frequent words in the subjective reports of dreams and hallucinogens revealed that terms associated with perception (“see,” “visual,” “face,” “reality,” “color”), emotion (“fear”), setting (“outside,” “inside,” “street,” “front,” “behind”) and relatives (“mom,” “dad,” “brother,” “parent,” “family”) were the most prevalent across both experiences. In summary, we applied novel quantitative analyses to a large volume of empirical data to confirm the hypothesis that, among all psychoactive substances, hallucinogen drugs elicit experiences with the highest semantic similarity to those of dreams. Our results and the associated methodological developments open the way to study the comparative phenomenology of different altered states of consciousness and its relationship with non-invasive measurements of brain
Planned and ongoing projects (pop) database: development and results.

Science.gov (United States)

Wild, Claudia; Erdös, Judit; Warmuth, Marisa; Hinterreiter, Gerda; Krämer, Peter; Chalon, Patrice

2014-11-01

The aim of this study was to present the development, structure and results of a database on planned and ongoing health technology assessment (HTA) projects (POP Database) in Europe. The POP Database (POP DB) was set up in an iterative process from a basic Excel sheet to a multifunctional electronic online database. The functionalities, such as the search terminology, the procedures to fill and update the database, the access rules to enter the database, as well as the maintenance roles, were defined in a multistep participatory feedback loop with EUnetHTA Partners. The POP Database has become an online database that hosts not only the titles and MeSH categorizations, but also some basic information on status and contact details about the listed projects of EUnetHTA Partners. Currently, it stores more than 1,200 planned, ongoing or recently published projects of forty-three EUnetHTA Partners from twenty-four countries. Because the POP Database aims to facilitate collaboration, it also provides a matching system to assist in identifying similar projects. Overall, more than 10 percent of the projects in the database are identical both in terms of pathology (indication or disease) and technology (drug, medical device, intervention). In addition, approximately 30 percent of the projects are similar, meaning that they have at least some overlap in content. Although the POP DB is successful concerning regular updates of most national HTA agencies within EUnetHTA, little is known about its actual effects on collaborations in Europe. Moreover, many non-nationally nominated HTA producing agencies neither have access to the POP DB nor can share their projects.
Notions of similarity for computational biology models

KAUST Repository

Waltemath, Dagmar

2016-03-21

Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.
Notions of similarity for computational biology models

KAUST Repository

Waltemath, Dagmar; Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knuepfer, Christian; Liebermeister, Wolfram

2016-01-01

Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.
Notions of similarity for systems biology models.

Science.gov (United States)

Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knüpfer, Christian; Liebermeister, Wolfram; Waltemath, Dagmar

2018-01-01

Systems biology models are rapidly increasing in complexity, size and numbers. When building large models, researchers rely on software tools for the retrieval, comparison, combination and merging of models, as well as for version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of 'similarity' may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here we survey existing methods for the comparison of models, introduce quantitative measures for model similarity, and discuss potential applications of combined similarity measures. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on a combination of different model aspects. The six aspects that we define as potentially relevant for similarity are underlying encoding, references to biological entities, quantitative behaviour, qualitative behaviour, mathematical equations and parameters and network structure. We argue that future similarity measures will benefit from combining these model aspects in flexible, problem-specific ways to mimic users' intuition about model similarity, and to support complex model searches in databases. © The Author 2016. Published by Oxford University Press.
Similarly shaped letters evoke similar colors in grapheme-color synesthesia.

Science.gov (United States)

Brang, David; Rouw, Romke; Ramachandran, V S; Coulson, Seana

2011-04-01

Grapheme-color synesthesia is a neurological condition in which viewing numbers or letters (graphemes) results in the concurrent sensation of color. While the anatomical substrates underlying this experience are well understood, little research to date has investigated factors influencing the particular colors associated with particular graphemes or how synesthesia occurs developmentally. A recent suggestion of such an interaction has been proposed in the cascaded cross-tuning (CCT) model of synesthesia, which posits that in synesthetes connections between grapheme regions and color area V4 participate in a competitive activation process, with synesthetic colors arising during the component-stage of grapheme processing. This model more directly suggests that graphemes sharing similar component features (lines, curves, etc.) should accordingly activate more similar synesthetic colors. To test this proposal, we created and regressed synesthetic color-similarity matrices for each of 52 synesthetes against a letter-confusability matrix, an unbiased measure of visual similarity among graphemes. Results of synesthetes' grapheme-color correspondences indeed revealed that more similarly shaped graphemes corresponded with more similar synesthetic colors, with stronger effects observed in individuals with more intense synesthetic experiences (projector synesthetes). These results support the CCT model of synesthesia, implicate early perceptual mechanisms as driving factors in the elicitation of synesthetic hues, and further highlight the relationship between conceptual and perceptual factors in this phenomenon. Copyright © 2011 Elsevier Ltd. All rights reserved.
Chemical databases evaluated by order theoretical tools.

Science.gov (United States)

Voigt, Kristina; Brüggemann, Rainer; Pudenz, Stefan

2004-10-01

Data on environmental chemicals are urgently needed to comply with the future chemicals policy in the European Union. The availability of data on parameters and chemicals can be evaluated by chemometrical and environmetrical methods. Different mathematical and statistical methods are taken into account in this paper. The emphasis is set on a new, discrete mathematical method called METEOR (method of evaluation by order theory). Application of the Hasse diagram technique (HDT) of the complete data-matrix comprising 12 objects (databases) x 27 attributes (parameters + chemicals) reveals that ECOTOX (ECO), environmental fate database (EFD) and extoxnet (EXT)--also called multi-database databases--are best. Most single databases which are specialised are found in a minimal position in the Hasse diagram; these are biocatalysis/biodegradation database (BID), pesticide database (PES) and UmweltInfo (UMW). The aggregation of environmental parameters and chemicals (equal weight) leads to a slimmer data-matrix on the attribute side. However, no significant differences are found in the "best" and "worst" objects. The whole approach indicates a rather bad situation in terms of the availability of data on existing chemicals and hence an alarming signal concerning the new and existing chemicals policies of the EEC.
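The ordering behind a Hasse diagram can be illustrated with a small sketch. The four objects and their attribute scores below are invented for illustration and are not the 12 x 27 data-matrix from the paper; the code only shows the product-order comparison (one object is above another if it is at least as good in every attribute and better in at least one) and how maximal and minimal objects fall out of it.

```python
def dominates(a, b):
    """True if a is >= b in every attribute and > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def extremes(objects):
    """Return (maximal, minimal) object names under the product order."""
    maximal = [
        name for name, vec in objects.items()
        if not any(dominates(other, vec) for o, other in objects.items() if o != name)
    ]
    minimal = [
        name for name, vec in objects.items()
        if not any(dominates(vec, other) for o, other in objects.items() if o != name)
    ]
    return maximal, minimal


if __name__ == "__main__":
    # Hypothetical availability scores for three attributes per database.
    scores = {
        "A": (2, 2, 1),
        "B": (1, 2, 0),
        "C": (0, 1, 0),
        "D": (2, 0, 2),
    }
    print(extremes(scores))
```

In this toy data A, B and C form a chain, while D is incomparable with all of them, so D is both maximal and minimal. Incomparable objects of this kind are exactly what a Hasse diagram makes visible and what a single aggregated score would hide.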
Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes.

Science.gov (United States)

Hassani-Pak, Keywan; Rawlings, Christopher

2017-06-13

Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.
Database Dictionary for Ethiopian National Ground-Water DAtabase (ENGDA) Data Fields

Science.gov (United States)

Kuniansky, Eve L.; Litke, David W.; Tucci, Patrick

2007-01-01

ENGDA database field name and relational database table is designated; along with the ENGDA screen entry form(s) and the ENGDA field form (attachment 2). The database dictionary is separated into sections. The first section, Basic Site Data Fields, describes the basic site information that is similar for all of the different types of sites. The remaining sections may be applicable for only one type of site; for example, the Well Drilling and Construction Data Fields and Lithologic Description Data Fields are applicable to boreholes and not to springs. Attachment 1 contains a table for conversion from English to metric units. Attachment 2 contains selected field forms used in conjunction with ENGDA. A separate document, 'Users Reference Manual for the Ethiopian National Ground-Water DAtabase (ENGDA),' by David W. Litke was developed as a users guide for the computer database and screen entry. This database dictionary serves as a reference for both the field forms and the computer database. Every effort has been made to have identical field names between the field forms and the screen entry forms in order to avoid confusion.
Functional integration of automated system databases by means of artificial intelligence

Science.gov (United States)

Dubovoi, Volodymyr M.; Nikitenko, Olena D.; Kalimoldayev, Maksat; Kotyra, Andrzej; Gromaszek, Konrad; Iskakova, Aigul

2017-08-01

The paper presents approaches to the functional integration of automated system databases by means of artificial intelligence. The peculiarities of using a database in systems with a fuzzy implementation of functions were analyzed. Requirements for the normalization of such databases were defined. The question of data equivalence under conditions of uncertainty, and of collisions arising during the functional integration of databases, is considered, and a model to reveal their possible occurrence is devised. The paper also presents a method for evaluating the standardization of integrated database normalization.
Interactive Exploration for Continuously Expanding Neuron Databases.

Science.gov (United States)

Li, Zhongyu; Metaxas, Dimitris N; Lu, Aidong; Zhang, Shaoting

2017-02-15

This paper proposes a novel framework to help biologists explore and analyze neurons based on retrieval of data from neuron morphological databases. In recent years, the continuously expanding neuron databases provide a rich source of information to associate neuronal morphologies with their functional properties. We design a coarse-to-fine framework for efficient and effective data retrieval from large-scale neuron databases. At the coarse level, for efficiency at large scale, we employ a binary coding method to compress morphological features into binary codes of tens of bits. Short binary codes allow for real-time similarity searching in Hamming space. Because the neuron databases are continuously expanding, it is inefficient to re-train the binary coding model from scratch when adding new neurons. To solve this problem, we extend binary coding with online updating schemes, which consider only the newly added neurons and update the model on-the-fly, without accessing the whole neuron databases. At the fine-grained level, we introduce domain experts/users into the framework, who can give relevance feedback for the binary coding based retrieval results. This interactive strategy can improve the retrieval performance through re-ranking the above coarse results, where we design a new similarity measure and take the feedback into account. Our framework is validated on more than 17,000 neuron cells, showing promising retrieval accuracy and efficiency. Moreover, we demonstrate its use case in assisting biologists to identify and explore unknown neurons. Copyright © 2017 Elsevier Inc. All rights reserved.
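The coarse-level search described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the thresholding used in `to_code` is a stand-in assumption for their learned binary coding model, and all function names are hypothetical. It shows only why short binary codes make similarity search cheap, since the Hamming distance reduces to an XOR followed by a bit count.

```python
from typing import List, Sequence, Tuple


def to_code(features: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack one bit per feature (assumed scheme): 1 if the feature exceeds its threshold."""
    code = 0
    for value, threshold in zip(features, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def search(query: int, database: List[int], k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the k codes closest to the query."""
    scored = [(index, hamming(query, code)) for index, code in enumerate(database)]
    # Sort by distance, breaking ties by database index for a stable result.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]
```

A linear scan like this is adequate for illustration; the online updating and relevance-feedback re-ranking steps mentioned in the abstract are not modelled here.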
Database Description - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Database Description General information of database Database... name Yeast Interacting Proteins Database Alternative name - DOI 10.18908/lsdba.nbdc00742-000 Creator C...-ken 277-8561 Tel: +81-4-7136-3989 FAX: +81-4-7136-3979 E-mail : Database classif...s cerevisiae Taxonomy ID: 4932 Database description Information on interactions and related information obta...l Acad Sci U S A. 2001 Apr 10;98(8):4569-74. Epub 2001 Mar 13. External Links: Original website information Database
Update History of This Database - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Trypanosomes Database Update History of This Database Date Update contents 2014/05/07 The co...ntact information is corrected. The features and manner of utilization of the database are corrected. 2014/02/04 Trypanosomes Databas...e English archive site is opened. 2011/04/04 Trypanosomes Database ( http://www.tan...paku.org/tdb/ ) is opened. About This Database Database Description Download Lice...nse Update History of This Database Site Policy | Contact Us Update History of This Database - Trypanosomes Database | LSDB Archive ...
Meta-analysis of pulsed-field gel electrophoresis fingerprints based on a constructed Salmonella database.

Directory of Open Access Journals (Sweden)

Wen Zou

Full Text Available A database was constructed consisting of 45,923 Salmonella pulsed-field gel electrophoresis (PFGE) patterns. The patterns, randomly selected from all submissions to CDC PulseNet during 2005 to 2010, included the 20 most frequent serotypes and 12 less frequent serotypes. Meta-analysis was applied to all of the PFGE patterns in the database. In the range of 20 to 1100 kb, serotype Enteritidis averaged the fewest bands at 12 bands and Paratyphi A the most with 19, with most serotypes in the 13-15 range among the 32 serotypes. The 10 most frequent bands for each of the 32 serotypes were sorted and distinguished, and the results were in concordance with those from distance matrix and two-way hierarchical cluster analyses of the patterns in the database. The hierarchical cluster analysis divided the 32 serotypes into three major groups according to dissimilarity measures, and revealed for the first time the similarities among the PFGE patterns of serotype Saintpaul to serotypes Typhimurium, Typhimurium var. 5-, and I 4,[5],12:i:-; of serotype Hadar to serotype Infantis; and of serotype Muenchen to serotype Newport. The results of the meta-analysis indicated that the pattern similarities/dissimilarities determined the serotype discrimination of the PFGE method, and that the possible PFGE markers may have utility for serotype identification. The presence of distinct, serotype specific patterns may provide useful information to aid in the distribution of serotypes in the population and potentially reduce the need for laborious analyses, such as traditional serotyping.
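The abstract does not say which dissimilarity measure fed the distance matrix and cluster analyses. As a hedged illustration only, the sketch below uses the Dice coefficient, a measure often applied to band patterns, and assumes each pattern has already been reduced to a set of band sizes in kb. The function names and the exact-match treatment of bands are assumptions, not details from the paper.

```python
from typing import Iterable


def dice_similarity(bands_a: Iterable[int], bands_b: Iterable[int]) -> float:
    """Dice coefficient between two band patterns given as band sizes (kb)."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty patterns are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def dice_dissimilarity(bands_a: Iterable[int], bands_b: Iterable[int]) -> float:
    """Dissimilarity suitable as one entry of a pairwise distance matrix."""
    return 1.0 - dice_similarity(bands_a, bands_b)
```

Real gel data would need a tolerance window when matching band positions; exact set membership is used here only to keep the example short.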

Revelation of the Sun Self-Similarity Skeletal Structures

International Nuclear Information System (INIS)

Rantsev-Kartinov, V.A.

2005-01-01

The analysis of databases of photographic images of the surface of the Sun, its atmosphere and its nearest space environment, taken at various spatial resolutions and for various types of radiation from the surface of the Sun, by means of the method of multilevel dynamic contrasting, has revealed the presence of skeletal structures both on the Sun itself and in its environment. The revealed global structures of the Sun and the powerful mass ejections of its corona are demonstrated, as well as the structures of its atmosphere, protuberances, sunspots and the globular structures of its photosphere.
The HMMER Web Server for Protein Sequence Similarity Search.

Science.gov (United States)

Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

2017-12-08

Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.
An efficient similarity measure for content based image retrieval using memetic algorithm

Directory of Open Access Journals (Sweden)

Mutasem K. Alsmadi

2017-06-01

Full Text Available Content based image retrieval (CBIR) systems work by retrieving images which are related to the query image (QI) from huge databases. The available CBIR systems extract limited feature sets which confine the retrieval efficacy. In this work, extensive robust and important features were extracted from the images database and then stored in the feature repository. This feature set is composed of color signature with the shape and color texture features. Features are extracted from the given QI in a similar fashion. Consequently, a novel similarity evaluation using a meta-heuristic algorithm called a memetic algorithm (genetic algorithm with great deluge) is achieved between the features of the QI and the features of the database images. Our proposed CBIR system is assessed by querying a number of images (from the test dataset), and the efficiency of the system is evaluated by calculating precision-recall values for the results. The results were superior to other state-of-the-art CBIR systems in regard to precision.
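The precision-recall evaluation mentioned in this abstract follows the standard definitions, which a short sketch can make concrete. The function name and the use of string image identifiers are assumptions for illustration; the paper's own evaluation code is not shown in the record.

```python
from typing import Iterable, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall for one query.

    precision = relevant retrieved / total retrieved
    recall    = relevant retrieved / total relevant
    """
    retrieved_list = list(retrieved)
    hits = sum(1 for item in retrieved_list if item in relevant)
    precision = hits / len(retrieved_list) if retrieved_list else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a query returns four images of which two are relevant, and three relevant images exist in the database, precision is 0.5 and recall is 2/3.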
Summary of earthquake experience database

International Nuclear Information System (INIS)

1999-01-01

Strong-motion earthquakes frequently occur throughout the Pacific Basin, where power plants or industrial facilities are included in the affected areas. By studying the performance of these earthquake-affected (or database) facilities, a large inventory of various types of equipment installations can be compiled that have experienced substantial seismic motion. The primary purposes of the seismic experience database are summarized as follows: to determine the most common sources of seismic damage, or adverse effects, on equipment installations typical of industrial facilities; to determine the thresholds of seismic motion corresponding to various types of seismic damage; to determine the general performance of equipment during earthquakes, regardless of the levels of seismic motion; to determine minimum standards in equipment construction and installation, based on past experience, to assure the ability to withstand anticipated seismic loads. To summarize, the primary assumption in compiling an experience database is that the actual seismic hazard to industrial installations is best demonstrated by the performance of similar installations in past earthquakes
Enabling Semantic Queries Against the Spatial Database

Directory of Open Access Journals (Sweden)

PENG, X.

2012-02-01

Full Text Available The spatial database based upon the object-relational database management system (ORDBMS) has the merits of a clear data model, good operability and high query efficiency. That is why it has been widely used in spatial data organization and management. However, it cannot express the semantic relationships among geospatial objects, so the query results often fail to meet the user's requirements. Therefore, this paper represents an attempt to combine Semantic Web technology with the spatial database so as to make up for the traditional database's disadvantages. In this way, on the one hand, users can take advantage of the ORDBMS to store and manage spatial data; on the other hand, if the spatial database is released in the form of Semantic Web, the users could describe a query more concisely with a cognitive pattern similar to that of daily life. As a consequence, this methodology makes the benefits of both the Semantic Web and the object-relational database (ORDB) available. The paper systematically discusses the architecture, key technologies and implementation of the semantically enriched spatial database. Subsequently, we demonstrate the function of spatial semantic queries via a practical prototype system. The query results indicate that the method used in this study is feasible.
The Danish Microbiology Database (MiBa) 2010 to 2013.

Science.gov (United States)

Voldstedlund, M; Haarh, M; Mølbak, K

2014-01-09

The Danish Microbiology Database (MiBa) is a national database that receives copies of reports from all Danish departments of clinical microbiology. The database was launched in order to provide healthcare personnel with nationwide access to microbiology reports and to enable real-time surveillance of communicable diseases and microorganisms. The establishment and management of MiBa has been a collaborative process among stakeholders, and the present paper summarises lessons learned from this nationwide endeavour which may be relevant to similar projects in the rapidly changing landscape of health informatics.
Determining the semantic similarities among Gene Ontology terms.

Science.gov (United States)

Taha, Kamal

2013-05-01

We present in this paper novel techniques that determine the semantic relationships among GeneOntology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between user application and GO database. Given a set S of GO terms, GoSE would return another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO ontology terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms, whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

Directory of Open Access Journals (Sweden)

Lefkowitz Elliot J

2004-10-01

Full Text Available Abstract Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

Science.gov (United States)

Wang, Chunlin; Lefkowitz, Elliot J

2004-10-28

Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together
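The query segmentation idea in this abstract can be sketched as follows. The record does not describe how SS-Wrapper actually balances load, so the greedy strategy below (longest query first, always assigned to the least-loaded node) is a swapped-in assumption, as are the function name and the use of sequence length as a proxy for search cost.

```python
import heapq
from typing import Dict, List


def segment_queries(queries: Dict[str, str], n_nodes: int) -> List[List[str]]:
    """Split named query sequences into n_nodes batches of roughly equal total length."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    # Min-heap of (current load, node index); the least-loaded node is popped first.
    heap = [(0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    batches: List[List[str]] = [[] for _ in range(n_nodes)]
    # Longest sequences first (ties broken by name) so large jobs are spread early.
    ordered = sorted(queries.items(), key=lambda item: (-len(item[1]), item[0]))
    for name, sequence in ordered:
        load, node = heapq.heappop(heap)
        batches[node].append(name)
        heapq.heappush(heap, (load + len(sequence), node))
    return batches
```

Each batch would then be written to its own query file and passed unchanged to the search program on one node, which is consistent with the abstract's point that the wrapped program itself is not altered.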
TrSDB: a proteome database of transcription factors

Science.gov (United States)

Hermoso, Antoni; Aguilar, Daniel; Aviles, Francesc X.; Querol, Enrique

2004-01-01

TrSDB—TranScout Database—(http://ibb.uab.es/trsdb) is a proteome database of eukaryotic transcription factors based upon predicted motifs by TranScout and data sources such as InterPro and Gene Ontology Annotation. Nine eukaryotic proteomes are included in the current version. Extensive and diverse information for each database entry, different analyses considering TranScout classification and similarity relationships are offered for research on transcription factors or gene expression. PMID:14681387
PubData: search engine for bioinformatics databases worldwide

OpenAIRE

Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

2016-01-01

We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...
Comparative analysis of perioperative complications between a multicenter prospective cervical deformity database and the Nationwide Inpatient Sample database.

Science.gov (United States)

Passias, Peter G; Horn, Samantha R; Jalai, Cyrus M; Poorman, Gregory; Bono, Olivia J; Ramchandran, Subaraman; Smith, Justin S; Scheer, Justin K; Sciubba, Daniel M; Hamilton, D Kojo; Mundis, Gregory; Oh, Cheongeun; Klineberg, Eric O; Lafage, Virginie; Shaffrey, Christopher I; Ames, Christopher P

2017-11-01

Complication rates for adult cervical deformity are poorly characterized given the complexity and heterogeneity of cases. To compare perioperative complication rates following adult cervical deformity corrective surgery between a prospective multicenter database for patients with cervical deformity (PCD) and the Nationwide Inpatient Sample (NIS). Retrospective review of prospective databases. A total of 11,501 adult patients with cervical deformity (11,379 patients from the NIS and 122 patients from the PCD database). Perioperative medical and surgical complications. The NIS was queried (2001-2013) for cervical deformity discharges for patients ≥18 years undergoing cervical fusions using International Classification of Disease, Ninth Revision (ICD-9) coding. Patients ≥18 years from the PCD database (2013-2015) were selected. Equivalent complications were identified and rates were compared. Bonferroni correction (pdatabases. A total of 11,379 patients from the NIS database and 122 patients from the PCD database were identified. Patients from the PCD database were older (62.49 vs. 55.15, pdatabase. The PCD database had an increased risk of reporting overall complications than the NIS (odds ratio: 2.81, confidence interval: 1.81-4.38). Only device-related complications were greater in the NIS (7.1% vs. 1.1%, p=.007). Patients from the PCD database displayed higher rates of the following complications: peripheral vascular (0.8% vs. 0.1%, p=.001), gastrointestinal (GI) (2.5% vs. 0.2%, pdatabases (p>.004). Based on surgical approach, the PCD reported higher GI and neurologic complication rates for combined anterior-posterior procedures (pdatabase revealed higher overall and individual complication rates and higher data granularity. The nationwide database may underestimate complications of patients with adult cervical deformity (ACD) particularly in regard to perioperative surgical details owing to coding and deformity generalizations. The surgeon-maintained database
Update History of This Database - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Arabidopsis Phenome Database Update History of This Database Date Update contents 2017/02/27 Arabidopsis Phenome Data...base English archive site is opened. - Arabidopsis Phenome Database (http://jphenom...e.info/?page_id=95) is opened. About This Database Database Description Download License Update History of This Database... Site Policy | Contact Us Update History of This Database - Arabidopsis Phenome Database | LSDB Archive ...
Refactoring databases evolutionary database design

CERN Document Server

Ambler, Scott W

2006-01-01

Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems. Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies. This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone databas...
Update History of This Database - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us SKIP Stemcell Database Update History of This Database Date Update contents 2017/03/13 SKIP Stemcell Database... English archive site is opened. 2013/03/29 SKIP Stemcell Database ( https://www.skip.med.k...eio.ac.jp/SKIPSearch/top?lang=en ) is opened. About This Database Database Description Download License Update History of This Databa...se Site Policy | Contact Us Update History of This Database - SKIP Stemcell Database | LSDB Archive ...
Attention-based image similarity measure with application to content-based information retrieval

Science.gov (United States)

Stentiford, Fred W. M.

2003-01-01

Whilst storage and capture technologies are able to cope with huge numbers of images, image retrieval is in danger of rendering many repositories valueless because of the difficulty of access. This paper proposes a similarity measure that imposes only very weak assumptions on the nature of the features used in the recognition process. This approach does not make use of a pre-defined set of feature measurements which are extracted from a query image and used to match those from database images, but instead generates features on a trial and error basis during the calculation of the similarity measure. This has the significant advantage that features that determine similarity can match whatever image property is important in a particular region whether it be a shape, a texture, a colour or a combination of all three. It means that effort is expended searching for the best feature for the region rather than expecting that a fixed feature set will perform optimally over the whole area of an image and over every image in a database. The similarity measure is evaluated on a problem of distinguishing similar shapes in sets of black and white symbols.
Applications of Location Similarity Measures and Conceptual Spaces to Event Coreference and Classification

Science.gov (United States)

McConky, Katie Theresa

2013-01-01

This work covers topics in event coreference and event classification from spoken conversation. Event coreference is the process of identifying descriptions of the same event across sentences, documents, or structured databases. Existing event coreference work focuses on sentence similarity models or feature based similarity models requiring slot…
Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Gclust Server Proteins in similarity relationship with the cluster Data detail Data name Pro...teins in similarity relationship with the cluster DOI 10.18908/lsdba.nbdc00464-003 Description of data conte...s Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive ...
The Danish Nonmelanoma Skin Cancer Dermatology Database.

Science.gov (United States)

Lamberg, Anna Lei; Sølvsten, Henrik; Lei, Ulrikke; Vinding, Gabrielle Randskov; Stender, Ida Marie; Jemec, Gregor Borut Ernst; Vestergaard, Tine; Thormann, Henrik; Hædersdal, Merete; Dam, Tomas Norman; Olesen, Anne Braae

2016-01-01

The Danish Nonmelanoma Skin Cancer Dermatology Database was established in 2008. The aim of this database was to collect data on nonmelanoma skin cancer (NMSC) treatment and improve its treatment in Denmark. NMSC is the most common malignancy in the western countries and represents a significant challenge in terms of public health management and health care costs. However, high-quality epidemiological and treatment data on NMSC are sparse. The NMSC database includes patients with the following skin tumors: basal cell carcinoma (BCC), squamous cell carcinoma, Bowen's disease, and keratoacanthoma diagnosed by the participating office-based dermatologists in Denmark. Clinical and histological diagnoses, BCC subtype, localization, size, skin cancer history, skin phototype, and evidence of metastases and treatment modality are the main variables in the NMSC database. Information on recurrence, cosmetic results, and complications is registered at two follow-up visits at 3 months (between 0 and 6 months) and 12 months (between 6 and 15 months) after treatment. In 2014, 11,522 patients with 17,575 tumors were registered in the database. Of tumors with a histological diagnosis, 13,571 were BCCs, 840 squamous cell carcinomas, 504 Bowen's disease, and 173 keratoacanthomas. The NMSC database encompasses detailed information on the type of tumor, a variety of prognostic factors, treatment modalities, and outcomes after treatment. The database has revealed that overall, the quality of care of NMSC in Danish dermatological clinics is high, and the database provides the necessary data for continuous quality assurance.
Heterogeneous Biomedical Database Integration Using a Hybrid Strategy: A p53 Cancer Research Database

Directory of Open Access Journals (Sweden)

Vadim Y. Bichutskiy

2006-01-01

Full Text Available Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

Domain Regeneration for Cross-Database Micro-Expression Recognition

Science.gov (United States)

Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

2018-05-01

In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG). By using TSRG, we are able to re-generate the samples from the target micro-expression database so that the re-generated target samples share the same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned from the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
Personality similarity in negotiations: Testing the dyadic effects of similarity in interpersonal traits and the use of emotional displays on negotiation outcomes.

Science.gov (United States)

Wilson, Kelly Schwind; DeRue, D Scott; Matta, Fadel K; Howe, Michael; Conlon, Donald E

2016-10-01

We build on the small but growing literature documenting personality influences on negotiation by examining how the joint disposition of both negotiators with respect to the interpersonal traits of agreeableness and extraversion influences important negotiation processes and outcomes. Building on similarity-attraction theory, we articulate and demonstrate how being similarly high or similarly low on agreeableness and extraversion leads dyad members to express more positive emotional displays during negotiation. Moreover, because of increased positive emotional displays, we show that dyads with such compositions also tend to reach agreements faster, perceive less relationship conflict, and have more positive impressions of their negotiation partner. Interestingly, these results hold regardless of whether negotiating dyads are similar in normatively positive (i.e., similarly agreeable and similarly extraverted) or normatively negative (i.e., similarly disagreeable and similarly introverted) ways. Overall, these findings demonstrate the importance of considering the dyad's personality configuration when attempting to understand the affective experience as well as the downstream outcomes of a negotiation. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Database design and database administration for a kindergarten

OpenAIRE

Vítek, Daniel

2009-01-01

The bachelor thesis deals with the creation of a database design for a standard kindergarten, the installation of the designed database in the Oracle Database 10g Express Edition database system, and a demonstration of administration tasks in this database system. The database was verified using an access application developed for this purpose.
Document image database indexing with pictorial dictionary

Science.gov (United States)

Akbari, Mohammad; Azimi, Reza

2010-02-01

In this paper we introduce a new approach for information retrieval from a Persian document image database without using Optical Character Recognition (OCR). First, an attribute called the subword upper contour label is defined; then a pictorial dictionary is constructed for the subwords based on this attribute. With this approach we address two issues in document image retrieval: keyword spotting and retrieval according to document similarities. The proposed methods have been evaluated on a Persian document image database. The results demonstrate the ability of this approach in document image information retrieval.
Interactive searching of facial image databases

Science.gov (United States)

Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

1995-09-01

A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm which can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
A Review of Stellar Abundance Databases and the Hypatia Catalog Database

Science.gov (United States)

Hinkel, Natalie Rose

2018-01-01

The astronomical community is interested in elements from lithium to thorium, from solar twins to peculiarities of stellar evolution, because they give insight into different regimes of star formation and evolution. However, while some trends between elements and other stellar or planetary properties are well known, many other trends are not as obvious and are a point of conflict. For example, stars that host giant planets are found to be consistently enriched in iron, but the same cannot be definitively said for any other element. Therefore, it is time to take advantage of large stellar abundance databases in order to better understand not only the large-scale patterns, but also the more subtle, small-scale trends within the data. In this overview to the special session, I will present a review of large stellar abundance databases that are both currently available (e.g. RAVE, APOGEE) and those that will soon be online (e.g. Gaia-ESO, GALAH). Additionally, I will discuss the Hypatia Catalog Database (www.hypatiacatalog.com), which includes abundances from individual literature sources that observed stars within 150pc. The Hypatia Catalog currently contains 72 elements as measured within ~6000 stars, with a total of ~240,000 unique abundance determinations. The online database offers a variety of solar normalizations, stellar properties, and planetary properties (where applicable) that can all be viewed through multiple interactive plotting interfaces as well as in a tabular format. By analyzing stellar abundances for large populations of stars and from a variety of different perspectives, a wealth of information can be revealed on both large and small scales.
Database Description - Open TG-GATEs Pathological Image Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Open TG-GATEs Pathological Image Database Database Description General information of database Database... name Open TG-GATEs Pathological Image Database Alternative name - DOI 10.18908/lsdba.nbdc00954-0...iomedical Innovation 7-6-8, Saito-asagi, Ibaraki-city, Osaka 567-0085, Japan TEL:81-72-641-9826 Email: Database... classification Toxicogenomics Database Organism Taxonomy Name: Rattus norvegi... Article title: Author name(s): Journal: External Links: Original website information Database
Image-based query-by-example for big databases of galaxy images

Science.gov (United States)

Shamir, Lior; Kuminski, Evan

2017-01-01

Very large astronomical databases containing millions or even billions of galaxy images have become increasingly important tools in astronomy research. However, in many cases the very large size makes it more difficult to analyze these data manually, reinforcing the need for computer algorithms that can automate the data analysis process. An example of such a task is the identification of galaxies of a certain morphology of interest. For instance, if a rare galaxy is identified it is reasonable to expect that more galaxies of similar morphology exist in the database, but it is virtually impossible to manually search these databases to identify such galaxies. Here we describe a computer vision and pattern recognition methodology that receives a galaxy image as input and automatically searches a large dataset of galaxies to return a list of galaxies that are visually similar to the query galaxy. The returned list is not necessarily complete or clean, but it provides a substantial reduction of the original database into a smaller dataset, in which the frequency of objects visually similar to the query galaxy is much higher. Experimental results show that the algorithm can identify rare galaxies such as ring galaxies among datasets of 10,000 astronomical objects.
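The query-by-example idea in this abstract can be illustrated with a minimal Python sketch. It assumes each image has already been reduced to a numeric feature vector and ranks database entries by Euclidean distance to the query. The abstract does not name the descriptors or distance measure used by the authors, so the function names, the three-element vectors, and the distance choice are hypothetical stand-ins, not the published method.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(query, database, top_k=3):
    """Return the names of the top_k database entries closest to the query."""
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Hypothetical descriptors; real image features would be much longer vectors.
catalog = {
    "galaxy_a": [0.9, 0.1, 0.3],
    "galaxy_b": [0.2, 0.8, 0.5],
    "galaxy_c": [0.85, 0.15, 0.35],
    "galaxy_d": [0.1, 0.9, 0.4],
}

result = query_by_example([0.88, 0.12, 0.32], catalog, top_k=2)
print(result)
```

As in the abstract, the returned shortlist is a reduced candidate set rather than a guaranteed complete or clean answer; a human would still inspect it.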
The CSB Incident Screening Database: description, summary statistics and uses.

Science.gov (United States)

Gomez, Manuel R; Casper, Susan; Smith, E Allen

2008-11-15

This paper briefly describes the Chemical Incident Screening Database currently used by the CSB to identify and evaluate chemical incidents for possible investigations, and summarizes descriptive statistics from this database that can potentially help to estimate the number, character, and consequences of chemical incidents in the US. The report compares some of the information in the CSB database to roughly similar information available from databases operated by EPA and the Agency for Toxic Substances and Disease Registry (ATSDR), and explores the possible implications of these comparisons with regard to the dimension of the chemical incident problem. Finally, the report explores in a preliminary way whether a system modeled after the existing CSB screening database could be developed to serve as a national surveillance tool for chemical incidents.
ALFRED: An Allele Frequency Database for Microevolutionary Studies

Directory of Open Access Journals (Sweden)

Kenneth K Kidd

2005-01-01

Full Text Available Many kinds of microevolutionary studies require data on multiple polymorphisms in multiple populations. Increasingly, and especially for human populations, multiple research groups collect relevant data and those data are dispersed widely in the literature. ALFRED has been designed to hold data from many sources and make them available over the web. Data are assembled from multiple sources, curated, and entered into the database. Multiple links to other resources are also established by the curators. A variety of search options are available and additional geographic based interfaces are being developed. The database can serve the human anthropologic genetic community by identifying what loci are already typed on many populations thereby helping to focus efforts on a common set of markers. The database can also serve as a model for databases handling similar DNA polymorphism data for other species.
CRIMINAL LAW PROTECTION OF DATABASE AT A GLANCE

Directory of Open Access Journals (Sweden)

LUCIAN T. POENARU

2012-05-01

Full Text Available Database protection is provided in Romania by the general law on copyright no. 8/1996. According to the law, it is considered to be a crime making available to the public, by any means, the special rights attributed to database owners or copies thereof. This paper will focus on, on the one hand, presenting the way database and database related products can be subject to a copyright general protection and, on the other, revealing the special sui generis right attributed to database owners. In such a context, criminal instruments for protecting such rights seem to be quite annoying for the perpetrator, but less effective when it comes to a proper enforcement by the criminal bodies. This paper will therefore try to compare the way guilty actions of the culprit are effectively sanctioned by the criminal instruments provided by the law. And because the Romanian law on copyright does follow at least the letter of the European Directives on copyright and the protection of database, this paper will also search the spirit of the relevant European case-law and its applicability by the Romanian authorities.
Representational Similarity Analysis Reveals Heterogeneous Networks Supporting Speech Motor Control

DEFF Research Database (Denmark)

Zheng, Zane; Cusack, Rhodri; Johnsrude, Ingrid

The everyday act of speaking involves the complex processes of speech motor control. One important feature of such control is regulation of articulation when auditory concomitants of speech do not correspond to the intended motor gesture. While theoretical accounts of speech monitoring posit multiple functional components required for detection of errors in speech planning (e.g., Levelt, 1983), neuroimaging studies generally indicate either single brain regions sensitive to speech production errors, or small, discrete networks. Here we demonstrate that the complex system controlling speech is supported by a complex neural network that is involved in linguistic, motoric and sensory processing. With the aid of novel real-time acoustic analyses and representational similarity analyses of fMRI signals, our data show functionally differentiated networks underlying auditory feedback control of speech.
Update History of This Database - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Update History of This Database Date Update contents 201...0/03/29 Yeast Interacting Proteins Database English archive site is opened. 2000/12/4 Yeast Interacting Proteins Database...( http://itolab.cb.k.u-tokyo.ac.jp/Y2H/ ) is released. About This Database Database Description... Download License Update History of This Database Site Policy | Contact Us Update History of This Database... - Yeast Interacting Proteins Database | LSDB Archive ...
A similarity measure method combining location feature for mammogram retrieval.

Science.gov (United States)

Wang, Zhiqiong; Xin, Junchang; Huang, Yukun; Li, Chen; Xu, Ling; Li, Yang; Zhang, Hao; Gu, Huizi; Qian, Wei

2018-05-28

Breast cancer, the most common malignancy among women, has a high mortality rate in clinical practice. Early detection, diagnosis and treatment can reduce the mortalities of breast cancer greatly. The method of mammogram retrieval can help doctors to find the early breast lesions effectively and determine a reasonable feature set for image similarity measure. This will improve the accuracy effectively for mammogram retrieval. This paper proposes a similarity measure method combining location feature for mammogram retrieval. Firstly, the images are pre-processed, the regions of interest are detected and the lesions are segmented in order to get the center point and radius of the lesions. Then, the method, namely Coherent Point Drift, is used for image registration with the pre-defined standard image. The center point and radius of the lesions after registration are obtained and the standard location feature of the image is constructed. This standard location feature can help figure out the location similarity between the image pair from the query image to each dataset image in the database. Next, the content feature of the image is extracted, including the Histogram of Oriented Gradients, the Edge Direction Histogram, the Local Binary Pattern and the Gray Level Histogram, and the image pair content similarity can be calculated using the Earth Mover's Distance. Finally, the location similarity and content similarity are fused to form the image fusion similarity, and the specified number of the most similar images can be returned according to it. In the experiment, 440 mammograms, which are from Chinese women in Northeast China, are used as the database. When fusing 40% lesion location feature similarity and 60% content feature similarity, the results have obvious advantages. At this time, precision is 0.83, recall is 0.76, comprehensive indicator is 0.79, satisfaction is 96.0%, mean is 4.2 and variance is 17.7. The results show that the precision and recall of this
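The 40%/60% fusion and the reported scores in this abstract can be checked with a short Python sketch. It assumes the fusion is a simple weighted sum of the two similarities and that the "comprehensive indicator" is the harmonic mean of precision and recall (the F-measure); the abstract states neither formula explicitly, so both are assumptions, and the image names and similarity values are hypothetical. Under the F-measure assumption, the reported precision and recall reproduce the reported 0.79.

```python
def fused_similarity(location_sim, content_sim, location_weight=0.4):
    """Weighted sum of location and content similarity (assumed fusion rule)."""
    return location_weight * location_sim + (1.0 - location_weight) * content_sim


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed 'comprehensive indicator')."""
    return 2 * precision * recall / (precision + recall)


def rank_by_fusion(candidates, top_k):
    """Return the top_k image names ordered by descending fused similarity."""
    scored = {
        name: fused_similarity(location, content)
        for name, (location, content) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Hypothetical (location similarity, content similarity) pairs for one query.
candidates = {
    "image_1": (0.9, 0.5),
    "image_2": (0.4, 0.9),
    "image_3": (0.2, 0.3),
}

ranking = rank_by_fusion(candidates, top_k=2)
indicator = round(f_measure(0.83, 0.76), 2)
print(ranking, indicator)
```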
Database Description - RMOS | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name RMOS Alternative nam...arch Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Microarray Data and other Gene Expression Database...s Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The Ric...19&lang=en Whole data download - Referenced database Rice Expression Database (RED) Rice full-length cDNA Database... (KOME) Rice Genome Integrated Map Database (INE) Rice Mutant Panel Database (Tos17) Rice Genome Annotation Database
Database implementation to fluidized cracking catalytic-FCC process

International Nuclear Information System (INIS)

Santana, Antonio Otavio de; Dantas, Carlos Costa; Santos, Valdemir A. dos

2009-01-01

A fluidized catalytic cracking (FCC) process was developed by our research group. A laboratory-scale cold-model FCC unit was used to obtain data on the following parameters: air flow, system pressure, riser inlet pressure, riser outlet pressure, pressure drop in the riser, motor speed of catalyst injection, and density. Density is measured by gamma-ray transmission. Because the FCC process had no database until then, the present work addresses this gap by implementing a database connected to the Matlab software. The data from the laboratory-model FCC unit are obtained as MS-Excel spreadsheets. These spreadsheets were processed before being imported as database tables. Applying database normalization and analyzing the processed spreadsheets in MS-Access showed that a single relation (table) is sufficient to represent the database. MS-Access was chosen as the database management system (DBMS) because it meets our data-flow needs. The next step was the creation of the database: the data table, the action query, the selection query, and the macro for importing data from the FCC unit under study were built. An interface between the 'Database Toolbox' application (Matlab2008a) and the database was also created, using ODBC (Open Database Connectivity) drivers. This interface allows users working in Matlab to manipulate the database. (author)
Database implementation to fluidized cracking catalytic-FCC process

Energy Technology Data Exchange (ETDEWEB)

Santana, Antonio Otavio de; Dantas, Carlos Costa, E-mail: aos@ufpe.b [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil). Dept. de Energia Nuclear; Santos, Valdemir A. dos, E-mail: valdemir.alexandre@pq.cnpq.b [Universidade Catolica de Pernambuco, Recife, PE (Brazil). Centro de Ciencia e Tecnologia

2009-07-01

A fluidized catalytic cracking (FCC) process was developed by our research group. A laboratory-scale cold-model FCC unit was used to obtain data on the following parameters: air flow, system pressure, riser inlet pressure, riser outlet pressure, pressure drop in the riser, motor speed of catalyst injection, and density. Density is measured by gamma-ray transmission. Because the FCC process had no database until then, the present work addresses this gap by implementing a database connected to the Matlab software. The data from the laboratory-model FCC unit are obtained as MS-Excel spreadsheets. These spreadsheets were processed before being imported as database tables. Applying database normalization and analyzing the processed spreadsheets in MS-Access showed that a single relation (table) is sufficient to represent the database. MS-Access was chosen as the database management system (DBMS) because it meets our data-flow needs. The next step was the creation of the database: the data table, the action query, the selection query, and the macro for importing data from the FCC unit under study were built. An interface between the 'Database Toolbox' application (Matlab2008a) and the database was also created, using ODBC (Open Database Connectivity) drivers. This interface allows users working in Matlab to manipulate the database. (author)
Identification of proteins similar to AvrE type III effector proteins from ...

African Journals Online (AJOL)

Stephen Opiyo

GSE22274), and AraCyc databases, we highlighted 16 protein candidates from Arabidopsis genome .... projection method similar to principal component analysis (PCA) .... RIN4 RIN4 (RPM1 INTERACTING PROTEIN 4); protein binding.
KALIMER database development (database configuration and design methodology)

International Nuclear Information System (INIS)

Jeong, Kwan Seong; Kwon, Young Min; Lee, Young Bum; Chang, Won Pyo; Hahn, Do Hee

2001-10-01

KALIMER Database is an advanced database that provides integrated management for Liquid Metal Reactor Design Technology Development using Web applications. The KALIMER Design database consists of the Results Database, Inter-Office Communication (IOC), the 3D CAD Database, the Team Cooperation System, and Reserved Documents. The Results Database holds the research results from phase II of Liquid Metal Reactor Design Technology Development under the mid-term and long-term nuclear R and D program. IOC is a linkage control system between sub-projects for sharing and integrating the research results for KALIMER. The 3D CAD Database is a schematic design overview for KALIMER. The Team Cooperation System informs team members of research cooperation and meetings. Finally, KALIMER Reserved Documents was developed to manage the data and documents collected since project accomplishment. This report describes the hardware and software features and the database design methodology for KALIMER.
Database Description - SAHG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name SAHG Alternative nam...h: Contact address Chie Motono Tel : +81-3-3599-8067 E-mail : Database classification Structure Databases - ...e databases - Protein properties Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description... Links: Original website information Database maintenance site The Molecular Profiling Research Center for D...stration Not available About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Database Description - SAHG | LSDB Archive ...

A Framework for Analysis of Music Similarity Measures

DEFF Research Database (Denmark)

Jensen, Jesper Højvang; Christensen, Mads G.; Jensen, Søren Holdt

2007-01-01

To analyze specific properties of music similarity measures that the commonly used genre classification evaluation procedure does not reveal, we introduce a MIDI based test framework for music similarity measures. We introduce the framework by example and thus outline an experiment to analyze the...
Relational databases for SSC design and control

International Nuclear Information System (INIS)

Barr, E.; Peggs, S.; Saltmarsh, C.

1989-01-01

Most people agree that a database is A Good Thing, but there is much confusion in the jargon used, and in what jobs a database management system and its peripheral software can and cannot do. During the life cycle of an enormous project like the SSC, from conceptual and theoretical design, through research and development, to construction, commissioning and operation, an enormous amount of data will be generated. Some of these data, originating in the early parts of the project, will be needed during commissioning or operation, many years in the future. Two of these pressing data management needs, from the magnet research and industrialization programs and from the lattice design, have prompted work on understanding and adapting commercial database practices for scientific projects. Modern relational database management systems (rDBMS's) cope naturally with a large proportion of the requirements of data structures, like the SSC database structure built for the superconducting cable supplies, uses, and properties. This application is similar to the commercial applications for which these database systems were developed. The SSC application has further requirements not immediately satisfied by the commercial systems. These derive from the diversity of the data structures to be managed, the changing emphases and uses during the project lifetime, and the large amount of scientific data processing to be expected. 4 refs., 5 figs
Database Description - PSCDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name PSCDB Alternative n...rial Science and Technology (AIST) Takayuki Amemiya E-mail: Database classification Structure Databases - Protein structure Database...554-D558. External Links: Original website information Database maintenance site Graduate School of Informat...available URL of Web services - Need for user registration Not available About This Database Database Descri...ption Download License Update History of This Database Site Policy | Contact Us Database Description - PSCDB | LSDB Archive ...
Database Description - ASTRA | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name ASTRA Alternative n...tics Journal Search: Contact address Database classification Nucleotide Sequence Databases - Gene structure,...3702 Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The database represents classified p...(10):1211-6. External Links: Original website information Database maintenance site National Institute of Ad... for user registration Not available About This Database Database Description Dow
Database Description - RPD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RPD Alternative name Rice Proteome Database...titute of Crop Science, National Agriculture and Food Research Organization Setsuko Komatsu E-mail: Database... classification Proteomics Resources Plant databases - Rice Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database... description Rice Proteome Database contains information on protei...and entered in the Rice Proteome Database. The database is searchable by keyword,
Database Description - PLACE | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name PLACE Alternative name A Database...Kannondai, Tsukuba, Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Databas...e classification Plant databases Organism Taxonomy Name: Tracheophyta Taxonomy ID: 58023 Database...99, Vol.27, No.1 :297-300 External Links: Original website information Database maintenance site National In...- Need for user registration Not available About This Database Database Descripti
Geography and similarity of regional cuisines in China.

Science.gov (United States)

Zhu, Yu-Xiao; Huang, Junming; Zhang, Zi-Ke; Zhang, Qian-Ming; Zhou, Tao; Ahn, Yong-Yeol

2013-01-01

Food occupies a central position in every culture and it is therefore of great interest to understand the evolution of food culture. The advent of the World Wide Web and online recipe repositories has begun to provide unprecedented opportunities for data-driven, quantitative study of food culture. Here we harness an online database documenting recipes from various Chinese regional cuisines and investigate the similarity of regional cuisines in terms of geography and climate. We find that geographical proximity, rather than climate proximity, is a crucial factor that determines the similarity of regional cuisines. We develop a model of regional cuisine evolution that provides helpful clues for understanding the evolution of cuisines and cultures.
Geography and Similarity of Regional Cuisines in China

Science.gov (United States)

Zhu, Yu-Xiao; Huang, Junming; Zhang, Zi-Ke; Zhang, Qian-Ming; Zhou, Tao; Ahn, Yong-Yeol

2013-01-01

Food occupies a central position in every culture and it is therefore of great interest to understand the evolution of food culture. The advent of the World Wide Web and online recipe repositories has begun to provide unprecedented opportunities for data-driven, quantitative study of food culture. Here we harness an online database documenting recipes from various Chinese regional cuisines and investigate the similarity of regional cuisines in terms of geography and climate. We find that geographical proximity, rather than climate proximity, is a crucial factor that determines the similarity of regional cuisines. We develop a model of regional cuisine evolution that provides helpful clues for understanding the evolution of cuisines and cultures. PMID:24260166
Database Description - JSNP | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name JSNP Alternative nam...n Science and Technology Agency Creator Affiliation: Contact address E-mail : Database...sapiens Taxonomy ID: 9606 Database description A database of about 197,000 polymorphisms in Japanese populat...1):605-610 External Links: Original website information Database maintenance site Institute of Medical Scien...er registration Not available About This Database Database Description Download License Update History of This Database
Database Description - RED | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RED Alternative name Rice Expression Database...enome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Database classifi...cation Microarray, Gene Expression Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database descripti... Article title: Rice Expression Database: the gateway to rice functional genomics...nt Science (2002) Dec 7 (12):563-564 External Links: Original website information Database maintenance site
Potential translational targets revealed by linking mouse grooming behavioral phenotypes to gene expression using public databases.

Science.gov (United States)

Roth, Andrew; Kyzar, Evan J; Cachat, Jonathan; Stewart, Adam Michael; Green, Jeremy; Gaikwad, Siddharth; O'Leary, Timothy P; Tabakoff, Boris; Brown, Richard E; Kalueff, Allan V

2013-01-10

Rodent self-grooming is an important, evolutionarily conserved behavior, highly sensitive to pharmacological and genetic manipulations. Mice with aberrant grooming phenotypes are currently used to model various human disorders. Therefore, it is critical to understand the biology of grooming behavior, and to assess its translational validity to humans. The present in-silico study used publicly available gene expression and behavioral data obtained from several inbred mouse strains in the open-field, light-dark box, elevated plus- and elevated zero-maze tests. As grooming duration differed between strains, our analysis revealed several candidate genes with significant correlations between gene expression in the brain and grooming duration. The Allen Brain Atlas, STRING, GoMiner and Mouse Genome Informatics databases were used to functionally map and analyze these candidate mouse genes against their human orthologs, assessing the strain ranking of their expression and the regional distribution of expression in the mouse brain. This allowed us to identify an interconnected network of candidate genes that have expression levels correlating with grooming behavior, display altered patterns of expression in key brain areas related to grooming, and underlie important functions in the brain. Collectively, our results demonstrate the utility of large-scale, high-throughput data-mining and in-silico modeling for linking genomic and behavioral data, as well as their potential to identify novel neural targets for complex neurobehavioral phenotypes, including grooming. Copyright © 2012 Elsevier Inc. All rights reserved.
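The screening step described in this abstract, correlating per-strain gene expression with per-strain grooming duration, can be sketched in Python as follows. The abstract does not say which correlation statistic or threshold was used, so the Pearson coefficient and the 0.9 cutoff here are assumptions, and the gene names and all numbers are invented purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical grooming duration (seconds) for five mouse strains.
grooming = [12.0, 18.0, 25.0, 31.0, 40.0]

# Hypothetical brain expression levels for the same five strains, in order.
expression = {
    "gene_x": [1.0, 1.6, 2.3, 2.9, 3.8],
    "gene_y": [5.0, 4.8, 5.1, 4.9, 5.0],
    "gene_z": [9.0, 7.5, 6.1, 4.8, 3.0],
}

# Keep genes whose expression tracks grooming strongly in either direction.
correlations = {gene: pearson(levels, grooming) for gene, levels in expression.items()}
candidates = {gene: r for gene, r in correlations.items() if abs(r) >= 0.9}
print(sorted(candidates))
```

With only a handful of strains, a real analysis would also need a significance test and multiple-comparison correction before calling any gene a candidate.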
Database Description - ConfC | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name ConfC Alternative name Database...amotsu Noguchi Tel: 042-495-8736 E-mail: Database classification Structure Database...s - Protein structure Structure Databases - Small molecules Structure Databases - Nucleic acid structure Database... services - Need for user registration - About This Database Database Description Download License Update History of This Database... Site Policy | Contact Us Database Description - ConfC | LSDB Archive ...
The online database MaarjAM reveals global and ecosystemic distribution patterns in arbuscular mycorrhizal fungi (Glomeromycota).

Science.gov (United States)

Opik, M; Vanatoa, A; Vanatoa, E; Moora, M; Davison, J; Kalwij, J M; Reier, U; Zobel, M

2010-10-01

• Here, we describe a new database, MaarjAM, that summarizes publicly available Glomeromycota DNA sequence data and associated metadata. The goal of the database is to facilitate the description of distribution and richness patterns in this group of fungi. • Small subunit (SSU) rRNA gene sequences and available metadata were collated from all suitable taxonomic and ecological publications. These data have been made accessible in an open-access database (http://maarjam.botany.ut.ee). • Two hundred and eighty-two SSU rRNA gene virtual taxa (VT) were described based on a comprehensive phylogenetic analysis of all collated Glomeromycota sequences. Two-thirds of VT showed limited distribution ranges, occurring in single current or historic continents or climatic zones. Those VT that associated with a taxonomically wide range of host plants also tended to have a wide geographical distribution, and vice versa. No relationships were detected between VT richness and latitude, elevation or vascular plant richness. • The collated Glomeromycota molecular diversity data suggest limited distribution ranges in most Glomeromycota taxa and a positive relationship between the width of a taxon's geographical range and its host taxonomic range. Inconsistencies between molecular and traditional taxonomy of Glomeromycota, and shortage of data from major continents and ecosystems, are highlighted.
Patient safety in dentistry - state of play as revealed by a national database of errors.

Science.gov (United States)

Thusu, S; Panesar, S; Bedi, R

2012-08-01

Modern dentistry has become increasingly invasive and sophisticated. Consequently the risk to the patient has increased. The aim of this study is to investigate the types of patient safety incidents (PSIs) that occur in dentistry and the accuracy of the National Patient Safety Agency (NPSA) database in identifying those attributed to dentistry. The database was analysed for all incidents of iatrogenic harm in the speciality of dentistry. A snapshot view using the timeframe January to December 2009 was used. The free text elements from the database were analysed thematically and reclassified according to the nature of the PSI. Descriptive statistics were provided. Two thousand and twelve incident reports were analysed and organised into ten categories. The commonest was due to clerical errors - 36%. Five areas of PSI were further analysed: injury (10%), medical emergency (6%), inhalation/ingestion (4%), adverse reaction (4%) and wrong site extraction (2%). There is generally low reporting of PSIs within the dental specialities. This may be attributed to the voluntary nature of reporting and the reluctance of dental practitioners to disclose incidents for fear of loss of earnings. A significant amount of iatrogenic harm occurs not during treatment but through controllable pre- and post-procedural checks. Incidents of iatrogenic harm to dental patients do occur but their reporting is not widely used. The use of a dental specific reporting system would aid in minimising iatrogenic harm and adhere to the Care Quality Commission (CQC) compliance monitoring system on essential standards of quality and safety in dental practices.
Self-similar solution for coupled thermal electromagnetic model ...

African Journals Online (AJOL)

An investigation into the existence and uniqueness of self-similar solutions for the coupled Maxwell and Pennes bio-heat equations has been carried out. Criteria for existence and uniqueness of the self-similar solution are revealed in the consequent theorems. Journal of the Nigerian Association of Mathematical Physics ...
Database management systems understanding and applying database technology

CERN Document Server

Gorman, Michael M

1991-01-01

Database Management Systems: Understanding and Applying Database Technology focuses on the processes, methodologies, techniques, and approaches involved in database management systems (DBMSs). The book first takes a look at ANSI database standards and DBMS applications and components. Discussions focus on application components and DBMS components, implementing the dynamic relationship application, problems and benefits of dynamic relationship DBMSs, nature of a dynamic relationship application, ANSI/NDL, and DBMS standards. The manuscript then ponders on logical database, interrogation, and phy
The STEP database through the end-users eyes--USABILITY STUDY.

Science.gov (United States)

Salunke, Smita; Tuleu, Catherine

2015-08-15

The user-designed database of Safety and Toxicity of Excipients for Paediatrics ("STEP") is created to address the shared need of drug development community to access the relevant information of excipients effortlessly. Usability testing was performed to validate if the database satisfies the need of the end-users. Evaluation framework was developed to assess the usability. The participants performed scenario based tasks and provided feedback and post-session usability ratings. Failure Mode Effect Analysis (FMEA) was performed to prioritize the problems and improvements to the STEP database design and functionalities. The study revealed several design vulnerabilities. Tasks such as limiting the results, running complex queries, location of data and registering to access the database were challenging. The three critical attributes identified to have impact on the usability of the STEP database included (1) content and presentation (2) the navigation and search features (3) potential end-users. Evaluation framework proved to be an effective method for evaluating database effectiveness and user satisfaction. This study provides strong initial support for the usability of the STEP database. Recommendations would be incorporated into the refinement of the database to improve its usability and increase user participation towards the advancement of the database. Copyright © 2015 Elsevier B.V. All rights reserved.
Evaluating gender similarities and differences using metasynthesis.

Science.gov (United States)

Zell, Ethan; Krizan, Zlatan; Teeter, Sabrina R

2015-01-01

Despite the common lay assumption that males and females are profoundly different, Hyde (2005) used data from 46 meta-analyses to demonstrate that males and females are highly similar. Nonetheless, the gender similarities hypothesis has remained controversial. Since Hyde's provocative report, there has been an explosion of meta-analytic interest in psychological gender differences. We utilized this enormous collection of 106 meta-analyses and 386 individual meta-analytic effects to reevaluate the gender similarities hypothesis. Furthermore, we employed a novel data-analytic approach called metasynthesis (Zell & Krizan, 2014) to estimate the average difference between males and females and to explore moderators of gender differences. The average, absolute difference between males and females across domains was relatively small (d = 0.21, SD = 0.14), with the majority of effects being either small (46%) or very small (39%). Magnitude of differences fluctuated somewhat as a function of the psychological domain (e.g., cognitive variables, social and personality variables, well-being), but remained largely constant across age, culture, and generations. These findings provide compelling support for the gender similarities hypothesis, but also underscore conditions under which gender differences are most pronounced. PsycINFO Database Record (c) 2015 APA, all rights reserved.
Construction of patient specific atlases from locally most similar anatomical pieces

Science.gov (United States)

Ramus, Liliane; Commowick, Olivier; Malandain, Grégoire

2010-01-01

Radiotherapy planning requires accurate delineations of the critical structures. To avoid manual contouring, atlas-based segmentation can be used to get automatic delineations. However, the results strongly depend on the chosen atlas, especially for the head and neck region where the anatomical variability is high. To address this problem, atlases adapted to the patient’s anatomy may allow for a better registration, and already showed an improvement in segmentation accuracy. However, building such atlases requires the definition of a criterion to select among a database the images that are the most similar to the patient. Moreover, the inter-expert variability of manual contouring may be high, and therefore bias the segmentation if selecting only one image for each region. To tackle these issues, we present an original method to design a piecewise most similar atlas. Given a query image, we propose an efficient criterion to select for each anatomical region the K most similar images among a database by considering local volume variations possibly induced by the tumor. Then, we present a new approach to combine the K images selected for each region into a piecewise most similar template. Our results obtained with 105 CT images of the head and neck show that our method reduces the over-segmentation seen with an average atlas while being robust to inter-expert manual segmentation variability. PMID:20879395
Database Description - RMG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RMG Alternative name ...raki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Database... classification Nucleotide Sequence Databases Organism Taxonomy Name: Oryza sativa Japonica Group Taxonomy ID: 39947 Database...rnal: Mol Genet Genomics (2002) 268: 434–445 External Links: Original website information Database...available URL of Web services - Need for user registration Not available About This Database Database Descri

Designing the Database of Speech Under Stress

Directory of Open Access Journals (Sweden)

Sabo Róbert

2017-12-01

Full Text Available This study describes the methodology used for designing a database of speech under real stress. Based on the limits of existing stress databases, we used a communication task via a computer game to collect speech data. To validate the presence of stress, known psychophysiological indicators such as heart rate and electrodermal activity, as well as subjective self-assessment, were used. This paper presents the data from the first 5 speakers (3 men, 2 women) who participated in initial tests of the proposed design. In 4 out of 5 speakers, increases in fundamental frequency and intensity of speech were registered. Similarly, in 4 out of 5 speakers heart rate was significantly increased during the task, when compared with a reference measurement from before the task. These first results show that the proposed design might be appropriate for building a speech under stress database. However, there are still considerations that need to be addressed.
Cross-kingdom similarities in microbiome functions

NARCIS (Netherlands)

Mendes, R.; Raaijmakers, J.M.

2015-01-01

Recent advances in medical research have revealed how humans rely on their microbiome for diverse traits and functions. Similarly, microbiomes of other higher organisms play key roles in disease, health, growth and development of their host. Exploring microbiome functions across kingdoms holds
Relational databases

CERN Document Server

Bell, D A

1986-01-01

Relational Databases explores the major advances in relational databases and provides a balanced analysis of the state of the art in relational databases. Topics covered include capture and analysis of data placement requirements; distributed relational database systems; data dependency manipulation in database schemata; and relational database support for computer graphics and computer aided design. This book is divided into three sections and begins with an overview of the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration. The
Securing SQL Server Protecting Your Database from Attackers

CERN Document Server

Cherry, Denny

2011-01-01

There is a lot at stake for administrators taking care of servers, since they house sensitive data like credit cards, social security numbers, medical records, and much more. In Securing SQL Server you will learn about the potential attack vectors that can be used to break into your SQL Server database, and how to protect yourself from these attacks. Written by a Microsoft SQL Server MVP, you will learn how to properly secure your database, from both internal and external threats. Best practices and specific tricks employed by the author will also be revealed. Learn expert techniques to protec
Database Description - DGBY | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name DGBY Alternative name Database...EL: +81-29-838-8066 E-mail: Database classification Microarray Data and other Gene Expression Databases Orga...nism Taxonomy Name: Saccharomyces cerevisiae Taxonomy ID: 4932 Database descripti...-called phenomics). We uploaded these data on this website which is designated DGBY(Database for Gene expres...ma J, Ando A, Takagi H. Journal: Yeast. 2008 Mar;25(3):179-90. External Links: Original website information Database
Database Description - KOME | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name KOME Alternative nam... Sciences Plant Genome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice ...Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description Information about approximately ...Hayashizaki Y, Kikuchi S. Journal: PLoS One. 2007 Nov 28; 2(11):e1235. External Links: Original website information Database...OS) Rice mutant panel database (Tos17) A Database of Plant Cis-acting Regulatory
A High-Resolution LC-MS-Based Secondary Metabolite Fingerprint Database of Marine Bacteria

KAUST Repository

Lu, Liang; Wang, Jijie; Xu, Ying; Wang, Kailing; Hu, Yingwei; Tian, Renmao; Yang, Bo; Lai, Qiliang; Li, Yongxin; Zhang, Weipeng; Shao, Zongze; Lam, Henry; Qian, Pei-Yuan

2014-01-01

© 2014 Macmillan Publishers Limited. All rights reserved. Marine bacteria are the most widely distributed organisms in the ocean environment and produce a wide variety of secondary metabolites. However, traditional screening for bioactive natural compounds is greatly hindered by the lack of a systematic way of cataloguing the chemical profiles of bacterial strains found in nature. Here we present a chemical fingerprint database of marine bacteria based on their secondary metabolite profiles, acquired by high-resolution LC-MS. Till now, 1,430 bacterial strains spanning 168 known species collected from different marine environments were cultured and profiled. Using this database, we demonstrated that secondary metabolite profile similarity is approximately, but not always, correlated with taxonomical similarity. We also validated the ability of this database to find species-specific metabolites, as well as to discover known bioactive compounds from previously unknown sources. An online interface to this database, as well as the accompanying software, is provided freely for the community to use.
NPACT: Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target database

Science.gov (United States)

Mangal, Manu; Sagar, Parul; Singh, Harinder; Raghava, Gajendra P. S.; Agarwal, Subhash M.

2013-01-01

Plant-derived molecules have been highly valued by biomedical researchers and pharmaceutical companies for developing drugs, as they are thought to be optimized during evolution. Therefore, we have collected and compiled a central resource Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target database (NPACT, http://crdd.osdd.net/raghava/npact/) that gathers the information related to experimentally validated plant-derived natural compounds exhibiting anti-cancerous activity (in vitro and in vivo), to complement the other databases. It currently contains 1574 compound entries, and each record provides information on their structure, manually curated published data on in vitro and in vivo experiments along with reference for users referral, inhibitory values (IC50/ED50/EC50/GI50), properties (physical, elemental and topological), cancer types, cell lines, protein targets, commercial suppliers and drug likeness of compounds. NPACT can easily be browsed or queried using various options, and an online similarity tool has also been made available. Further, to facilitate retrieval of existing data, each record is hyperlinked to similar databases like SuperNatural, Herbal Ingredients’ Targets, Comparative Toxicogenomics Database, PubChem and NCI-60 GI50 data. PMID:23203877
Databases

Directory of Open Access Journals (Sweden)

Nick Ryan

2004-01-01

Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.
A Minimum Spanning Tree Representation of Anime Similarities

OpenAIRE

Wibowo, Canggih Puspo

2016-01-01

In this work, a new way to represent Japanese animation (anime) is presented. We applied a minimum spanning tree to show the relation between anime. The distance between anime is calculated through three similarity measurements, namely crew, score histogram, and topic similarities. Finally, the centralities are also computed to reveal the most significant anime. The result shows that the minimum spanning tree can be used to determine the similarity of anime. Furthermore, by using centralities c...
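As a rough illustration of the minimum spanning tree idea in this abstract, the following is a minimal sketch using Kruskal's algorithm on a small distance matrix. The titles and distance values are hypothetical placeholders, and the choice of Kruskal's algorithm is an assumption; the abstract does not say how the tree was computed or how the three similarity measures were combined into one distance.

```python
def minimum_spanning_tree(names, dist):
    """Kruskal's algorithm on a symmetric distance matrix.

    Returns the tree as a list of (name_a, name_b, distance) edges.
    """
    parent = list(range(len(names)))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All undirected edges, shortest first.
    edges = sorted(
        (dist[i][j], i, j)
        for i in range(len(names))
        for j in range(i + 1, len(names))
    )
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # keep the edge only if it joins two components
            parent[root_i] = root_j
            tree.append((names[i], names[j], d))
    return tree


# Hypothetical titles and distances (e.g. 1 - similarity), for illustration only.
titles = ["A", "B", "C", "D"]
distances = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.4, 0.7],
    [0.9, 0.4, 0.0, 0.3],
    [0.8, 0.7, 0.3, 0.0],
]
mst = minimum_spanning_tree(titles, distances)
for a, b, d in mst:
    print(f"{a} - {b}: {d}")
```

A tree over n items always has n - 1 edges, so each title stays connected to its closest neighbours without the clutter of a full similarity graph.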
Audio Query by Example Using Similarity Measures between Probability Density Functions of Features

Directory of Open Access Journals (Sweden)

Marko Helén

2010-01-01

Full Text Available This paper proposes a query by example system for generic audio. We estimate the similarity of the example signal and the samples in the queried database by calculating the distance between the probability density functions (pdfs) of their frame-wise acoustic features. Since the features are continuous valued, we propose to model them using Gaussian mixture models (GMMs) or hidden Markov models (HMMs). The models parametrize each sample efficiently and retain sufficient information for similarity measurement. To measure the distance between the models, we apply a novel Euclidean distance, approximations of Kullback-Leibler divergence, and a cross-likelihood ratio test. The performance of the measures was tested in simulations where audio samples are automatically retrieved from a general audio database, based on the estimated similarity to a user-provided example. The simulations show that the distance between probability density functions is an accurate measure for similarity. Measures based on GMMs or HMMs are shown to produce better results than existing methods based on simpler statistics or histograms of the features. A good performance with low computational cost is obtained with the proposed Euclidean distance.
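To make the notion of a distance between feature pdfs concrete, here is a minimal sketch that ranks database samples by a symmetrised Kullback-Leibler divergence. It assumes each sample is summarised by a single univariate Gaussian (mean, standard deviation), which is a deliberate simplification: the paper models features with GMMs or HMMs, for which the divergence has no closed form and must be approximated. The sample names and parameter values are hypothetical.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def symmetric_kl(p, q):
    """Symmetrised divergence: KL(p || q) + KL(q || p)."""
    return kl_gaussian(*p, *q) + kl_gaussian(*q, *p)


def rank_by_similarity(example, database):
    """Return database keys ordered from most to least similar to the example."""
    return sorted(database, key=lambda name: symmetric_kl(example, database[name]))


# Hypothetical (mean, std) summaries of one acoustic feature per sample.
example = (0.0, 1.0)
database = {
    "speech": (0.1, 1.0),
    "music": (2.0, 0.5),
    "noise": (0.0, 3.0),
}
ranking = rank_by_similarity(example, database)
print(ranking)
```

Symmetrising matters here because plain KL divergence depends on argument order, which would make the ranking depend on whether the example or the database sample is treated as the reference.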
P2-35: The KU Facial Expression Database: A Validated Database of Emotional and Conversational Expressions

Directory of Open Access Journals (Sweden)

Haenah Lee

2012-10-01

Full Text Available Facial expressions are one of the most important means of nonverbal communication, conveying both emotional and conversational content. For investigating this large space of expressions we recently developed a large database containing dynamic emotional and conversational expressions in Germany (MPI facial expression database). As facial expressions crucially depend on the cultural context, however, a similar resource is needed for studies outside of Germany. Here, we introduce and validate a new, extensive Korean facial expression database containing dynamic emotional and conversational information. Ten individuals performed 62 expressions following a method-acting protocol, in which each person was asked to imagine themselves in one of 62 corresponding everyday scenarios and to react accordingly. To validate this database, we conducted two experiments: 20 participants were asked to name the appropriate expression for each of the 62 everyday scenarios shown as text. Ten additional participants were asked to name each of the 62 expression videos from 10 actors in addition to rating its naturalness. All naming answers were then rated as valid or invalid. Scenario validation yielded 89% valid answers showing that the scenarios are effective in eliciting appropriate expressions. Video sequences were judged as natural with an average of 66% valid answers. This is an excellent result considering that videos were seen without any conversational context and that 62 expressions were to be recognized. These results validate our Korean database and, as they also parallel the German validation results, will enable detailed cross-cultural comparisons of the complex space of emotional and conversational expressions.
Testing Self-Similarity Through Lamperti Transformations

KAUST Repository

Lee, Myoungji

2016-07-14

Self-similar processes have been widely used in modeling real-world phenomena occurring in environmetrics, network traffic, image processing, and stock pricing, to name but a few. The estimation of the degree of self-similarity has been studied extensively, while statistical tests for self-similarity are scarce and limited to processes indexed in one dimension. This paper proposes a statistical hypothesis test procedure for self-similarity of a stochastic process indexed in one dimension and multi-self-similarity for a random field indexed in higher dimensions. If self-similarity is not rejected, our test provides a set of estimated self-similarity indexes. The key is to test stationarity of the inverse Lamperti transformations of the process. The inverse Lamperti transformation of a self-similar process is a strongly stationary process, revealing a theoretical connection between the two processes. To demonstrate the capability of our test, we test self-similarity of fractional Brownian motions and sheets, their time deformations and mixtures with Gaussian white noise, and the generalized Cauchy family. We also apply the self-similarity test to real data: annual minimum water levels of the Nile River, network traffic records, and surface heights of food wrappings. © 2016, International Biometric Society.
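The link between self-similarity and stationarity that this test relies on can be sketched with the inverse Lamperti transformation in its common textbook form, Y(t) = exp(-H t) X(exp(t)), where H is the self-similarity index. Whether the paper uses exactly this convention is an assumption. The example below applies the transform to a deterministic power law X(s) = c s^H, a stand-in chosen only because its output is easy to check by hand; a real test would work with sampled stochastic paths and assess stationarity in distribution.

```python
import math


def inverse_lamperti(x, hurst, times):
    """Evaluate Y(t) = exp(-H * t) * X(exp(t)) at each t in `times`.

    `x` is a callable giving X(s) for s > 0 and `hurst` is the index H.
    """
    return [math.exp(-hurst * t) * x(math.exp(t)) for t in times]


H = 0.7  # hypothetical self-similarity index


def power_law(s):
    # Deterministic stand-in satisfying X(a * s) = a**H * X(s).
    return 3.0 * s ** H


times = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = inverse_lamperti(power_law, H, times)
print(y)
```

For this stand-in the transformed values are constant in t, the deterministic analogue of stationarity. Transforming with a wrong index leaves a residual trend exp((H_true - H) t), which is the kind of departure a stationarity test can pick up.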
Automated dating of the world’s language families based on lexical similarity

OpenAIRE

Holman, E.; Brown, C.; Wichmann, S.; Müller, A.; Velupillai, V.; Hammarström, H.; Sauppe, S.; Jung, H.; Bakker, D.; Brown, P.; Belyaev, O.; Urban, M.; Mailhammer, R.; List, J.; Egorov, D.

2011-01-01

This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus is more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Leve...
A searchable cross-platform gene expression database reveals connections between drug treatments and disease

Directory of Open Access Journals (Sweden)

Williams Gareth

2012-01-01

Full Text Available Abstract Background Transcriptional data covering multiple platforms and species is collected and processed into a searchable platform independent expression database (SPIED. SPIED consists of over 100,000 expression fold profiles defined independently of control/treatment assignment and mapped to non-redundant gene lists. The database is thus searchable with query profiles defined over genes alone. The motivation behind SPIED is that transcriptional profiles can be quantitatively compared and ranked and thus serve as effective surrogates for comparing the underlying biological states across multiple experiments. Results Drug perturbation, cancer and neurodegenerative disease derived transcriptional profiles are shown to be effective descriptors of the underlying biology as they return related drugs and pathologies from SPIED. In the case of Alzheimer's disease there is high transcriptional overlap with other neurodegenerative conditions and rodent models of neurodegeneration and nerve injury. Combining the query signature with correlating profiles allows for the definition of a tight neurodegeneration signature that successfully highlights many neuroprotective drugs in the Broad connectivity map. Conclusions Quantitative querying of expression data from across the totality of deposited experiments is an effective way of discovering connections between different biological systems and in particular that between drug action and biological disease state. Examples in cancer and neurodegenerative conditions validate the utility of SPIED.
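The idea of querying a database with a profile defined over genes alone can be sketched as follows. This is only an illustration: it assumes Pearson correlation over the genes shared by the query and each stored profile as the ranking score, which the abstract does not confirm as SPIED's actual scoring scheme. Gene names, profile names, fold values and the `min_shared` threshold are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_profiles(query, database, min_shared=3):
    """Rank stored profiles by correlation with the query over shared genes."""
    scored = []
    for name, profile in database.items():
        shared = sorted(set(query) & set(profile))
        if len(shared) < min_shared:
            continue  # too little overlap for a meaningful comparison
        r = pearson([query[g] for g in shared], [profile[g] for g in shared])
        scored.append((name, r))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical log fold-change profiles keyed by gene symbol.
query = {"G1": 1.0, "G2": -1.0, "G3": 0.5, "G4": -0.5}
database = {
    "drug_x": {"G1": 2.0, "G2": -2.0, "G3": 1.0, "G4": -1.0},
    "disease_y": {"G1": -1.0, "G2": 1.0, "G3": -0.5},
    "sparse": {"G1": 1.0},
}
ranking = rank_profiles(query, database)
print(ranking)
```

Strongly positive scores point to profiles that move with the query, while strongly negative scores point to anti-correlated profiles, which is the pattern of interest when looking for a treatment that opposes a disease signature.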
Database Description - SSBD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name SSBD Alternative nam...ss 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan, RIKEN Quantitative Biology Center Shuichi Onami E-mail: Database... classification Other Molecular Biology Databases Database classification Dynamic databa...elegans Taxonomy ID: 6239 Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database description Systems Scie...i Onami Journal: Bioinformatics/April, 2015/Volume 31, Issue 7 External Links: Original website information Database
Database Description - GETDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name GETDB Alternative n...ame Gal4 Enhancer Trap Insertion Database DOI 10.18908/lsdba.nbdc00236-000 Creator Creator Name: Shigeo Haya... Chuo-ku, Kobe 650-0047 Tel: +81-78-306-3185 FAX: +81-78-306-3183 E-mail: Database classification Expression... Invertebrate genome database Organism Taxonomy Name: Drosophila melanogaster Taxonomy ID: 7227 Database des...riginal website information Database maintenance site Drosophila Genetic Resource
JICST Factual Database: JICST Chemical Substance Safety Regulation Database

Science.gov (United States)

Abe, Atsushi; Sohma, Tohru

JICST Chemical Substance Safety Regulation Database is based on the Database of Safety Laws for Chemical Compounds constructed in 1987 by the Japan Chemical Industry Ecology-Toxicology & Information Center (JETOC), sponsored by the Science and Technology Agency. JICST has modified the JETOC database system, added data and started the online service through JOIS-F (JICST Online Information Service-Factual database) in January 1990. The JICST database comprises eighty-three laws and fourteen hundred compounds. The authors outline the database, data items, files and search commands. An example of an online session is presented.

A database of major breakwaters around the world

NARCIS (Netherlands)

Allsop, N.W.H.; Cork, R.S.; Verhagen, H.J.

2009-01-01

This paper introduces a co-operative project between HR Wallingford UK (HRW) and Delft University of Technology, Netherlands, (TUD) to develop, populate, and then to apply a database on all major breakwaters around the world. It builds on, and revives, similar initiatives that originate in the late
Developing an Online Database of National and Sub-National Clean Energy Policies

Energy Technology Data Exchange (ETDEWEB)

Haynes, R.; Cross, S.; Heinemann, A.; Booth, S.

2014-06-01

The Database of State Incentives for Renewables and Efficiency (DSIRE) was established in 1995 to provide summaries of energy efficiency and renewable energy policies offered by the federal and state governments. This primer provides an overview of the major policy, research, and technical topics to be considered when creating a similar clean energy policy database and website.
Advanced SPARQL querying in small molecule databases.

Science.gov (United States)

Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří

2016-01-01

In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.
HIM-herbal ingredients in-vivo metabolism database.

Science.gov (United States)

Kang, Hong; Tang, Kailin; Liu, Qi; Sun, Yi; Huang, Qi; Zhu, Ruixin; Gao, Jun; Zhang, Duanfeng; Huang, Chenggang; Cao, Zhiwei

2013-05-31

Herbal medicine has long been viewed as a valuable asset for potential new drug discovery, and the metabolites of herbal ingredients, especially the in vivo metabolites, were often found to have better pharmacological, pharmacokinetic and even safety profiles than their parent compounds. However, this herbal metabolite information is still scattered and waiting to be collected. The HIM database manually collects what is so far the most comprehensive available in vivo metabolism information for herbal active ingredients, as well as their corresponding bioactivity, organ and/or tissue distribution, toxicity, ADME and clinical research profiles. Currently HIM contains 361 ingredients and 1104 corresponding in vivo metabolites from 673 reputable herbs. Tools for structural similarity, substructure search and Lipinski's Rule of Five are also provided. Various links were made to PubChem, PubMed, TCM-ID (Traditional Chinese Medicine Information Database) and HIT (Herbal Ingredients' Targets database). HIM is thus a curated database of in vivo metabolite information for the active ingredients of Chinese herbs, together with their corresponding bioactivity, toxicity and ADME profiles. HIM is freely accessible to academic researchers at http://www.bioinformatics.org.cn/.
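The Rule of Five filter mentioned among the HIM tools can be sketched as below. The thresholds follow the commonly cited form of Lipinski's rule (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors, with one violation tolerated). How HIM itself computes the descriptors or applies the rule is not stated in the abstract, so the function names and the descriptor values are hypothetical.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of Rule of Five criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if log_p > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors, max_violations=1):
    """True if the compound has no more than `max_violations` violations."""
    found = lipinski_violations(mol_weight, log_p, h_donors, h_acceptors)
    return len(found) <= max_violations


# Hypothetical descriptor values for two compounds, for illustration only.
small_compound = dict(mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4)
large_compound = dict(mol_weight=820.0, log_p=6.3, h_donors=7, h_acceptors=14)

print(passes_rule_of_five(**small_compound))
print(lipinski_violations(**large_compound))
```

Returning the list of violated criteria, rather than only a boolean, makes it easier to see why a metabolite was flagged.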
Surf similarity and solitary wave runup

DEFF Research Database (Denmark)

Fuhrman, David R.; Madsen, Per A.

2008-01-01

The notion of surf similarity in the runup of solitary waves is revisited. We show that the surf similarity parameter for solitary waves may be effectively reduced to the beach slope divided by the offshore wave height to depth ratio. This clarifies its physical interpretation relative to a previous parameterization, which was not given in an explicit form. Good coherency with experimental (breaking) runup data is preserved with this simpler parameter. A recasting of analytical (nonbreaking) runup expressions for sinusoidal and solitary waves additionally shows that they contain identical functional dependence on their respective surf similarity parameters. Important equivalencies in the runup of sinusoidal and solitary waves are thus revealed.
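The reduced parameter described in the abstract, the beach slope divided by the offshore wave height to depth ratio, can be written as a one-line computation. The sketch below is only a literal reading of that sentence; the function and argument names are hypothetical, and any constants or exponents used in the paper's full parameterization are not reproduced here.

```python
def solitary_surf_similarity(beach_slope: float,
                             offshore_wave_height: float,
                             offshore_depth: float) -> float:
    """Beach slope divided by the offshore wave height to depth ratio.

    Literal reading of the abstract's reduced parameter; argument names
    are assumptions made for this sketch.
    """
    if offshore_wave_height <= 0 or offshore_depth <= 0:
        raise ValueError("wave height and depth must be positive")
    relative_height = offshore_wave_height / offshore_depth
    return beach_slope / relative_height


if __name__ == "__main__":
    # Illustrative inputs: slope 0.05, wave height 0.5 m, depth 5 m.
    print(solitary_surf_similarity(0.05, 0.5, 5.0))
```

A steeper beach or a smaller relative wave height both increase the value, which matches the stated form of the ratio.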
Database Description - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us KAIKOcDNA Database Description General information of database Database name KAIKOcDNA Alter...National Institute of Agrobiological Sciences Akiya Jouraku E-mail : Database cla...ssification Nucleotide Sequence Databases Organism Taxonomy Name: Bombyx mori Taxonomy ID: 7091 Database des...rnal: G3 (Bethesda) / 2013, Sep / vol.9 External Links: Original website information Database maintenance si...available URL of Web services - Need for user registration Not available About This Database Database
Download - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Trypanosomes Database Download First of all, please read the license of this database. Data ...1.4 KB) Simple search and download Download via FTP FTP server is sometimes jammed. If it is, access [here]. About This Database Data...base Description Download License Update History of This Database Site Policy | Contact Us Download - Trypanosomes Database | LSDB Archive ...
Class dependency of fuzzy relational database using relational calculus and conditional probability

Science.gov (United States)

Deni Akbar, Mohammad; Mizoguchi, Yoshihiro; Adiwijaya

2018-03-01

In this paper, we propose a design of a fuzzy relational database to deal with a conditional probability relation using fuzzy relational calculus. Previously, there have been several studies of equivalence classes in fuzzy databases using similarity or approximate relations. It is an interesting topic to investigate fuzzy dependency using equivalence classes. Our goal is to introduce a formulation of a fuzzy relational database model using the relational calculus on the category of fuzzy relations. We also introduce general formulas of the relational calculus for database operations such as 'projection', 'selection', 'injection' and 'natural join'. Using the fuzzy relational calculus and conditional probabilities, we introduce notions of equivalence class, redundancy, and dependency in the theory of fuzzy relational databases.
The Genetic Activity Profile database.

Science.gov (United States)

Waters, M D; Stack, H F; Garrett, N E; Jackson, M A

1991-12-01

A graphic approach termed a Genetic Activity Profile (GAP) has been developed to display a matrix of data on the genetic and related effects of selected chemical agents. The profiles provide a visual overview of the quantitative (doses) and qualitative (test results) data for each chemical. Either the lowest effective dose (LED) or highest ineffective dose (HID) is recorded for each agent and bioassay. Up to 200 different test systems are represented across the GAP. Bioassay systems are organized according to the phylogeny of the test organisms and the end points of genetic activity. The methodology for the production and evaluation of GAPs has been developed in collaboration with the International Agency for Research on Cancer. Data on individual chemicals have been compiled by IARC and by the U.S. Environmental Protection Agency. Data are available on 299 compounds selected from volumes 1-50 of the IARC Monographs and on 115 compounds identified as Superfund Priority Substances. Software to display the GAPs on an IBM-compatible personal computer is available from the authors. Structurally similar compounds frequently display qualitatively and quantitatively similar GAPs. By examining the patterns of GAPs of pairs and groups of chemicals, it is possible to make more informed decisions regarding the selection of test batteries to be used in evaluating chemical analogs. GAPs have provided useful data for the development of weight-of-evidence hazard ranking schemes. Also, some knowledge of the potential genetic activity of complex environmental mixtures may be gained from assessing the GAPs of component chemicals. The fundamental techniques and computer programs devised for the GAP database may be used to develop similar databases in other disciplines.
License - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Arabidopsis Phenome Database License License to Use This Database Last updated : 2017/02/27 You may use this database...cense specifies the license terms regarding the use of this database and the requirements you must follow in using this database.... The license for this database is specified in the Creative ...Commons Attribution-Share Alike 4.0 International . If you use data from this database, please be sure attribute this database...ative Commons Attribution-Share Alike 4.0 International is found here . With regard to this database, you ar
RAACFDb: Rheumatoid arthritis ayurvedic classical formulations database.

Science.gov (United States)

Mohamed Thoufic Ali, A M; Agrawal, Aakash; Sajitha Lulu, S; Mohana Priya, A; Vino, S

2017-02-02

In the past years, the treatment of rheumatoid arthritis (RA) has undergone remarkable changes in all therapeutic modes. The current focus in clinical research is to identify new avenues for better treatment options for RA. Recent ethnopharmacological investigations revealed that traditional herbal remedies are the most preferred modality of complementary and alternative medicine (CAM). However, several ayurvedic modes of treatment and formulations for RA from the Indian traditional system of medicine are not well studied or documented. This led us to develop an integrated database, RAACFDb (Rheumatoid Arthritis Ayurvedic Classical Formulations Database), by consolidating data from the repository of Vedic Samhita - The Ayurveda so that the available formulation information can be retrieved easily. Literature data were gathered using several search engines and from ayurvedic practitioners for loading into the database. To represent the collected information about classical ayurvedic formulations, an integrated database was constructed and implemented on a MySQL and PHP back-end. The database describes all the ayurvedic classical formulations for the treatment of rheumatoid arthritis. It includes composition, usage, plant parts used, active ingredients present in the composition and their structures. The prime objective is to locate ayurvedic formulations proven to be successful and highly effective among patients, with reduced side effects. The database (freely available at www.beta.vit.ac.in/raacfdb/index.html) should enable easy access for clinical researchers and students to discover novel leads with reduced side effects. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Interspecific introgression in cetaceans: DNA markers reveal post-F1 status of a pilot whale.

Directory of Open Access Journals (Sweden)

Laura Miralles

Full Text Available Visual species identification of cetacean strandings is difficult, especially when dead specimens are degraded and/or species are morphologically similar. The two recognised pilot whale species (Globicephala melas and Globicephala macrorhynchus) are sympatric in the North Atlantic Ocean. These species are very similar in external appearance and their morphometric characteristics partially overlap; thus visual identification is not always reliable. Genetic species identification ensures correct identification of specimens. Here we have employed one mitochondrial (D-Loop region) and eight nuclear loci (microsatellites) as genetic markers to identify six stranded pilot whales found in Galicia (Northwest Spain), one of them of ambiguous phenotype. DNA analyses yielded positive amplification of all loci and enabled species identification. Nuclear microsatellite DNA genotypes revealed mixed ancestry for one individual, identified as a post-F1 interspecific hybrid employing two different Bayesian methods. From the mitochondrial sequence the maternal species was Globicephala melas. This is the first hybrid documented between Globicephala melas and G. macrorhynchus, and the first post-F1 hybrid genetically identified between cetaceans, revealing interspecific genetic introgression in marine mammals. We propose to add nuclear loci to genetic databases for cetacean species identification in order to detect hybrid individuals.
Integrating heterogeneous databases in clustered medic care environments using object-oriented technology

Science.gov (United States)

Thakore, Arun K.; Sauer, Frank

1994-05-01

The organization of modern medical care environments into disease-related clusters, such as a cancer center, a diabetes clinic, etc., has the side-effect of introducing multiple heterogeneous databases, often containing similar information, within the same organization. This heterogeneity fosters incompatibility and prevents the effective sharing of data amongst applications at different sites. Although integration of heterogeneous databases is now feasible, in the medical arena this is often an ad hoc process, not founded on proven database technology or formal methods. In this paper we illustrate the use of a high-level object-oriented semantic association method to model information found in different databases into an integrated conceptual global model that integrates the databases. We provide examples from the medical domain to illustrate an integration approach resulting in a consistent global view, without attacking the autonomy of the underlying databases.
Database Description - AcEST | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name AcEST Alternative n...hi, Tokyo-to 192-0397 Tel: +81-42-677-1111(ext.3654) E-mail: Database classificat...eneris Taxonomy ID: 13818 Database description This is a database of EST sequences of Adiantum capillus-vene...(3): 223-227. External Links: Original website information Database maintenance site Plant Environmental Res...base Database Description Download License Update History of This Database Site Policy | Contact Us Database Description - AcEST | LSDB Archive ...
Capability Database of Injection Molding Process— Requirements Study for Wider Suitability and Higher Accuracy

DEFF Research Database (Denmark)

Boorla, Srinivasa Murthy; Eifler, Tobias; Jepsen, Jens Dines O.

2017-01-01

for an improved applicability of corresponding database solutions in an industrial context. A survey of database users at all phases of product value chain in the plastic industry revealed that 59% of the participating companies use their own, internally created databases, although reported to be not fully...... adequate in most cases. Essential influences are the suitability of the provided data, defined by the content such as material, tolerance types, etc. covered, as well as its accuracy, largely influenced by the updating frequency. Forming a consortium with stakeholders, linking database update to technology...
License - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us SKIP Stemcell Database License License to Use This Database Last updated : 2017/03/13 You may use this database...specifies the license terms regarding the use of this database and the requirements you must follow in using this database.... The license for this database is specified in the Creative Common...s Attribution-Share Alike 4.0 International . If you use data from this database, please be sure attribute this database...al ... . The summary of the Creative Commons Attribution-Share Alike 4.0 International is found here . With regard to this database
KALIMER database development

Energy Technology Data Exchange (ETDEWEB)

Jeong, Kwan Seong; Lee, Yong Bum; Jeong, Hae Yong; Ha, Kwi Seok

2003-03-01

The KALIMER database is an advanced database that uses Web applications to provide integrated management for liquid metal reactor design technology development. The KALIMER design database is composed of a results database, Inter-Office Communication (IOC), a 3D CAD database, and a reserved documents database. The results database holds research results from all phases of liquid metal reactor design technology development under mid-term and long-term nuclear R&D. IOC is a linkage control system between sub-projects for sharing and integrating the research results for KALIMER. The 3D CAD database provides a schematic overview of the KALIMER design structure. The reserved documents database was developed to manage the various documents and reports accumulated since project accomplishment.
Database Description - RPSD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name RPSD Alternative nam...e Rice Protein Structure Database DOI 10.18908/lsdba.nbdc00749-000 Creator Creator Name: Toshimasa Yamazaki ... Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences Toshimasa Yamazaki E-mail : Databas...e classification Structure Databases - Protein structure Organism Taxonomy Name: Or...or name(s): Journal: External Links: Original website information Database maintenance site National Institu
Database Description - FANTOM5 | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us FANTOM5 Database Description General information of database Database name FANTOM5 Alternati...me: Rattus norvegicus Taxonomy ID: 10116 Taxonomy Name: Macaca mulatta Taxonomy ID: 9544 Database descriptio...l Links: Original website information Database maintenance site RIKEN Center for Life Science Technologies, ...ilable Web services Not available URL of Web services - Need for user registration Not available About This Database Database... Description Download License Update History of This Database Site Policy | Contact Us Database Description - FANTOM5 | LSDB Archive ...

Updated Palaeotsunami Database for Aotearoa/New Zealand

Science.gov (United States)

Gadsby, M. R.; Goff, J. R.; King, D. N.; Robbins, J.; Duesing, U.; Franz, T.; Borrero, J. C.; Watkins, A.

2016-12-01

similar one currently under development in Japan. Expressions of interest in collaborating with the A/NZ team to expand the database are invited from other Pacific nations.
NoSQL databases

OpenAIRE

Mrozek, Jakub

2012-01-01

This thesis deals with database systems referred to as NoSQL databases. In the second chapter, I explain basic terms and the theory of database systems. A short explanation is dedicated to database systems based on the relational data model and the SQL standardized query language. Chapter Three explains the concept and history of the NoSQL databases, and also presents database models, major features and the use of NoSQL databases in comparison with traditional database systems. In the fourth ...
Development of database systems for safety of repositories for disposal of radioactive wastes

Energy Technology Data Exchange (ETDEWEB)

Lee, Yeong Hun; Han, Jeong Sang; Shin, Hyeon Jun; Ham, Sang Won; Kim, Hye Seong [Yonsei Univ., Seoul (Korea, Republic of)

1999-03-15

In this study, a GSIS is developed to maximize the effectiveness of the database system. For this purpose, the spatial relations of data from various fields are constructed in the database that was developed for the site selection and management of a repository for radioactive waste disposal. By constructing an integrated system that can link attribute and spatial data, it is possible to evaluate the safety of the repository effectively and economically. The suitability of integrating the database and GSIS is examined by constructing the database in a test district where the site characteristics are similar to those of a repository for radioactive waste disposal.
A Survey of Binary Similarity and Distance Measures

Directory of Open Access Journals (Sweden)

Seung-Seok Choi

2010-02-01

Full Text Available The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering, classification, etc. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Notwithstanding, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and reveal their correlations through the hierarchical clustering technique.
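To make the idea of a binary similarity measure concrete, the sketch below computes two classical examples, the Jaccard coefficient and the simple matching coefficient, from the usual 2x2 counts of co-present, mismatched and co-absent features. It does not reproduce the survey's 76 measures or its clustering; the function names and the convention of returning 1.0 for two all-zero vectors are assumptions made for this example.

```python
from typing import Sequence, Tuple


def contingency(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (a, b, c, d): both 1, only x is 1, only y is 1, both 0."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = len(x) - a - b - c
    return a, b, c, d


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """Jaccard similarity a / (a + b + c); co-absences are ignored."""
    a, b, c, _ = contingency(x, y)
    union = a + b + c
    # Assumed convention: two all-zero vectors are treated as identical.
    return a / union if union else 1.0


def simple_matching(x: Sequence[int], y: Sequence[int]) -> float:
    """Simple matching coefficient (a + d) / n; co-absences count as agreement."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)


if __name__ == "__main__":
    u = [1, 1, 0, 0, 1]
    v = [1, 0, 0, 1, 1]
    print(jaccard(u, v), simple_matching(u, v))
```

The two measures differ only in whether shared absences count as agreement, which is one of the main axes along which binary measures are usually distinguished.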
An electrophysiological signature of summed similarity in visual working memory.

Science.gov (United States)

van Vugt, Marieke K; Sekuler, Robert; Wilson, Hugh R; Kahana, Michael J

2013-05-01

Summed-similarity models of short-term item recognition posit that participants base their judgments of an item's prior occurrence on that item's summed similarity to the ensemble of items on the remembered list. We examined the neural predictions of these models in 3 short-term recognition memory experiments using electrocorticographic/depth electrode recordings and scalp electroencephalography. On each experimental trial, participants judged whether a test face had been among a small set of recently studied faces. Consistent with summed-similarity theory, participants' tendency to endorse a test item increased as a function of its summed similarity to the items on the just-studied list. To characterize this behavioral effect of summed similarity, we successfully fit a summed-similarity model to individual participant data from each experiment. Using the parameters determined from fitting the summed-similarity model to the behavioral data, we examined the relation between summed similarity and brain activity. We found that 4-9 Hz theta activity in the medial temporal lobe and 2-4 Hz delta activity recorded from frontal and parietal cortices increased with summed similarity. These findings demonstrate direct neural correlates of the similarity computations that form the foundation of several major cognitive theories of human recognition memory. PsycINFO Database Record (c) 2013 APA, all rights reserved.
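The decision rule described above, endorsing a probe when its summed similarity to the studied items is high, can be sketched in a few lines. The abstract does not give the model's equations, so the exponential decay of similarity with Euclidean distance, the `tau` sensitivity parameter and the fixed `criterion` below are assumptions chosen for illustration and are not the fitted model from the study.

```python
import math
from typing import Sequence


def summed_similarity(probe: Sequence[float],
                      study_list: Sequence[Sequence[float]],
                      tau: float = 1.0) -> float:
    """Sum of exp(-tau * distance) between the probe and each studied item.

    The exponential similarity function is an assumption for this sketch.
    """
    return sum(math.exp(-tau * math.dist(probe, item)) for item in study_list)


def endorse(probe: Sequence[float],
            study_list: Sequence[Sequence[float]],
            criterion: float,
            tau: float = 1.0) -> bool:
    """Respond 'old' when summed similarity exceeds a decision criterion."""
    return summed_similarity(probe, study_list, tau) > criterion


if __name__ == "__main__":
    # Hypothetical 2-D stimulus coordinates for three studied items.
    studied = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(summed_similarity((0.1, 0.1), studied))   # probe close to the list
    print(summed_similarity((5.0, 5.0), studied))   # probe far from the list
```

Under this sketch a lure that lies near several studied items can exceed the criterion even though it was never presented, which is the behavioral signature that summed-similarity accounts are meant to capture.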
Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms

International Nuclear Information System (INIS)

Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee; Lo, Joseph Y.; Floyd, Carey E.

2007-01-01

The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories: one category is better suited to the retrieval of semantically similar cases, while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining a high detection rate for malignant masses.
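The abstract does not name the eight entropy-based measures it compares. As one well-known example of this family, the sketch below computes the mutual information between two equally sized regions of interest from their intensity histograms. Whether this exact measure is among the eight is an assumption; the function names and the flat-list representation of a region are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(roi_a: Sequence[int], roi_b: Sequence[int]) -> float:
    """I(A;B) = H(A) + H(B) - H(A,B) over pixel-wise paired intensities.

    Each region is a flat list of quantized intensities (assumed layout).
    """
    if len(roi_a) != len(roi_b) or len(roi_a) == 0:
        raise ValueError("regions must be non-empty and of equal size")
    joint = list(zip(roi_a, roi_b))
    return entropy(roi_a) + entropy(roi_b) - entropy(joint)


if __name__ == "__main__":
    query = [0, 0, 1, 1]
    identical = [0, 0, 1, 1]
    unrelated = [0, 1, 0, 1]
    print(mutual_information(query, identical))
    print(mutual_information(query, unrelated))
```

In a retrieval setting of the kind described, a query region would be scored against every stored case and the highest-scoring cases returned; the quantization of intensities into bins strongly affects the estimate and would need to be chosen with care.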
Trends in maar crater size and shape using the global Maar Volcano Location and Shape (MaarVLS) database

Science.gov (United States)

Graettinger, A. H.

2018-05-01

A maar crater is the top of a much larger subsurface diatreme structure produced by phreatomagmatic explosions and the size and shape of the crater reflects the growth history of that structure during an eruption. Recent experimental and geophysical research has shown that crater complexity can reflect subsurface complexity. Morphometry provides a means of characterizing a global population of maar craters in order to establish the typical size and shape of features. A global database of Quaternary maar crater planform morphometry indicates that maar craters are typically not circular and frequently have compound shapes resembling overlapping circles. Maar craters occur in volcanic fields that contain both small volume and complex volcanoes. The global perspective provided by the database shows that maars are common in many volcanic and tectonic settings producing a similar diversity of size and shape within and between volcanic fields. A few exceptional populations of maars were revealed by the database, highlighting directions of future research to improve our understanding on the geometry and spacing of subsurface explosions that produce maars. These outlying populations, such as anomalously large craters (>3000 m), chains of maars, and volcanic fields composed of mostly maar craters each represent a small portion of the database, but provide opportunities to reinvestigate fundamental questions on maar formation. Maar crater morphometry can be integrated with structural, hydrological studies to investigate lateral migration of phreatomagmatic explosion location in the subsurface. A comprehensive database of intact maar morphometry is also beneficial for the hunt for maar-diatremes on other planets.
Database Description - DMPD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name DMPD Alternative nam...e Dynamic Macrophage Pathway CSML Database DOI 10.18908/lsdba.nbdc00558-000 Creator Creator Name: Masao Naga...ty of Tokyo 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639 Tel: +81-3-5449-5615 FAX: +83-3-5449-5442 E-mail: Database...606 Taxonomy Name: Mammalia Taxonomy ID: 40674 Database description DMPD collects...e(s) Article title: Author name(s): Journal: External Links: Original website information Database maintenan
CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

Directory of Open Access Journals (Sweden)

Mohit Verma

Full Text Available Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.
Metagenomic analysis reveals that modern microbialites and polar microbial mats have similar taxonomic and functional potential

Directory of Open Access Journals (Sweden)

Richard Allen White III

2015-09-01

Full Text Available Within the subarctic climate of Clinton Creek, Yukon, Canada, lies an abandoned and flooded open-pit asbestos mine that harbors rapidly growing microbialites. To understand their formation we completed a metagenomic community profile of the microbialites and their surrounding sediments. Assembled metagenomic data revealed that bacteria within the phylum Proteobacteria numerically dominated this system, although the relative abundances of taxa within the phylum varied among environments. Bacteria belonging to Alphaproteobacteria and Gammaproteobacteria were dominant in the microbialites and sediments, respectively. The microbialites were also home to many other groups associated with microbialite formation including filamentous cyanobacteria and dissimilatory sulfate-reducing Deltaproteobacteria, consistent with the idea of a shared global microbialite microbiome. Other members were present that are typically not associated with microbialites including Gemmatimonadetes and iron-oxidizing Betaproteobacteria, which participate in carbon metabolism and iron cycling. Compared to the sediments, the microbialite microbiome has significantly more genes associated with photosynthetic processes (e.g., photosystem II reaction centers, carotenoid and chlorophyll biosynthesis) and carbon fixation (e.g., CO dehydrogenase). The Clinton Creek microbialite communities had strikingly similar functional potentials to non-lithifying microbial mats from the Canadian High Arctic and Antarctica, but are functionally distinct from non-lithifying mats or biofilms from Yellowstone. Clinton Creek microbialites also share metabolic genes (R2 0.900. These metagenomic profiles from an anthropogenic microbialite-forming ecosystem provide context to microbialite formation on a human-relevant timescale.
Database Dump - fRNAdb | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us fRNAdb Database Dump Data detail Data name Database Dump DOI 10.18908/lsdba.nbdc00452-002 De... data (tab separated text) Data file File name: Database_Dump File URL: ftp://ftp....biosciencedbc.jp/archive/frnadb/LATEST/Database_Dump File size: 673 MB Simple search URL - Data acquisition...s. Data analysis method - Number of data entries 4 files - About This Database Database Description Download... License Update History of This Database Site Policy | Contact Us Database Dump - fRNAdb | LSDB Archive ...
An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database)

Directory of Open Access Journals (Sweden)

But Paul

2010-06-01

Full Text Available Abstract Background Thousands of plants and animals possess pharmacological properties and there is an increased interest in using these materials for therapy and health maintenance. The efficacy of the application is critically dependent on the use of genuine materials. From time to time, life-threatening poisoning occurs because a toxic adulterant or substitute is administered. DNA barcoding provides a definitive means of authentication and for conducting molecular systematics studies. Owing to the reduced cost of DNA authentication, the volume of DNA barcodes produced for medicinal materials is on the rise and necessitates the development of an integrated DNA database. Description We have developed an integrated DNA barcode multimedia information platform, the Medicinal Materials DNA Barcode Database (MMDBD), for data retrieval and similarity search. MMDBD contains over 1000 species of medicinal materials listed in the Chinese Pharmacopoeia and American Herbal Pharmacopoeia. MMDBD also contains useful information on the medicinal materials, including resources, adulterant information, medical parts, photographs, primers used for obtaining the barcodes and key references. MMDBD can be accessed at http://www.cuhk.edu.hk/icm/mmdbd.htm. Conclusions This work provides a centralized medicinal materials DNA barcode database and bioinformatics tools for data storage, analysis and exchange for promoting the identification of medicinal materials. MMDBD has the largest collection of DNA barcodes of medicinal materials and is a useful resource for researchers in conservation, systematic study, forensics and the herbal industry.
DOT Online Database

Science.gov (United States)

HTT-DB: horizontally transferred transposable elements database.

Science.gov (United States)

Dotto, Bruno Reis; Carvalho, Evelise Leis; Silva, Alexandre Freitas; Duarte Silva, Luiz Fernando; Pinto, Paulo Marcos; Ortiz, Mauro Freitas; Wallau, Gabriel Luz

2015-09-01

Horizontal transfer of transposable (HTT) elements among eukaryotes was discovered in the mid-1980s. Since then, >300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up-to-date, interactive and expandable database for such events, we developed the HTT-DB database. HTT-DB allows easy access to most of the HTT cases reported, along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using transposable element and/or host species classification and export them in several formats. This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase. HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects. Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
The Hofmethode: Computing Semantic Similarities between E-Learning Products

Directory of Open Access Journals (Sweden)

Oliver Michel

2009-11-01

Full Text Available The key task in building useful e-learning repositories is to develop a system with an algorithm that lets users retrieve information matching their specific requirements. To achieve this, products (or their verbal descriptions, i.e. those presented in metadata) need to be compared and structured according to the results of this comparison. Such structuring is crucial because many search results typically correspond to an entered keyword. The Hofmethode is an algorithm (based on psychological considerations) for computing semantic similarities between texts, and it therefore offers a way to compare e-learning products. The computed similarity values are used to build semantic maps in which the products are visually arranged according to their similarities. The paper describes how the Hofmethode is implemented in the online database edulap and how it helps users explore the data in which they are interested.
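The abstract does not say how the Hofmethode scores a pair of texts, so the following is only a generic stand-in, not the Hofmethode itself: a bag-of-words cosine similarity, which illustrates the kind of pairwise similarity values a semantic map could be built from. The function names and the sample product descriptions are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors (0.0 to 1.0)."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(descriptions):
    """Pairwise similarities for a dict of {product_name: description}."""
    names = list(descriptions)
    return {a: {b: cosine_similarity(descriptions[a], descriptions[b])
                for b in names} for a in names}


# Hypothetical metadata descriptions, for illustration only.
products = {
    "course_1": "Introduction to statistics for psychology students",
    "course_2": "Statistics tutorial for students of psychology",
    "course_3": "Welding safety training video",
}
matrix = similarity_matrix(products)
```

A matrix like this could then be handed to a layout method to place similar products near each other; which layout edulap uses is not stated in the abstract.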
Databases

Digital Repository Service at National Institute of Oceanography (India)

Kunte, P.D.

Information on bibliographic as well as numeric/textual databases relevant to coastal geomorphology has been included in a tabular form. Databases cover a broad spectrum of related subjects like coastal environment and population aspects, coastline...
The HISTMAG database: combining historical, archaeomagnetic and volcanic data

Science.gov (United States)

Arneitz, Patrick; Leonhardt, Roman; Schnepp, Elisabeth; Heilig, Balázs; Mayrhofer, Franziska; Kovacs, Peter; Hejda, Pavel; Valach, Fridrich; Vadasz, Gergely; Hammerl, Christa; Egli, Ramon; Fabian, Karl; Kompein, Niko

2017-09-01

Records of the past geomagnetic field can be divided into two main categories. These are instrumental historical observations on the one hand, and field estimates based on the magnetization acquired by rocks, sediments and archaeological artefacts on the other hand. In this paper, a new database combining historical, archaeomagnetic and volcanic records is presented. HISTMAG is a relational database, implemented in MySQL, and can be accessed via a web-based interface (http://www.conrad-observatory.at/zamg/index.php/data-en/histmag-database). It combines available global historical data compilations covering the last ∼500 yr as well as archaeomagnetic and volcanic data collections from the last 50 000 yr. Furthermore, new historical and archaeomagnetic records, mainly from central Europe, have been acquired. In total, 190 427 records are currently available in the HISTMAG database, the majority of which are historical declination measurements (155 525). The original database structure was complemented by new fields, which allow for a detailed description of the different data types. A user-comment function makes scientific discussion of individual records possible. The HISTMAG database therefore supports thorough reliability and uncertainty assessments of the widely different data sets, which are an essential basis for geomagnetic field reconstructions. A database analysis, comparing against other historical records, revealed a systematic offset in declination records derived from compass roses on historical geographical maps, while maps created for mining activities represent a reliable source.
Database Description - eSOL | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name eSOL Alternative nam...eator Affiliation: The Research and Development of Biological Databases Project, National Institute of Genet...nology 4259 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501 Japan Email: Tel.: +81-45-924-5785 Database... classification Protein sequence databases - Protein properties Organism Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database...i U S A. 2009 Mar 17;106(11):4201-6. External Links: Original website information Database maintenance site
"Mr. Database" : Jim Gray and the History of Database Technologies.

Science.gov (United States)

Hanwahr, Nils C

2017-12-01

Although the widespread use of the term "Big Data" is comparatively recent, it invokes a phenomenon in the development of database technology with distinct historical contexts. The database engineer Jim Gray, known as "Mr. Database" in Silicon Valley before his disappearance at sea in 2007, was involved in many of the crucial developments since the 1970s that constitute the foundation of exceedingly large and distributed databases. Jim Gray was involved in the development of relational database systems based on the concepts of Edgar F. Codd at IBM in the 1970s before he went on to develop principles of Transaction Processing that enable the parallel and highly distributed performance of databases today. He was also involved in creating forums for discourse between academia and industry, which influenced industry performance standards as well as database research agendas. As a co-founder of the San Francisco branch of Microsoft Research, Gray increasingly turned toward scientific applications of database technologies, e.g. leading the TerraServer project, an online database of satellite images. Inspired by Vannevar Bush's idea of the memex, Gray laid out his vision of a Personal Memex as well as a World Memex, eventually postulating a new era of data-based scientific discovery termed "Fourth Paradigm Science". This article gives an overview of Gray's contributions to the development of database technology as well as his research agendas and shows that central notions of Big Data have been occupying database engineers for much longer than the actual term has been in use.
Mathematics for Databases

NARCIS (Netherlands)

ir. Sander van Laar

2007-01-01

A formal description of a database consists of the description of the relations (tables) of the database together with the constraints that must hold on the database. Furthermore the contents of a database can be retrieved using queries. These constraints and queries for databases can very well be

Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints.

Science.gov (United States)

Awale, Mahendra; Jin, Xian; Reymond, Jean-Louis

2015-01-01

Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. Web-browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug
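The abstract names the city-block distance as the similarity measure for fingerprint searches. A minimal sketch of that idea follows, assuming fingerprints are equal-length lists of non-negative counts. The function names, the toy four-value vectors and the molecule labels are hypothetical and are not taken from the authors' software.

```python
import heapq


def city_block_distance(fp_a, fp_b):
    """Sum of absolute differences between two equal-length count vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))


def nearest_analogs(query_fp, library, k=3):
    """Return the k (distance, name) pairs closest to the query, nearest first."""
    scored = ((city_block_distance(query_fp, fp), name)
              for name, fp in library.items())
    return heapq.nsmallest(k, scored)


# Toy four-value vectors; the fingerprints described in the abstract are longer.
library = {
    "mol_a": [3, 1, 0, 2],
    "mol_b": [3, 1, 1, 2],
    "mol_c": [0, 4, 4, 0],
}
hits = nearest_analogs([3, 1, 0, 2], library, k=2)
# hits == [(0, "mol_a"), (1, "mol_b")]
```

A smaller distance means a closer analog. A linear scan like this is only illustrative; the abstract states that the databases are specially formatted for very fast searching, and that formatting is not reproduced here.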
Electron microscopy and in vitro deneddylation reveal similar architectures and biochemistry of isolated human and Flag-mouse COP9 signalosome complexes

Energy Technology Data Exchange (ETDEWEB)

Rockel, Beate [Department of Molecular Structural Biology, Max-Planck-Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried (Germany); Schmaler, Tilo; Huang, Xiaohua [Division of Molecular Biology, Department of General, Visceral, Vascular and Thoracic Surgery, Charité – Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin (Germany); Dubiel, Wolfgang, E-mail: Wolfgang.dubiel@charite.de [Division of Molecular Biology, Department of General, Visceral, Vascular and Thoracic Surgery, Charité – Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin (Germany)

2014-07-25

Highlights: • Deneddylation rates of human erythrocyte and mouse fibroblast CSN are very similar. • 3D models of native human and mouse CSN reveal common architectures. • The cryo-structure of native mammalian CSN shows a horseshoe subunit arrangement. - Abstract: The COP9 signalosome (CSN) is a regulator of the ubiquitin (Ub) proteasome system (UPS). In the UPS, proteins are Ub-labeled for degradation by Ub ligases conferring substrate specificity. The CSN controls a large family of Ub ligases called cullin-RING ligases (CRLs), which ubiquitinate cell cycle regulators, transcription factors and DNA damage response proteins. The CSN possesses structural similarities with the 26S proteasome Lid complex and the translation initiation complex 3 (eIF3) indicating similar ancestry and function. Initial structures were obtained 14 years ago by 2D electron microscopy (EM). Recently, first 3D molecular models of the CSN were created on the basis of negative-stain EM and single-particle analysis, mostly with recombinant complexes. Here, we compare deneddylating activity and structural features of CSN complexes purified in an elaborate procedure from human erythrocytes and efficiently pulled down from mouse Flag-CSN2 B8 fibroblasts. In an in vitro deneddylation assay both the human and the mouse CSN complexes deneddylated Nedd8-Cul1 with comparable rates. 3D structural models of the erythrocyte CSN as well as of the mouse Flag-CSN were generated by negative stain EM and by cryo-EM. Both complexes show a central U-shaped segment from which several arms emanate. This structure, called the horseshoe, is formed by the PCI domain subunits. CSN5 and CSN6 point away from the horseshoe. Compared to 3D models of negatively stained CSN complexes, densities assigned to CSN2 and CSN4 are better defined in the cryo-map. Because biochemical and structural results obtained with CSN complexes isolated from human erythrocytes and purified by Flag-CSN pulldown from mouse B8 fibroblasts
P2-34: Similar Dimensions Underlie Emotional and Conversational Expressions in Korean and German Cultural Contexts

Directory of Open Access Journals (Sweden)

Ahyoung Shin

2012-10-01

Full Text Available Although facial expressions are one of the most important ways of communication in human society, most studies in the field focus only on the emotional aspect of facial expressions. The communicative/conversational aspects of expressions remain largely neglected. In addition, whereas it is known that there are culturally universal emotional expressions, less is known about how conversational expressions are perceived across cultures. Here, we investigate the underlying dimensions of the complex space of emotional and conversational expressions in a cross-cultural context. For the experiments, we used 620 video sequences of the KU facial expression database (62 expressions of 10 Korean actors), and 540 video sequences of the MPI facial expression database (54 expressions of 10 German actors). Four groups of native German and Korean participants were asked to group the sequences of the German or Korean databases into clusters based on similarity, yielding a fully crossed design across cultural contexts and databases. The confusion matrices created from the grouping data showed similar structure for both databases, but also yielded significantly less confusion for own-culture judgments. Interestingly, multidimensional scaling of the confusion matrices showed that for all four participant groups, two dimensions explained the data sufficiently. Most importantly, post-hoc analyses identified these two dimensions as valence and arousal, respectively, for all cultural contexts and databases. We conclude that although expressions from a familiar background are more effectively grouped, the evaluative dimensions for both German and Korean cultural contexts are exactly the same, showing that cultural universals exist even in this complex space.
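The abstract does not give the exact construction of its confusion matrices. As one plausible reading, grouping data can be tallied into a pairwise co-grouping table and converted to dissimilarities, which is the usual input for multidimensional scaling. The sketch below assumes each participant's response is a list of groups of expression labels. The labels, function names and data are hypothetical, and this is not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations


def co_grouping_counts(responses):
    """Count, for each pair of items, how many participants grouped them together."""
    counts = defaultdict(int)
    for groups in responses:
        for group in groups:
            # Sorted pairs give each unordered pair a single canonical key.
            for a, b in combinations(sorted(set(group)), 2):
                counts[(a, b)] += 1
    return dict(counts)


def dissimilarity(counts, items, n_participants):
    """Symmetric table in [0, 1]: 1 minus the fraction who grouped the pair."""
    table = {a: {b: 0.0 for b in items} for a in items}
    for a, b in combinations(sorted(items), 2):
        d = 1.0 - counts.get((a, b), 0) / n_participants
        table[a][b] = table[b][a] = d
    return table


# Two hypothetical participants sorting four hypothetical expressions.
responses = [
    [["happy", "agree"], ["sad", "disagree"]],
    [["happy"], ["agree", "disagree"], ["sad"]],
]
items = ["happy", "agree", "sad", "disagree"]
counts = co_grouping_counts(responses)
dist = dissimilarity(counts, items, n_participants=len(responses))
```

Pairs that are often grouped together get a small dissimilarity and would land close together in a scaling solution.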
Global Tsunami Database: Adding Geologic Deposits, Proxies, and Tools

Science.gov (United States)

Brocko, V. R.; Varner, J.

2007-12-01

A result of collaboration between NOAA's National Geophysical Data Center (NGDC) and the Cooperative Institute for Research in the Environmental Sciences (CIRES), the Global Tsunami Database includes instrumental records, human observations, and now, information inferred from the geologic record. Deep Ocean Assessment and Reporting of Tsunamis (DART) data, historical reports, and information gleaned from published tsunami deposit research build a multi-faceted view of tsunami hazards and their history around the world. Tsunami history provides clues to what might happen in the future, including frequency of occurrence and maximum wave heights. However, instrumental and written records commonly span too little time to reveal the full range of a region's tsunami hazard. The sedimentary deposits of tsunamis, identified with the aid of modern analogs, increasingly complement instrumental and human observations. By adding the component of tsunamis inferred from the geologic record, the Global Tsunami Database extends the record of tsunamis backward in time. Deposit locations, their estimated age and descriptions of the deposits themselves fill in the tsunami record. Tsunamis inferred from proxies, such as evidence for coseismic subsidence, are included to estimate recurrence intervals, but are flagged to highlight the absence of a physical deposit. Authors may submit their own descriptions and upload digital versions of publications. Users may sort by any populated field, including event, location, region, age of deposit, author, publication type (extract information from peer reviewed publications only, if you wish), grain size, composition, presence/absence of plant material. Users may find tsunami deposit references for a given location, event or author; search for particular properties of tsunami deposits; and even identify potential collaborators. Users may also download public-domain documents. Data and information may be viewed using tools designed to extract and
Biochemical characterization of an exonuclease from Arabidopsis thaliana reveals similarities to the DNA exonuclease of the human Werner syndrome protein.

Science.gov (United States)

Plchova, Helena; Hartung, Frank; Puchta, Holger

2003-11-07

The human Werner syndrome protein (hWRN-p) possessing DNA helicase and exonuclease activities is essential for genome stability. Plants have no homologue of this bifunctional protein, but surprisingly the Arabidopsis genome contains a small open reading frame (ORF) (AtWRNexo) with homology to the exonuclease domain of hWRN-p. Expression of this ORF in Escherichia coli revealed an exonuclease activity for AtWRN-exo-p with similarities but also some significant differences to hWRN-p. The protein digests recessed strands of DNA duplexes in the 3' --> 5' direction but hardly single-stranded DNA or blunt-ended duplexes. In contrast to the Werner exonuclease, AtWRNexo-p is also able to digest 3'-protruding strands. DNA with recessed 3'-PO4 and 3'-OH termini is degraded to a similar extent. AtWRNexo-p hydrolyzes the 3'-recessed strand termini of duplexes containing mismatched bases. AtWRNexo-p needs the divalent cation Mg2+ for activity, which can be replaced by Mn2+. Apurinic sites, cholesterol adducts, and oxidative DNA damage (such as 8-oxoadenine and 8-oxoguanine) inhibit or block the enzyme. Other DNA modifications, including uracil, hypoxanthine and ethenoadenine, did not inhibit AtWRNexo-p. A mutation of a conserved residue within the exonuclease domain (E135A) completely abolished the exonucleolytic activity. Our results indicate that a type of WRN-like exonuclease activity seems to be a common feature of the DNA metabolism of animals and plants.
Brand name confusion: Subjective and objective measures of orthographic similarity.

Science.gov (United States)

Burt, Jennifer S; McFarlane, Kimberley A; Kelly, Sarah J; Humphreys, Michael S; Weatherall, Kimberlee; Burrell, Robert G

2017-09-01

Determining brand name similarity is vital in areas of trademark registration and brand confusion. Students rated the orthographic (spelling) similarity of word pairs (Experiments 1, 2, and 4) and brand name pairs (Experiment 5). Similarity ratings were consistently higher when words shared beginnings rather than endings, whereas shared pronunciation of the stressed vowel had small and less consistent effects on ratings. In Experiment 3 a behavioral task confirmed the similarity of shared beginnings in lexical processing. Specifically, in a task requiring participants to decide whether 2 words presented in the clear (a probe and a later target) were the same or different, a masked prime word preceding the target shortened response latencies if it shared its initial 3 letters with the target. The ratings of students for word and brand name pairs were strongly predicted by metrics of orthographic similarity from the visual word identification literature based on the number of shared letters and their relative positions. The results indicate a potential use for orthographic metrics in brand name registration and trademark law. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
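The abstract refers to orthographic metrics based on the number of shared letters and their relative positions, and reports that shared beginnings matter more than shared endings. The sketch below illustrates those ideas only. It is not one of the published metrics from the visual word identification literature, and the brand names are invented. `difflib.SequenceMatcher` is from Python's standard library.

```python
from difflib import SequenceMatcher


def shared_prefix_length(a, b):
    """Number of leading letters two names have in common (case-insensitive)."""
    n = 0
    for ch_a, ch_b in zip(a.lower(), b.lower()):
        if ch_a != ch_b:
            break
        n += 1
    return n


def shared_suffix_length(a, b):
    """Number of trailing letters two names have in common (case-insensitive)."""
    return shared_prefix_length(a[::-1], b[::-1])


def ordered_overlap(a, b):
    """Ratio in [0, 1] based on letters that match in the same relative order."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Invented brand names, for illustration only.
print(shared_prefix_length("Brandex", "Brandol"))  # shared beginning "brand"
print(shared_suffix_length("Brandex", "Candex"))   # shared ending "andex"
print(ordered_overlap("Brandex", "Brandol"))
```

A weighting that favours prefix matches over suffix matches would reflect the reported rating pattern, but the appropriate weights are an empirical question that the abstract does not settle.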
Database development and management

CERN Document Server

Chao, Lee

2006-01-01

Introduction to Database Systems; Functions of a Database; Database Management System; Database Components; Database Development Process; Conceptual Design and Data Modeling; Introduction to Database Design Process; Understanding Business Process; Entity-Relationship Data Model; Representing Business Process with Entity-Relationship Model; Table Structure and Normalization; Introduction to Tables; Table Normalization; Transforming Data Models to Relational Databases; DBMS Selection; Transforming Data Models to Relational Databases; Enforcing Constraints; Creating Database for Business Process; Physical Design and Database
Meta-analysis using a novel database, miRStress, reveals miRNAs that are frequently associated with the radiation and hypoxia stress-responses.

Directory of Open Access Journals (Sweden)

Laura Ann Jacobs

Full Text Available Organisms are often exposed to environmental pressures that affect homeostasis, so it is important to understand the biological basis of stress-response. Various biological mechanisms have evolved to help cells cope with potentially cytotoxic changes in their environment. miRNAs are small non-coding RNAs which are able to regulate mRNA stability. It has been suggested that miRNAs may tip the balance between continued cytorepair and induction of apoptosis in response to stress. There is a wealth of data in the literature showing the effect of environmental stress on miRNAs, but it is scattered in a large number of disparate publications. Meta-analyses of this data would produce added insight into the molecular mechanisms of stress-response. To facilitate this we created and manually curated the miRStress database, which describes the changes in miRNA levels following an array of stress types in eukaryotic cells. Here we describe this database and validate the miRStress tool for analysing miRNAs that are regulated by stress. To validate the database we performed a cross-species analysis to identify miRNAs that respond to radiation. The analysis tool confirms miR-21 and miR-34a as frequently deregulated in response to radiation, but also identifies novel candidates as potentially important players in this stress response, including miR-15b, miR-19b, and miR-106a. Similarly, we used the miRStress tool to analyse hypoxia-responsive miRNAs. The most frequently deregulated miRNAs were miR-210 and miR-21, as expected. Several other miRNAs were also found to be associated with hypoxia, including miR-181b, miR-26a/b, miR-106a, miR-213 and miR-192. Therefore the miRStress tool has identified miRNAs with hitherto unknown or under-appreciated roles in the response to specific stress types. The miRStress tool, which can be used to uncover new insight into the biological roles of miRNAs, and also has the potential to unearth potential biomarkers for
Database reliability engineering designing and operating resilient database systems

CERN Document Server

Campbell, Laine

2018-01-01

The infrastructure-as-code revolution in IT is also affecting database administration. With this practical book, developers, system administrators, and junior to mid-level DBAs will learn how the modern practice of site reliability engineering applies to the craft of database architecture and operations. Authors Laine Campbell and Charity Majors provide a framework for professionals looking to join the ranks of today’s database reliability engineers (DBRE). You’ll begin by exploring core operational concepts that DBREs need to master. Then you’ll examine a wide range of database persistence options, including how to implement key technologies to provide resilient, scalable, and performant data storage and retrieval. With a firm foundation in database reliability engineering, you’ll be ready to dive into the architecture and operations of any modern database. This book covers: Service-level requirements and risk management Building and evolving an architecture for operational visibility ...
FDA toxicity databases and real-time data entry

International Nuclear Information System (INIS)

Arvidson, Kirk B.

2008-01-01

Structure-searchable electronic databases are valuable new tools that are assisting the FDA in its mission to promptly and efficiently review incoming submissions for regulatory approval of new food additives and food contact substances. The Center for Food Safety and Applied Nutrition's Office of Food Additive Safety (CFSAN/OFAS), in collaboration with Leadscope, Inc., is consolidating genetic toxicity data submitted in food additive petitions from the 1960s to the present day. The Center for Drug Evaluation and Research, Office of Pharmaceutical Science's Informatics and Computational Safety Analysis Staff (CDER/OPS/ICSAS) is separately gathering similar information from their submissions. Presently, these data are distributed in various locations such as paper files, microfiche, and non-standardized toxicology memoranda. The organization of the data into a consistent, searchable format will reduce paperwork, expedite the toxicology review process, and provide valuable information to industry that is currently available only to the FDA. Furthermore, by combining chemical structures with genetic toxicity information, biologically active moieties can be identified and used to develop quantitative structure-activity relationship (QSAR) modeling and testing guidelines. Additionally, chemicals devoid of toxicity data can be compared to known structures, allowing for improved safety review through the identification and analysis of structural analogs. Four database frameworks have been created: bacterial mutagenesis, in vitro chromosome aberration, in vitro mammalian mutagenesis, and in vivo micronucleus. Controlled vocabularies for these databases have been established. The four separate genetic toxicity databases are compiled into a single, structurally-searchable database for easy accessibility of the toxicity information. Beyond the genetic toxicity databases described here, additional databases for subchronic, chronic, and teratogenicity studies have been prepared
Measure of Node Similarity in Multilayer Networks

DEFF Research Database (Denmark)

Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

2016-01-01

university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data...... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dissimilarity for the nodes connected by the strongest...
Solving Relational Database Problems with ORDBMS in an Advanced Database Course

Science.gov (United States)

Wang, Ming

2011-01-01

This paper introduces how to use the object-relational database management system (ORDBMS) to solve relational database (RDB) problems in an advanced database course. The purpose of the paper is to provide a guideline for database instructors who desire to incorporate the ORDB technology in their traditional database courses. The paper presents…
Kingfisher: a system for remote sensing image database management

Science.gov (United States)

Bruzzo, Michele; Giordano, Ferdinando; Dellepiane, Silvana G.

2003-04-01

At present, retrieval methods for remote sensing image databases are mainly based on spatial-temporal information. The increasing number of images collected by the ground stations of earth observing systems emphasizes the need for database management with intelligent data retrieval capabilities. The purpose of the proposed method is to realize a new content-based retrieval system for remote sensing image databases, with an innovative search tool based on image similarity. This methodology is quite innovative for this application: many systems exist at present for photographic images, for example QBIC and IKONA, but they are not able to properly extract and describe remote sensing image content. The target database is an archive of images originating from an X-SAR sensor (spaceborne mission, 1994). The best content descriptors, mainly texture parameters, guarantee high retrieval performance and can be extracted without loss independently of image resolution. The latter property allows the DBMS (Database Management System) to process a small amount of information, as in the case of quick-look images, improving time performance and memory access without reducing retrieval accuracy. The matching technique has been designed to enable image management (database population and retrieval) independently of image dimensions (width and height). Local and global content descriptors are compared with the query image during the retrieval phase, and the results seem very encouraging.
Generalized Database Management System Support for Numeric Database Environments.

Science.gov (United States)

Dominick, Wayne D.; Weathers, Peggy G.

1982-01-01

This overview of potential for utilizing database management systems (DBMS) within numeric database environments highlights: (1) major features, functions, and characteristics of DBMS; (2) applicability to numeric database environment needs and user needs; (3) current applications of DBMS technology; and (4) research-oriented and…
License - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Trypanoso... Attribution-Share Alike 2.1 Japan. If you use data from this database, please be sure to attribute this database as follows: Trypanoso...
Use of a German longitudinal prescription database (LRx) in pharmacoepidemiology.

Science.gov (United States)

Richter, Hartmut; Dombrowski, Silvia; Hamer, Hajo; Hadji, Peyman; Kostev, Karel

2015-01-01

Large epidemiological databases are often used to examine matters pertaining to drug utilization, health services, and drug safety. The major strength of such databases is that they include large sample sizes, which allow precise estimates to be made. The IMS® LRx database has in recent years been used as a data source for epidemiological research. The aim of this paper is to review a number of recent studies published with the aid of this database and compare these with the results of similar studies using independent data published in the literature. In spite of being somewhat limited to studies for which comparative independent results were available, it was possible to include a wide range of possible uses of the LRx database in a variety of therapeutic fields: prevalence/incidence rate determination (diabetes, epilepsy), persistence analyses (diabetes, osteoporosis), use of comedication (diabetes), drug utilization (G-CSF market) and treatment costs (diabetes, G-CSF market). In general, the results of the LRx studies were found to be clearly in line with previously published reports. In some cases, noticeable discrepancies between the LRx results and the literature data were found (e.g. prevalence in epilepsy, persistence in osteoporosis) and these were discussed and possible reasons presented. Overall, it was concluded that the IMS® LRx database forms a suitable database for pharmacoepidemiological studies.
Federal databases

International Nuclear Information System (INIS)

Welch, M.J.; Welles, B.W.

1988-01-01

Accident statistics on all modes of transportation are available as risk assessment analytical tools through several federal agencies. This paper reports on the examination of the accident databases by personal contact with the federal staff responsible for administration of the database programs. This activity, sponsored by the Department of Energy through Sandia National Laboratories, is an overview of the national accident data on highway, rail, air, and marine shipping. For each mode, the definition or reporting requirements of an accident are determined and the method of entering the accident data into the database is established. Availability of the database to others, ease of access, costs, and who to contact were prime questions put to each of the database program managers. Additionally, how the agency uses the accident data was of major interest.
The YH database: the first Asian diploid genome database

DEFF Research Database (Denmark)

Li, Guoqing; Ma, Lijia; Song, Chao

2009-01-01

genome consensus. The YH database is currently one of the three personal genome databases, organizing the original data and analysis results in a user-friendly interface, which is an endeavor to achieve fundamental goals for establishing personal medicine. The database is available at http://yh.genomics.org.cn....
Database Description - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available tRNAD...B-CE Database Description General information of database Database name tRNADB-CE Alter...CC BY-SA Detail Background and funding Name: MEXT Integrated Database Project Reference(s) Article title: tRNAD... 2009 Jan;37(Database issue):D163-8. External Links: Article title: tRNADB-CE 2011: tRNA gene database curat...

Towards novel organic high-Tc superconductors: Data mining using density of states similarity search

Science.gov (United States)

Geilhufe, R. Matthias; Borysov, Stanislav S.; Kalpakchi, Dmytro; Balatsky, Alexander V.

2018-02-01

Identifying novel functional materials with desired key properties is an important part of bridging the gap between fundamental research and technological advancement. In this context, high-throughput calculations combined with data-mining techniques have greatly accelerated this process in different areas of research in recent years. The strength of a data-driven approach for materials prediction lies in narrowing down the search space of thousands of materials to a subset of prospective candidates. Recently, the open-access organic materials database OMDB was released, providing electronic structure data for thousands of previously synthesized three-dimensional organic crystals. Based on the OMDB, we report on the implementation of a novel density of states similarity search tool which is capable of retrieving materials with a density of states similar to that of a reference material. The tool is based on the approximate nearest neighbor algorithm as implemented in the ANNOY library and can be applied via the OMDB web interface. The approach presented here is wide ranging and can be applied to various problems where the density of states is responsible for certain key properties of a material. As the first application, we report on materials exhibiting electronic structure similarities to the aromatic hydrocarbon p-terphenyl, which was recently discussed as a potential organic high-temperature superconductor exhibiting a transition temperature on the order of 120 K under strong potassium doping. Although the mechanism driving the remarkable transition temperature remains under debate, we argue that the density of states, reflecting the electronic structure of a material, might serve as a crucial ingredient for the observed high Tc. To provide candidates which might exhibit comparable properties, we present 15 purely organic materials with similar features to p-terphenyl within the electronic structure, which also tend to have structural similarities with p
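
As a rough illustration of the density of states similarity search described in this record, the sketch below ranks materials by how closely their DOS curve matches a reference curve. It is not the OMDB implementation: the abstract names an approximate nearest neighbor search from the ANNOY library, whereas this sketch swaps in an exact brute-force cosine similarity. The material names and DOS values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length DOS vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(reference, database, top_k=2):
    """Rank database entries by similarity to the reference DOS (exact search)."""
    scored = [(cosine_similarity(reference, dos), name)
              for name, dos in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical DOS values sampled on a shared energy grid (illustrative only).
dos_database = {
    "material_A": [0.1, 0.8, 0.3, 0.0],
    "material_B": [0.9, 0.1, 0.0, 0.4],
    "material_C": [0.2, 0.7, 0.4, 0.1],
}
reference_dos = [0.1, 0.9, 0.3, 0.0]

print(most_similar(reference_dos, dos_database))
```

An exact search like this scales linearly with the number of materials per query, which is presumably why an approximate index is preferred for a database of thousands of crystals.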
Uniform standards for genome databases in forest and fruit trees

Science.gov (United States)

TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and recovery of this information in order to enable comparative genomics research. Large-scale genotype...
Lung Cancer Signature Biomarkers: tissue specific semantic similarity based clustering of Digital Differential Display (DDD) data

Directory of Open Access Journals (Sweden)

Srivastava Mousami

2012-11-01

Full Text Available Abstract Background The tissue-specific Unigene Sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used the ‘Gene Ontology semantic similarity score’ to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. This semantic similarity score matrix, based on hierarchical clustering, is represented in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping. Multiple bootstrapping also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Results Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panel 1–4). Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia regulated biomarkers (panel 3) and lung extra cellular matrix biomarkers (panel 4). Conclusions Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1
Network-based statistical comparison of citation topology of bibliographic databases

Science.gov (United States)

Šubelj, Lovro; Fiala, Dalibor; Bajec, Marko

2014-01-01

Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science, CiteSeer and arXiv.org. The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies. PMID:25263231
The Development of a Combined Search for a Heterogeneous Chemistry Database

Directory of Open Access Journals (Sweden)

Lulu Jiang

2015-05-01

Full Text Available A combined search, which joins a slow molecule structure search with a fast compound property search, results in more accurate search results and has been applied in several chemistry databases. However, the problems of search speed differences and combining the two separate search results are two major challenges. In this paper, two kinds of search strategies, synchronous search and asynchronous search, are proposed to solve these problems in the heterogeneous structure database and the property database found in ChemDB, a chemistry database owned by the Institute of Process Engineering, CAS. Their advantages and disadvantages under different conditions are discussed in detail. Furthermore, we applied these two searches to ChemDB and used them to screen for potential molecules that can work as CO2 absorbents. The results reveal that this combined search discovers reasonable target molecules within an acceptable time frame.
Testing statistical significance scores of sequence comparison methods with structure similarity

Directory of Open Access Journals (Sweden)

Leunissen Jack AM

2006-10-01

Full Text Available Abstract Background In recent years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is determined not only by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested in whether this could be validated when applied to existing, evolutionarily related protein sequences. Results All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
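
To make the Monte-Carlo Z-score mentioned in this record concrete, here is a minimal sketch, not the CluSTr or Protein World implementation. It assumes a simple scoring scheme (match +2, mismatch -1, linear gap -2) and builds the null distribution by shuffling the subject sequence; both choices, along with the example sequence, are assumptions made for illustration.

```python
import random
import statistics

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (assumed scheme)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def z_score(query, subject, shuffles=200, seed=0):
    """Z-score of the observed score against scores for shuffled subjects."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = smith_waterman_score(query, subject)
    letters = list(subject)
    null_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)
        null_scores.append(smith_waterman_score(query, "".join(letters)))
    mean = statistics.mean(null_scores)
    spread = statistics.pstdev(null_scores)
    if spread == 0:
        return 0.0  # degenerate null distribution: no meaningful Z-score
    return (observed - mean) / spread

# Hypothetical peptide, aligned against itself for illustration.
peptide = "MKVLAAGIVGLLLAQWERTY"
print(z_score(peptide, peptide))
```

Each Z-score costs one alignment per shuffle, which is the extra computation the abstract weighs against the analytically derived e-value.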
Database Administrator

Science.gov (United States)

Moore, Pam

2010-01-01

The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…
Databases as policy instruments. About extending networks as evidence-based policy

Directory of Open Access Journals (Sweden)

Stoevelaar Herman

2007-12-01

Full Text Available Abstract Background This article seeks to identify the role of databases in health policy. Access to information and communication technologies has changed traditional relationships between the state and professionals, creating new systems of surveillance and control. As a result, databases may have a profound effect on controlling clinical practice. Methods We conducted three case studies to reconstruct the development and use of databases as policy instruments. Each database was intended to be employed to control the use of one particular pharmaceutical in the Netherlands (growth hormone, antiretroviral drugs for HIV and Taxol, respectively). We studied the archives of the Dutch Health Insurance Board, conducted in-depth interviews with key informants and organized two focus groups, all focused on the use of databases both in policy circles and in clinical practice. Results Our results demonstrate that policy makers hardly used the databases, neither for cost control nor for quality assurance. Further analysis revealed that these databases facilitated self-regulation and quality assurance by (national) bodies of professionals, resulting in restrictive prescription behavior amongst physicians. Conclusion The databases fulfill control functions that were formerly located within the policy realm. The databases facilitate collaboration between policy makers and physicians, since they enable quality assurance by professionals. Delegating regulatory authority downwards into a network of physicians who control the use of pharmaceuticals seems to be a good alternative for centralized control on the basis of monitoring data.
Ocean Acidification Experiments in Large-Scale Mesocosms Reveal Similar Dynamics of Dissolved Organic Matter Production and Biotransformation

Directory of Open Access Journals (Sweden)

Maren Zark

2017-09-01

Full Text Available Dissolved organic matter (DOM) represents a major reservoir of carbon in the oceans. Environmental stressors such as ocean acidification (OA) potentially affect DOM production and degradation processes, e.g., phytoplankton exudation or microbial uptake and biotransformation of molecules. Resulting changes in the carbon storage capacity of the ocean may thus cause feedbacks on the global carbon cycle. Previous experiments studying OA effects on the DOM pool under natural conditions, however, were mostly conducted in temperate and coastal eutrophic areas. Here, we report on OA effects on the existing and newly produced DOM pool during an experiment in the subtropical North Atlantic Ocean at the Canary Islands (1) during an oligotrophic phase and (2) after simulated deep water upwelling. The latter is a frequently occurring event in this region controlling nutrient and phytoplankton dynamics. We manipulated nine large-scale mesocosms with a gradient of pCO2 ranging from ~350 up to ~1,030 μatm and monitored the DOM molecular composition using ultrahigh-resolution mass spectrometry via Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS). An increase of 37 μmol L−1 DOC was observed in all mesocosms during a phytoplankton bloom induced by simulated upwelling. Indications for enhanced DOC accumulation under elevated CO2 became apparent during a phase of nutrient recycling toward the end of the experiment. The production of DOM was reflected in changes of the molecular DOM composition. Out of the 7,212 molecular formulae, which were detected throughout the experiment, ~50% correlated significantly in mass spectrometric signal intensity with cumulative bacterial protein production (BPP) and are likely a product of microbial transformation. However, no differences in the produced compounds were found with respect to CO2 levels. Comparing the results of this experiment with a comparable OA experiment in the Swedish Gullmar Fjord, reveals
Database Description - TMFunction | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available sidue (or mutant) in a protein. The experimental data are collected from the literature both by searching th...the sequence database, UniProt, structural database, PDB, and literature database
PrimateLit Database

Science.gov (United States)

Primate Info Net Related Databases NCRR PrimateLit: A bibliographic database for primatology Top of any problems with this service. We welcome your feedback. The PrimateLit database is no longer being Resources, National Institutes of Health. The database is a collaborative project of the Wisconsin Primate
INVESTIGATION OF MIS ITEM 011589A AND 3013 CONTAINERS HAVING SIMILAR CHARACTERISTICS

Energy Technology Data Exchange (ETDEWEB)

Friday, G

2006-08-23

Recent testing has identified the presence of hydrogen and oxygen in MIS Item 011589A. This isolated observation has raised concern regarding the potential for flammable gas mixtures in containers in the storage inventory. This study examines the known physicochemical characteristics of MIS Item 011589A and queries the ISP Database for items that are most similar or potentially similar. Items identified as most similar are believed to have the highest probability of being chemically and structurally identical to MIS Item 011589A. Items identified as potentially like MIS Item 011589A have some attributes in common and have the potential to generate gases, but have a lower probability of having similar gas generating characteristics. MIS Item 011589A is an oxide that was generated prior to 1990 at Rocky Flats in Building 707. It was associated with foundry processing and had an actinide assay of approximately 77%. Prompt gamma analysis of MIS Item 011589A indicated the presence of chloride, fluorine, magnesium, sodium, and aluminum. Queries based on MIS representation classification and process of origin were applied to the ISP Database. Evaluation criteria included binning classification (i.e., innocuous, pressure, or pressure and corrosion), availability of prompt gamma analyses, presence of chlorine and magnesium, percentage of chlorine by weight, peak ratios (i.e., Na:Cl and Mg:Na), moisture, and percent assay. These queries identified 15 items that were most similar and 106 items that were potentially like MIS Item 011589A. Although these queries identified containers that could potentially generate flammable gases, verification and confirmation can only be accomplished by destructive evaluation and testing of containers from the storage inventory.
Winnowing sequences from a database search.

Science.gov (United States)

Berman, P; Zhang, Z; Wolf, Y I; Koonin, E V; Miller, W

2000-01-01

In database searches for sequence similarity, matches to a distinct sequence region (e.g., protein domain) are frequently obscured by numerous matches to another region of the same sequence. In order to cope with this problem, algorithms are developed to discard redundant matches. One model for this problem begins with a list of intervals, each with an associated score; each interval gives the range of positions in the query sequence that align to a database sequence, and the score is that of the alignment. If interval I is contained in interval J, and I's score is less than J's, then I is said to be dominated by J. The problem is then to identify each interval that is dominated by at least K other intervals, where K is a given level of "tolerable redundancy." An algorithm is developed to solve the problem in O(N log N) time and O(N*) space, where N is the number of intervals and N* is a precisely defined value that never exceeds N and is frequently much smaller. This criterion for discarding database hits has been implemented in the Blast program, as illustrated herein with examples. Several variations and extensions of this approach are also described.
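
The dominance criterion defined in this record can be stated directly in code. The sketch below is a naive O(N²) check written only to illustrate the definition; it is not the O(N log N) algorithm the authors describe, and the hit list is hypothetical.

```python
def dominated_intervals(intervals, k):
    """Return indices of intervals dominated by at least k other intervals.

    Each interval is (start, end, score). Interval I is dominated by J when
    I is contained in J and I's score is strictly less than J's.
    """
    discarded = []
    for i, (s_i, e_i, score_i) in enumerate(intervals):
        dominators = sum(
            1
            for j, (s_j, e_j, score_j) in enumerate(intervals)
            if j != i and s_j <= s_i and e_i <= e_j and score_i < score_j
        )
        if dominators >= k:
            discarded.append(i)
    return discarded

# Hypothetical hits: (query start, query end, alignment score).
hits = [
    (10, 90, 200),
    (5, 100, 250),
    (20, 60, 80),
    (30, 50, 40),
    (150, 200, 120),
]

print(dominated_intervals(hits, k=2))
```

Raising K keeps more of the overlapping hits, while the isolated hit at positions 150 to 200 is never discarded because no other interval contains it.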
Analysis of HIV-1 intersubtype recombination breakpoints suggests region with high pairing probability may be a more fundamental factor than sequence similarity affecting HIV-1 recombination.

Science.gov (United States)

Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun

2016-09-21

With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. Our
Protein backbone angle restraints from searching a database for chemical shift and sequence homology

Energy Technology Data Exchange (ETDEWEB)

Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

1999-03-15

Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13Cα, 13Cβ, 13C', 1Hα and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar φ and ψ backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15°. Approximately 3% of the predictions made by TALOS are found to be in error.
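
The final averaging step described in this record, which uses the mean φ and ψ of the best-matching triplets only when they agree, might be sketched as follows. This is not TALOS code: the 30° agreement tolerance, the circular-mean averaging and the function names are assumptions chosen for illustration, since the abstract does not specify how similarity of the angles is judged.

```python
import math

def circular_mean_deg(angles):
    """Mean of angles in degrees that respects the wrap-around at +/-180."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))

def max_deviation_deg(angles, center):
    """Largest absolute angular distance (degrees) from the center."""
    return max(abs((a - center + 180.0) % 360.0 - 180.0) for a in angles)

def angular_restraint(matches, tolerance=30.0):
    """Average (phi, psi) over the matches if they agree, else return None.

    `matches` holds the (phi, psi) angles of the central residues of the
    best-matching triplets; `tolerance` is a hypothetical consistency cutoff.
    """
    phis = [phi for phi, _ in matches]
    psis = [psi for _, psi in matches]
    phi_mean = circular_mean_deg(phis)
    psi_mean = circular_mean_deg(psis)
    if (max_deviation_deg(phis, phi_mean) > tolerance
            or max_deviation_deg(psis, psi_mean) > tolerance):
        return None  # matches disagree: no restraint is emitted
    return phi_mean, psi_mean

# Hypothetical helical matches that agree closely with one another.
helical = [(-60.0, -45.0), (-65.0, -40.0), (-55.0, -50.0)]
print(angular_restraint(helical))
```

A circular mean is used because backbone angles near ±180° would otherwise average to a meaningless value close to 0°.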
SPTEdb: a database for transposable elements in salicaceous plants

Science.gov (United States)

Jia, Zirui; Xiao, Yao; Ma, Wenjun; Wang, Junhui

2018-01-01

Abstract Although transposable elements (TEs) play significant roles in the structural, functional and evolutionary dynamics of the salicaceous plant genome, the accurate identification, definition and classification of TEs are still inadequate. In this study, we identified 18 393 TEs from Populus trichocarpa, Populus euphratica and Salix suchowensis using a combination of signature-based, similarity-based and de novo methods, and annotated them into 1621 families. A comprehensive and user-friendly web-based database, SPTEdb, was constructed and made available to researchers. SPTEdb enables users to browse, retrieve and download the TE sequences from the database. Meanwhile, several analysis tools, including BLAST, HMMER, GetORF and Cut sequence, were also integrated into SPTEdb to help users mine the TE data easily and effectively. In summary, SPTEdb will facilitate the study of TE biology and functional genomics in salicaceous plants. Database URL: http://genedenovoweb.ticp.net:81/SPTEdb/index.php PMID:29688371
License - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database License to Use This Database Last updated : 2010/02/15 You may use this database...nal License described below. The Standard License specifies the license terms regarding the use of this database... and the requirements you must follow in using this database. The Additional ...the Standard License. Standard License The Standard License for this database is the license specified in th...e Creative Commons Attribution-Share Alike 2.1 Japan . If you use data from this database
Hmrbase: a database of hormones and their receptors

Science.gov (United States)

Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

2009-01-01

Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, Drug
NoSQL database scaling

OpenAIRE

Žardin, Norbert

2017-01-01

NoSQL database scaling is a decision in which system resources or financial expenses are traded for database performance or other benefits. Scaling a database may increase or decrease its performance and resource usage, and such changes might have a negative impact on an application that uses the database. This work analyzes how database scaling affects database resource usage and performance. As a result, calculations are acquired, using which database scaling types and differe...
Self-aligning and compressed autosophy video databases

Science.gov (United States)

Holtz, Klaus E.

1993-04-01

Autosophy, an emerging new science, explains 'self-assembling structures,' such as crystals or living trees, in mathematical terms. This research provides a new mathematical theory of 'learning' and a new 'information theory' which permits the growing of self-assembling data networks in a computer memory, similar to the growing of 'data crystals' or 'data trees', without data processing or programming. Autosophy databases are educated very much like a human child to organize their own internal data storage. Input patterns, such as written questions or images, are converted to points in a mathematical omni dimensional hyperspace. The input patterns are then associated with output patterns, such as written answers or images. Omni dimensional information storage will result in enormous data compression because each pattern fragment is only stored once. Pattern recognition in the text or image files is greatly simplified by the peculiar omni dimensional storage method. Video databases will absorb input images from a TV camera and associate them with textual information. The 'black box' operations are totally self-aligning where the input data will determine their own hyperspace storage locations. Self-aligning autosophy databases may lead to a new generation of brain-like devices.

High-throughput STR analysis for DNA database using direct PCR.

Science.gov (United States)

Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

2013-07-01

Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of DNA extraction required due to direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.
Big Data and Total Hip Arthroplasty: How Do Large Databases Compare?

Science.gov (United States)

Bedard, Nicholas A; Pugely, Andrew J; McHugh, Michael A; Lux, Nathan R; Bozic, Kevin J; Callaghan, John J

2018-01-01

Use of large databases for orthopedic research has become extremely popular in recent years. Each database varies in the methods used to capture data and the population it represents. The purpose of this study was to evaluate how these databases differed in reported demographics, comorbidities, and postoperative complications for primary total hip arthroplasty (THA) patients. Primary THA patients were identified within the National Surgical Quality Improvement Program (NSQIP), Nationwide Inpatient Sample (NIS), Medicare Standard Analytic Files (MED), and Humana administrative claims database (HAC). NSQIP definitions for comorbidities and complications were matched to corresponding International Classification of Diseases, 9th Revision/Current Procedural Terminology codes to query the other databases. Demographics, comorbidities, and postoperative complications were compared. The number of patients from each database was 22,644 in HAC, 371,715 in MED, 188,779 in NIS, and 27,818 in NSQIP. Age and gender distribution were clinically similar. Overall, there was variation in prevalence of comorbidities and rates of postoperative complications between databases. As an example, the prevalence of obesity in NSQIP was more than twice that in NIS. The prevalence of diabetes in HAC and MED was more than 2 times that in NSQIP. Rates of deep infection and stroke 30 days after THA had more than 2-fold difference between all databases. Among databases commonly used in orthopedic research, there is considerable variation in complication rates following THA depending upon the database used for analysis. It is important to consider these differences when critically evaluating database research. Additionally, with the advent of bundled payments, these differences must be considered in risk adjustment models. Copyright © 2017 Elsevier Inc. All rights reserved.
DataBase on Demand

International Nuclear Information System (INIS)

Aparicio, R Gaspar; Gomez, D; Wojcik, D; Coz, I Coterillo

2012-01-01

At CERN a number of key database applications are running on user-managed MySQL database services. The database on demand project was born out of an idea to provide the CERN user community with an environment to develop and run database services outside of the actual centralised Oracle based database services. The Database on Demand (DBoD) empowers the user to perform certain actions that had been traditionally done by database administrators, DBA's, providing an enterprise platform for database applications. It also allows the CERN user community to run different database engines, e.g. presently open community version of MySQL and single instance Oracle database server. This article describes a technology approach to face this challenge, a service level agreement, the SLA that the project provides, and an evolution of possible scenarios.
Low dose CT image restoration using a database of image patches

Science.gov (United States)

Ha, Sungsoo; Mueller, Klaus

2015-01-01

Reducing the radiation dose in CT imaging has become an active research topic and many solutions have been proposed to remove the significant noise and streak artifacts in the reconstructed images. Most of these methods operate within the domain of the image that is subject to restoration. This, however, poses limitations on the extent of filtering possible. We advocate to take into consideration the vast body of external knowledge that exists in the domain of already acquired medical CT images, since after all, this is what radiologists do when they examine these low quality images. We can incorporate this knowledge by creating a database of prior scans, either of the same patient or a diverse corpus of different patients, to assist in the restoration process. Our paper follows up on our previous work that used a database of images. Using images, however, is challenging since it requires tedious and error prone registration and alignment. Our new method eliminates these problems by storing a diverse set of small image patches in conjunction with a localized similarity matching scheme. We also empirically show that it is sufficient to store these patches without anatomical tags since their statistics are sufficiently strong to yield good similarity matches from the database and as a direct effect, produce image restorations of high quality. A final experiment demonstrates that our global database approach can recover image features that are difficult to preserve with conventional denoising approaches.
Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Full Data of Yeast Interacting Proteins Database (Origin...al Version) Data detail Data name Full Data of Yeast Interacting Proteins Database (Original Version) DOI 10....18908/lsdba.nbdc00742-004 Description of data contents The entire data in the Yeast Interacting Proteins Database...eir interactions are required. Several sources including YPD (Yeast Proteome Database, Costanzo, M. C., Hoga...ematic name in the SGD (Saccharomyces Genome Database; http://www.yeastgenome.org /). Bait gene name The gen
muBLASTP: database-indexed protein sequence search on multicore CPUs.

Science.gov (United States)

Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

2016-11-04

The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Because query indexing and database indexing pose different challenges and have different characteristics, the existing techniques for query-indexed search cannot be used in database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for protein databases and associated optimizations in the BLASTP algorithm, we refactored the BLASTP algorithm for modern multicore processors, achieving much higher throughput with an acceptable memory footprint for the database index.
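To make the distinction between query indexing and database indexing concrete, the following is a minimal sketch in Python of the database-indexed idea: the database is indexed once by its k-letter words, and each query is then scanned against that index to find seed matches. It is an illustration only, not muBLASTP's index structure. The function names `build_db_index` and `find_seeds`, the word length `k=3`, and the toy sequences are assumptions, and exact word matching stands in for the scored neighbouring-word matching that BLASTP uses.

```python
from collections import defaultdict

def build_db_index(db, k=3):
    """Map every k-letter word in the database to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in db.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index

def find_seeds(index, query, k=3):
    """Return exact-match seeds as (sequence id, database offset, query offset) tuples."""
    seeds = []
    for qpos in range(len(query) - k + 1):
        # Look the query word up in the prebuilt database index (hypothetical simplification:
        # exact matches only, no substitution-matrix scoring).
        for seq_id, dpos in index.get(query[qpos:qpos + k], ()):
            seeds.append((seq_id, dpos, qpos))
    return seeds

# Toy protein database; the sequences are made up for illustration.
db = {"P1": "MKVLAAGIV", "P2": "GGMKVLTT"}
index = build_db_index(db)
seeds = find_seeds(index, "AMKVL")
print(seeds)
```

The index is built once and reused for every query, which is where a database-indexed search can gain throughput on batches; the cost is the memory needed to hold the index.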
Energy Consumption Database

Science.gov (United States)

Consumption Database The California Energy Commission has created this on-line database for informal reporting ) classifications. The database also provides easy downloading of energy consumption data into Microsoft Excel (XLSX
Use of Graph Database for the Integration of Heterogeneous Biological Data.

Science.gov (United States)

Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young

2017-03-01

Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
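The contrast between multiple-join queries and native relationship traversal can be sketched in a few lines of Python. This is a toy in-memory adjacency list, not Neo4j or MySQL, and the node names, relationship labels, and the helper names `build_adjacency` and `find_path` are all hypothetical. It shows that a multi-hop question (which drug is linked to which disease) becomes a walk along stored relationships rather than one join per hop.

```python
from collections import deque

# Hypothetical heterogeneous relationships: (source, relationship type, target).
edges = [
    ("drugA", "targets", "gene1"),
    ("gene1", "interacts_with", "gene2"),
    ("gene2", "associated_with", "diseaseX"),
    ("drugB", "targets", "gene3"),
]

def build_adjacency(edges):
    """Store each node's outgoing relationships next to the node itself."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
    return adj

def find_path(adj, start, goal):
    """Breadth-first search; returns the shortest node path from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

adj = build_adjacency(edges)
print(find_path(adj, "drugA", "diseaseX"))
print(find_path(adj, "drugB", "diseaseX"))
```

In a relational schema each hop in this path would typically correspond to another join between tables, which is the pattern the abstract reports as slow for complex queries.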
Using a Semi-Realistic Database to Support a Database Course

Science.gov (United States)

Yue, Kwok-Bun

2013-01-01

A common problem for university relational database courses is to construct effective databases for instructions and assignments. Highly simplified "toy" databases are easily available for teaching, learning, and practicing. However, they do not reflect the complexity and practical considerations that students encounter in real-world…
Outsourced Similarity Search on Metric Data Assets

DEFF Research Database (Denmark)

Yiu, Man Lung; Assent, Ira; Jensen, Christian S.

2012-01-01

This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying...
Fire test database

International Nuclear Information System (INIS)

Lee, J.A.

1989-01-01

This paper describes a project recently completed for EPRI by Impell. The purpose of the project was to develop a reference database of fire tests performed on non-typical fire rated assemblies. The database is designed for use by utility fire protection engineers to locate test reports for power plant fire rated assemblies. As utilities prepare to respond to Information Notice 88-04, the database will identify utilities, vendors or manufacturers who have specific fire test data. The database contains fire test report summaries for 729 tested configurations. For each summary, a contact is identified from whom a copy of the complete fire test report can be obtained. Five types of configurations are included: doors, dampers, seals, wraps and walls. The database is computerized. One version for IBM; one for Mac. Each database is accessed through user-friendly software which allows adding, deleting, browsing, etc. through the database. There are five major database files. One each for the five types of tested configurations. The contents of each provides significant information regarding the test method and the physical attributes of the tested configuration. 3 figs
Artificial Radionuclides Database in the Pacific Ocean: HAM Database

Directory of Open Access Journals (Sweden)

Michio Aoyama

2004-01-01

Full Text Available The database “Historical Artificial Radionuclides in the Pacific Ocean and its Marginal Seas”, or HAM database, has been created. The database includes 90Sr, 137Cs, and 239,240Pu concentration data from the seawater of the Pacific Ocean and its marginal seas with some measurements from the sea surface to the bottom. The data in the HAM database were collected from about 90 literature citations, which include published papers; annual reports by the Hydrographic Department, Maritime Safety Agency, Japan; and unpublished data provided by individuals. The data of concentrations of 90Sr, 137Cs, and 239,240Pu cover the period 1957-1998. The present HAM database includes 7737 records for 137Cs concentration data, 3972 records for 90Sr concentration data, and 2666 records for 239,240Pu concentration data. The spatial variation of sampling stations in the HAM database is heterogeneous, namely, more than 80% of the data for each radionuclide is from the Pacific Ocean and the Sea of Japan, while a relatively small portion of data is from the South Pacific. This HAM database will allow us to use these radionuclides as significant chemical tracers for oceanographic study as well as the assessment of environmental effects of anthropogenic radionuclides for these 5 decades. Furthermore, these radionuclides can be used to verify the oceanic general circulation models in the time scale of several decades.
Databases and their application

NARCIS (Netherlands)

Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.

2013-01-01

During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The
Neurotree: a collaborative, graphical database of the academic genealogy of neuroscience.

Science.gov (United States)

David, Stephen V; Hayden, Benjamin Y

2012-01-01

Neurotree is an online database that documents the lineage of academic mentorship in neuroscience. Modeled on the tree format typically used to describe biological genealogies, the Neurotree web site provides a concise summary of the intellectual history of neuroscience and relationships between individuals in the current neuroscience community. The contents of the database are entirely crowd-sourced: any internet user can add information about researchers and the connections between them. As of July 2012, Neurotree has collected information from 10,000 users about 35,000 researchers and 50,000 mentor relationships, and continues to grow. The present report serves to highlight the utility of Neurotree as a resource for academic research and to summarize some basic analysis of its data. The tree structure of the database permits a variety of graphical analyses. We find that the connectivity and graphical distance between researchers entered into Neurotree early has stabilized and thus appears to be mostly complete. The connectivity of more recent entries continues to mature. A ranking of researcher fecundity based on their mentorship reveals a sustained period of influential researchers from 1850-1950, with the most influential individuals active at the later end of that period. Finally, a clustering analysis reveals that some subfields of neuroscience are reflected in tightly interconnected mentor-trainee groups.
Measure of Node Similarity in Multilayer Networks

DEFF Research Database (Denmark)

Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

2016-01-01

The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large...... university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data...... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest...
Development of Tsunami Trace Database with reliability evaluation on Japan coasts

International Nuclear Information System (INIS)

Iwabuchi, Yoko; Sugino, Hideharu; Imamura, Fumihiko; Imai, Kentaro; Tsuji, Yoshinobu; Matsuoka, Yuya; Shuto, Nobuo

2012-01-01

The purpose of this research is to develop a Tsunami Trace Database by collecting historical materials and documents concerning tsunamis that have hit Japan, taking into account the reliability of tsunami run-up and related data. Based on the acquisition and survey of references, tsunami trace data from the past 400 years in Japan were collected into a database, and the reliability of each trace record was evaluated according to the categorization of the Japan Society of Civil Engineers (2002). As a result, trace data can now be searched and filtered by reliability level when used for verification of tsunami numerical analysis and estimation of tsunami sources. By analyzing this database, we have quantitatively shown that the amount of reliable data tends to diminish as the data get older. (author)
Measure of Node Similarity in Multilayer Networks.

Directory of Open Access Journals (Sweden)

Anders Mollgaard

Full Text Available The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographics variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest links. We finally analyze the overlap between layers in the network for different levels of acquaintanceships.
Secondary analysis of a marketing research database reveals patterns in dairy product purchases over time.

Science.gov (United States)

Van Wave, Timothy W; Decker, Michael

2003-04-01

Development of a method using marketing research data to assess food purchase behavior and consequent nutrient availability for purposes of nutrition surveillance, evaluation of intervention effects, and epidemiologic studies of diet-health relationships. Data collected on household food purchases accrued over a 13-week period were selected by using Universal Product Code numbers and household characteristics from a marketing research database. Universal Product Code numbers for 39,408 dairy product purchases were linked to a standard reference for food composition to estimate the nutrient content of foods purchased over time. Two thousand one hundred sixty-one households located in Victoria, Texas, and surrounding communities who were active members of a frequent shopper program. Demographic characteristics of sample households and the nutrient content of their dairy product purchases were analyzed using frequency distribution, cross tabulation, analysis of variance, and t test procedures. A method for using marketing research data was successfully used to estimate household purchases of specific foods and their nutrient content from a marketing database containing hundreds of thousands of records. Distribution of dairy product purchases and their concomitant nutrients between Hispanic and non-Hispanic households were significant (P<.01, P<.001, respectively) and sustained over time. Purchase records from large, nationally representative panels of shoppers, such as those maintained by major market research companies, might be used to accomplish detailed longitudinal epidemiologic studies or surveillance of national food- and nutrient-purchasing patterns within and between countries and segments of their respective populations.
A New Reversible Database Watermarking Approach with Firefly Optimization Algorithm

Directory of Open Access Journals (Sweden)

Mustafa Bilgehan Imamoglu

2017-01-01

Full Text Available Up-to-date information is crucial in many fields such as medicine, science, and the stock market, where data should be distributed to clients from a centralized database. Shared databases are usually stored in data centers, from which they are distributed over an insecure public access network, the Internet. Sharing may result in a number of problems such as unauthorized copies, alteration of data, and distribution to unauthorized people for reuse. Researchers have proposed using watermarking to prevent these problems and claim digital rights. Many methods have been proposed recently to watermark databases to protect the digital rights of owners. In particular, optimization-based watermarking techniques, which result in lower distortion and improved watermark capacity, have drawn attention. Difference expansion watermarking (DEW) with the Firefly Algorithm (FFA), a bioinspired optimization technique, is proposed in this work to embed a watermark into relational databases. The attribute values that yield lower distortion and increased watermark capacity are selected efficiently by the FFA. Experimental results indicate that FFA has reduced complexity and results in less distortion and improved watermark capacity compared to similar works reported in the literature.
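The reversibility that difference expansion offers can be illustrated with a short Python sketch of the standard integer difference expansion transform on a pair of values. This is the textbook transform, assumed here for illustration; the abstract does not spell out the paper's exact variant, and the Firefly-based selection of attribute values and any overflow or distortion checks are not shown. The function names `de_embed` and `de_extract` are hypothetical.

```python
def de_embed(x, y, bit):
    """Hide one bit in an integer pair by expanding their difference."""
    avg = (x + y) // 2          # integer average, preserved by the transform
    diff = x - y
    diff_w = 2 * diff + bit     # expanded difference carries the bit in its lowest position
    return avg + (diff_w + 1) // 2, avg - diff_w // 2

def de_extract(xw, yw):
    """Recover the original pair and the hidden bit from a watermarked pair."""
    avg = (xw + yw) // 2
    diff_w = xw - yw
    bit = diff_w % 2            # lowest bit of the expanded difference
    diff = diff_w // 2          # original difference
    return avg + (diff + 1) // 2, avg - diff // 2, bit

# Toy attribute values, made up for illustration.
xw, yw = de_embed(206, 201, 1)
print(xw, yw)
print(de_extract(xw, yw))
```

Because the integer average is unchanged and the original difference is recoverable from the expanded one, extraction restores the original values exactly. The distortion introduced grows with the size of the original difference, which is why the choice of which values to pair and mark matters.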
Database Optimizing Services

Directory of Open Access Journals (Sweden)

Adrian GHENCEA

2010-12-01

Full Text Available Almost every organization has at its centre a database. The database provides support for conducting different activities, whether it is production, sales and marketing or internal operations. Every day, a database is accessed for help in strategic decisions. The satisfaction therefore of such needs is entailed with a high quality security and availability. Those needs can be realised using a DBMS (Database Management System which is, in fact, software for a database. Technically speaking, it is software which uses a standard method of cataloguing, recovery, and running different data queries. DBMS manages the input data, organizes it, and provides ways of modifying or extracting the data by its users or other programs. Managing the database is an operation that requires periodical updates, optimizing and monitoring.

Outsourced similarity search on metric data assets

KAUST Repository

Yiu, Man Lung

2012-02-01

This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low-initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying it to the service provider for similarity queries on the transformed data. Our techniques provide interesting trade-offs between query cost and accuracy. They are then further extended to offer an intuitive privacy guarantee. Empirical studies with real data demonstrate that the techniques are capable of offering privacy while enabling efficient and accurate processing of similarity queries.
Update History of This Database - RED | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us RED Update History of This Database Date Update contents 2015/12/21 Rice Expression Database English archi...s Database Database Description Download License Update History of This Database Site Policy | Contact Us Update History of This Database - RED | LSDB Archive ... ...ve site is opened. 2000/10/1 Rice Expression Database ( http://red.dna.affrc.go.jp/RED/ ) is opened. About Thi
GRIP Database original data - GRIPDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us GRI...PDB GRIP Database original data Data detail Data name GRIP Database original data DOI 10....18908/lsdba.nbdc01665-006 Description of data contents GRIP Database original data It consists of data table...s and sequences. Data file File name: gripdb_original_data.zip File URL: ftp://ftp.biosciencedbc.jp/archive/gripdb/LATEST/gri...e Database Description Download License Update History of This Database Site Policy | Contact Us GRIP Database original data - GRIPDB | LSDB Archive ...
CW EPR parameters reveal cytochrome P450 ligand binding modes.

Science.gov (United States)

Lockart, Molly M; Rodriguez, Carlo A; Atkins, William M; Bowman, Michael K

2018-06-01

Cytochrome P450 (CYP) monooxygenases utilize heme cofactors to catalyze oxidation reactions. They play a critical role in metabolism of many classes of drugs, are an attractive target for drug development, and mediate several prominent drug interactions. Many substrates and inhibitors alter the spin state of the ferric heme by displacing the heme's axial water ligand in the resting enzyme to yield a five-coordinate iron complex, or they replace the axial water to yield a nitrogen-ligated six-coordinate iron complex; these states are traditionally assigned by UV-vis spectroscopy. However, crystal structures and recent pulsed electron paramagnetic resonance (EPR) studies find a few cases where molecules hydrogen bond to the axial water. The water-bridged drug-H2O-heme complexes have UV-vis spectra similar to those of nitrogen-ligated, six-coordinate complexes, but are closer to the "reverse type I" complexes described in older literature. Here, pulsed and continuous wave (CW) EPR demonstrate that water-bridged complexes are remarkably common among a range of nitrogenous drugs or drug fragments that bind to CYP3A4 or CYP2C9. Principal component analysis reveals a distinct clustering of CW EPR spectral parameters for water-bridged complexes. CW EPR reveals heterogeneous mixtures of ligated states, including multiple directly-coordinated complexes and water-bridged complexes. These results suggest that water-bridged complexes are under-represented in CYP structural databases and can have energies similar to other ligation modes. The data indicate that water-bridged binding modes can be identified and distinguished from directly-coordinated binding by CW EPR. Copyright © 2018 Elsevier Inc. All rights reserved.
Is the phonological similarity effect in working memory due to proactive interference?

Science.gov (United States)

Baddeley, Alan D; Hitch, Graham J; Quinlan, Philip T

2018-04-12

Immediate serial recall of verbal material is highly sensitive to impairment attributable to phonological similarity. Although this has traditionally been interpreted as a within-sequence similarity effect, Engle (2007) proposed an interpretation based on interference from prior sequences, a phenomenon analogous to that found in the Peterson short-term memory (STM) task. We use the method of serial reconstruction to test this in an experiment contrasting the standard paradigm in which successive sequences are drawn from the same set of phonologically similar or dissimilar words and one in which the vowel sound on which similarity is based is switched from trial to trial, a manipulation analogous to that producing release from PI in the Peterson task. A substantial similarity effect occurs under both conditions although there is a small advantage from switching across similar sequences. There is, however, no evidence for the suggestion that the similarity effect will be absent from the very first sequence tested. Our results support the within-sequence similarity rather than a between-list PI interpretation. Reasons for the contrast with the classic Peterson short-term forgetting task are briefly discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Update History of This Database - RPD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...List Contact us RPD Update History of This Database Date Update contents 2016/02/02 Rice Proteome Database English archi...s Database Database Description Download License Update History of This Database Site Policy | Contact Us Update History of This Database - RPD | LSDB Archive ... ...ve site is opened. 2003/01/07 Rice Proteome Database ( http://gene64.dna.affrc.go.jp/RPD/ ) is opened. About Thi
Brasilia’s Database Administrators

Directory of Open Access Journals (Sweden)

Jane Adriana

2016-06-01

Full Text Available Database administration has gained an essential role in the management of new database technologies. Different data models are being created for supporting the enormous data volume, from the traditional relational database. These new models are called NoSQL (Not only SQL) databases. The adoption of best practices and procedures has become essential for the operation of database management systems. Thus, this paper investigates some of the techniques and tools used by database administrators. The study highlights features and particularities in databases within the area of Brasilia, the capital of Brazil. The results indicate which new technologies in database management are currently the most relevant, as well as the central issues in this area.
Development of the severe accident risk information database management system SARD

International Nuclear Information System (INIS)

Ahn, Kwang Il; Kim, Dong Ha

2003-01-01

The main purpose of this report is to introduce essential features and functions of a severe accident risk information management system, SARD (Severe Accident Risk Database Management System) version 1.0, which has been developed in Korea Atomic Energy Research Institute, and database management and data retrieval procedures through the system. The present database management system has powerful capabilities that can store automatically and manage systematically the plant-specific severe accident analysis results for core damage sequences leading to severe accidents, and search intelligently the related severe accident risk information. For that purpose, the present database system mainly takes into account the plant-specific severe accident sequences obtained from the Level 2 Probabilistic Safety Assessments (PSAs), base case analysis results for various severe accident sequences (such as code responses and summary for key-event timings), and related sensitivity analysis results for key input parameters/models employed in the severe accident codes. Accordingly, the present database system can be effectively applied in supporting the Level 2 PSA of similar plants, for fast prediction and intelligent retrieval of the required severe accident risk information for the specific plant whose information was previously stored in the database system, and development of plant-specific severe accident management strategies
Development of the severe accident risk information database management system SARD

Energy Technology Data Exchange (ETDEWEB)

Ahn, Kwang Il; Kim, Dong Ha

2003-01-01

The main purpose of this report is to introduce essential features and functions of a severe accident risk information management system, SARD (Severe Accident Risk Database Management System) version 1.0, which has been developed in Korea Atomic Energy Research Institute, and database management and data retrieval procedures through the system. The present database management system has powerful capabilities that can store automatically and manage systematically the plant-specific severe accident analysis results for core damage sequences leading to severe accidents, and search intelligently the related severe accident risk information. For that purpose, the present database system mainly takes into account the plant-specific severe accident sequences obtained from the Level 2 Probabilistic Safety Assessments (PSAs), base case analysis results for various severe accident sequences (such as code responses and summary for key-event timings), and related sensitivity analysis results for key input parameters/models employed in the severe accident codes. Accordingly, the present database system can be effectively applied in supporting the Level 2 PSA of similar plants, for fast prediction and intelligent retrieval of the required severe accident risk information for the specific plant whose information was previously stored in the database system, and development of plant-specific severe accident management strategies.
Do similarities or differences between CEO leadership and organizational culture have a more positive effect on firm performance? A test of competing predictions.

Science.gov (United States)

Hartnell, Chad A; Kinicki, Angelo J; Lambert, Lisa Schurer; Fugate, Mel; Doyle Corner, Patricia

2016-06-01

This study examines the nature of the interaction between CEO leadership and organizational culture using 2 common metathemes (task and relationship) in leadership and culture research. Two perspectives, similarity and dissimilarity, offer competing predictions about the fit, or interaction, between leadership and culture and its predicted effect on firm performance. Predictions for the similarity perspective draw upon attribution theory and social identity theory of leadership, whereas predictions for the dissimilarity perspective are developed based upon insights from leadership contingency theories and the notion of substitutability. Hierarchical regression results from 114 CEOs and 324 top management team (TMT) members failed to support the similarity hypotheses but revealed broad support for the dissimilarity predictions. Findings suggest that culture can serve as a substitute for leadership when leadership behaviors are redundant with cultural values (i.e., they both share a task- or relationship-oriented focus). Findings also support leadership contingency theories indicating that CEO leadership is effective when it provides psychological and motivational resources lacking in the organization's culture. We discuss theoretical and practical implications and delineate directions for future research. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Fast protein tertiary structure retrieval based on global surface shape similarity.

Science.gov (United States)

Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke

2008-09-01

Characterization and identification of similar tertiary structure of proteins provides rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real-time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.
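The speed-up described above comes from reducing each structure to a fixed-length vector, so that a database search becomes a ranking by vector distance. The following is a minimal sketch of that idea only: the Euclidean distance, the three-element toy vectors, and the names rank_by_descriptor and toy_db are illustrative assumptions, since the abstract does not state the metric or the descriptor length.

```python
import math

def rank_by_descriptor(query, database, top_k=5):
    """Return the top_k (name, distance) pairs closest to the query vector.

    Assumption: descriptors are compared by Euclidean distance.
    """
    scored = [(name, math.dist(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]

# Hypothetical descriptors; real 3D Zernike descriptors are much longer.
toy_db = {
    "protA": [0.10, 0.90, 0.30],
    "protB": [0.80, 0.20, 0.50],
    "protC": [0.10, 0.80, 0.35],
}
hits = rank_by_descriptor([0.10, 0.85, 0.30], toy_db, top_k=2)
for name, distance in hits:
    print(name, round(distance, 3))
```

Because every comparison is a single pass over a short vector, the cost of a query grows only linearly with the number of stored structures.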
A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations.

Science.gov (United States)

Spanier, A B; Caplan, N; Sosna, J; Acar, B; Joskowicz, L

2018-01-01

The goal of medical content-based image retrieval (M-CBIR) is to assist radiologists in the decision-making process by retrieving medical cases similar to a given image. One of the key interests of radiologists is lesions and their annotations, since the patient treatment depends on the lesion diagnosis. Therefore, a key feature of M-CBIR systems is the retrieval of scans with the most similar lesion annotations. To be of value, M-CBIR systems should be fully automatic to handle large case databases. We present a fully automatic end-to-end method for the retrieval of CT scans with similar liver lesion annotations. The input is a database of abdominal CT scans labeled with liver lesions, a query CT scan, and optionally one radiologist-specified lesion annotation of interest. The output is an ordered list of the database CT scans with the most similar liver lesion annotations. The method starts by automatically segmenting the liver in the scan. It then extracts a histogram-based feature vector from the segmented region, learns the features' relative importance, and ranks the database scans according to the relative importance measure. The main advantages of our method are that it fully automates the end-to-end querying process, that it uses simple and efficient techniques that are scalable to large datasets, and that it produces quality retrieval results using an unannotated CT scan. Our experimental results on 9 CT queries on a dataset of 41 volumetric CT scans from the 2014 Image CLEF Liver Annotation Task yield an average retrieval accuracy (Normalized Discounted Cumulative Gain index) of 0.77 and 0.84 without/with annotation, respectively. Fully automatic end-to-end retrieval of similar cases based on image information alone, rather than on disease diagnosis, may help radiologists to better diagnose liver lesions.
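For readers unfamiliar with the accuracy measure quoted above, here is a minimal sketch of a Normalized Discounted Cumulative Gain computation. It assumes the common formulation with linear gain and a log2 rank discount; the abstract does not say which variant the authors used, and the relevance grades in the example are invented.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each grade is divided by log2(rank + 1),
    with ranks counted from 1 (hence rank + 2 for a 0-based index)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG of the returned order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades of retrieved scans, in the order returned.
print(ndcg([3, 2, 1]))  # ideal ordering
print(ndcg([1, 2, 3]))  # worst ordering of the same items
```

A score of 1 means the retrieved list is already in the best possible order; lower values penalise relevant cases that appear further down the list.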
Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

Science.gov (United States)

Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem

2016-01-01

Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest to improve the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have been focusing on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for using batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
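The search-space reduction described above can be illustrated with a two-stage lookup: the query is first scored against one representative per cluster, and only the members of the best-scoring clusters are then searched in full. This is a sketch under loud assumptions: plain numeric vectors and negative Euclidean distance stand in for profile-HMMs and alignment scores, and the names similarity, clustered_search and toy_clusters are hypothetical.

```python
import math

def similarity(a, b):
    """Stand-in score (higher is better); a real system would align
    a sequence against a profile-HMM instead."""
    return -math.dist(a, b)

def clustered_search(query, clusters, n_clusters=1):
    """Search only the members of the n_clusters best-matching clusters.

    Returns the best member name and the number of members compared.
    """
    ranked = sorted(
        clusters,
        key=lambda c: similarity(query, c["representative"]),
        reverse=True,
    )
    candidates = {}
    for cluster in ranked[:n_clusters]:
        candidates.update(cluster["members"])
    best = max(candidates, key=lambda name: similarity(query, candidates[name]))
    return best, len(candidates)

toy_clusters = [
    {"representative": [0.0, 0.0],
     "members": {"p1": [0.1, 0.0], "p2": [0.0, 0.2]}},
    {"representative": [5.0, 5.0],
     "members": {"p3": [5.1, 4.9], "p4": [4.8, 5.2], "p5": [5.5, 5.5]}},
]
best, compared = clustered_search([5.0, 4.9], toy_clusters)
print(best, compared)
```

Raising n_clusters trades speed for sensitivity, which is loosely analogous to the overlapping-cluster extension mentioned in the abstract.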
Reldata - a tool for reliability database management

International Nuclear Information System (INIS)

Vinod, Gopika; Saraf, R.K.; Babar, A.K.; Sanyasi Rao, V.V.S.; Tharani, Rajiv

2000-01-01

Component failure, repair and maintenance data are a very important element of any Probabilistic Safety Assessment study. The credibility of the results of such a study is enhanced if the data used are generated from the operating experience of similar power plants. Towards this objective, a computerised database has been designed, with fields such as date and time of failure, component name, failure mode, failure cause, ways of failure detection, reactor operating power status, repair times, down time, etc. This leads to evaluation of the plant-specific failure rate and the on-demand failure probability/unavailability for all components. Systematic data updating can provide real-time component reliability parameter statistics and trend analysis, which helps in planning maintenance strategies. A software package, RELDATA, has been developed, which incorporates the database management and data analysis methods. This report describes the software features and underlying methodology in detail. (author)
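The reliability parameters named above can be illustrated with simple point estimates. This is a sketch of textbook ratios, not RELDATA's actual methodology, which the abstract does not describe; the ComponentHistory fields and the pump figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ComponentHistory:
    """Hypothetical summary of one component's operating records."""
    failures_in_operation: int
    operating_hours: float
    failures_on_demand: int
    demands: int
    downtime_hours: float
    calendar_hours: float

def failure_rate(h):
    """Failures per operating hour (point estimate)."""
    return h.failures_in_operation / h.operating_hours

def demand_failure_probability(h):
    """Fraction of demands on which the component failed to respond."""
    return h.failures_on_demand / h.demands

def unavailability(h):
    """Fraction of calendar time spent out of service."""
    return h.downtime_hours / h.calendar_hours

pump = ComponentHistory(4, 20000.0, 2, 400, 87.6, 8760.0)
print(failure_rate(pump), demand_failure_probability(pump), unavailability(pump))
```

Recomputing these ratios each time new failure records are entered is one simple way to obtain the trend statistics the abstract mentions.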
IMPPAT: A curated database of Indian Medicinal Plants, Phytochemistry And Therapeutics.

Science.gov (United States)

Mohanraj, Karthikeyan; Karthikeyan, Bagavathy Shanmugam; Vivek-Ananth, R P; Chand, R P Bharath; Aparna, S R; Mangalapandi, Pattulingam; Samal, Areejit

2018-03-12

Phytochemicals of medicinal plants encompass a diverse chemical space for drug discovery. India is rich with a flora of indigenous medicinal plants that have been used for centuries in traditional Indian medicine to treat human maladies. A comprehensive online database on the phytochemistry of Indian medicinal plants will enable computational approaches towards natural product based drug discovery. In this direction, we present IMPPAT, a manually curated database of 1742 Indian Medicinal Plants, 9596 Phytochemicals, And 1124 Therapeutic uses spanning 27074 plant-phytochemical associations and 11514 plant-therapeutic associations. Notably, the curation effort led to a non-redundant in silico library of 9596 phytochemicals with standard chemical identifiers and structure information. Using cheminformatic approaches, we have computed the physicochemical, ADMET (absorption, distribution, metabolism, excretion, toxicity) and drug-likeliness properties of the IMPPAT phytochemicals. We show that the stereochemical complexity and shape complexity of IMPPAT phytochemicals differ from libraries of commercial compounds or diversity-oriented synthesis compounds while being similar to other libraries of natural products. Within IMPPAT, we have filtered a subset of 960 potential druggable phytochemicals, of which the majority have no significant similarity to existing FDA approved drugs, thus rendering them good candidates for prospective drugs. The IMPPAT database is openly accessible at: https://cb.imsc.res.in/imppat.
Biofuel Database

Science.gov (United States)

Biofuel Database (Web, free access) This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.
FARME DB: a functional antibiotic resistance element database

OpenAIRE

Wallace, James C.; Port, Jesse A.; Smith, Marissa N.; Faustman, Elaine M.

2017-01-01

Antibiotic resistance (AR) is a major global public health threat but few resources exist that catalog AR genes outside of a clinical context. Current AR sequence databases are assembled almost exclusively from genomic sequences derived from clinical bacterial isolates and thus do not include many microbial sequences derived from environmental samples that confer resistance in functional metagenomic studies. These environmental metagenomic sequences often show little or no similarity to AR se...
NIRS database of the original research database

International Nuclear Information System (INIS)

Morita, Kyoko

1991-01-01

Recently, library staff arranged and compiled the original research papers that have been written by researchers in the 33 years since the National Institute of Radiological Sciences (NIRS) was established. This paper describes how the internal database of original research papers has been created. This is a small example of a hand-made database. It has been accumulated by staff who have any knowledge about computers or computer programming. (author)
Teaching Case: Adapting the Access Northwind Database to Support a Database Course

Science.gov (United States)

Dyer, John N.; Rogers, Camille

2015-01-01

A common problem encountered when teaching database courses is that few large illustrative databases exist to support teaching and learning. Most database textbooks have small "toy" databases that are chapter objective specific, and thus do not support application over the complete domain of design, implementation and management concepts…
Community Database

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — This excel spreadsheet is the result of merging at the port level of several of the in-house fisheries databases in combination with other demographic databases such...

Applications of GIS and database technologies to manage a Karst Feature Database

Science.gov (United States)

Gao, Y.; Tipping, R.G.; Alexander, E.C.

2006-01-01

This paper describes the management of a Karst Feature Database (KFD) in Minnesota. Two sets of applications in both GIS and Database Management System (DBMS) have been developed for the KFD of Minnesota. These applications were used to manage and to enhance the usability of the KFD. Structured Query Language (SQL) was used to manipulate transactions of the database and to facilitate the functionality of the user interfaces. The Database Administrator (DBA) authorized users with different access permissions to enhance the security of the database. Database consistency and recovery are accomplished by creating data logs and maintaining backups on a regular basis. The working database provides guidelines and management tools for future studies of karst features in Minnesota. The methodology of designing this DBMS is applicable to develop GIS-based databases to analyze and manage geomorphic and hydrologic datasets at both regional and local scales. The short-term goal of this research is to develop a regional KFD for the Upper Mississippi Valley Karst and the long-term goal is to expand this database to manage and study karst features at national and global scales.
Impact of database quality in knowledge-based treatment planning for prostate cancer.

Science.gov (United States)

Wall, Phillip D H; Carver, Robert L; Fontenot, Jonas D

2018-03-13

This article investigates dose-volume prediction improvements in a common knowledge-based planning (KBP) method using a Pareto plan database compared with using a conventional, clinical plan database. Two plan databases were created using retrospective, anonymized data of 124 volumetric modulated arc therapy (VMAT) prostate cancer patients. The clinical plan database (CPD) contained planning data from each patient's clinically treated VMAT plan, which were manually optimized by various planners. The multicriteria optimization database (MCOD) contained Pareto-optimal plan data from VMAT plans created using a standardized multicriteria optimization protocol. Overlap volume histograms, incorporating fractional organ at risk volumes only within the treatment fields, were computed for each patient and used to match new patient anatomy to similar database patients. For each database patient, CPD and MCOD KBP predictions were generated for D10, D30, D50, D65, and D80 of the bladder and rectum in a leave-one-out manner. Prediction achievability was evaluated through a replanning study on a subset of 31 randomly selected database patients using the best KBP predictions, regardless of plan database origin, as planning goals. MCOD predictions were significantly lower than CPD predictions for all 5 bladder dose-volumes and rectum D50 (P = .004) and D65 (P ... databases affects the performance and achievability of dose-volume predictions from a common knowledge-based planning approach for prostate cancer. Bladder and rectum dose-volume predictions derived from a database of standardized Pareto-optimal plans were compared with those derived from clinical plans manually designed by various planners. Dose-volume predictions from the Pareto plan database were significantly lower overall than those from the clinical plan database, without compromising achievability. Copyright © 2018 Elsevier Inc. All rights reserved.
Using the Pathogen-Host Interactions database (PHI-base) to investigate plant pathogen genomes and genes implicated in virulence

Directory of Open Access Journals (Sweden)

Martin Urban

2015-08-01

New pathogen-host interaction mechanisms can be revealed by integrating mutant phenotype data with genetic information. PHI-base is a multi-species manually curated database combining peer-reviewed published phenotype data from plant and animal pathogens and gene/protein information in a single database.
Open Geoscience Database

Science.gov (United States)

Bashev, A.

2012-04-01

Currently there is an enormous number of geoscience databases. Unfortunately, the only users of the majority of these databases are their developers. There are several reasons for that: incompatibility, specificity of tasks and objects, and so on. However, the main obstacles to wide usage of geoscience databases are complexity for developers and complication for users. The complexity of the architecture leads to high costs that block public access. The complication prevents users from understanding when and how to use the database. Only databases associated with GoogleMaps don't have these drawbacks, but they could hardly be called "geoscience" databases. Nevertheless, an open and simple geoscience database is necessary, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface to work with it, and it is now accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) at a station with a certain position, associated with metadata: the date when the result was obtained; the type of station (lake, soil etc.); the contributor that sent the result. Each contributor has its own profile, which allows the reliability of the data to be estimated. The results can be represented on a GoogleMaps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales, and each registered user can create their own scale. The results can also be extracted to a *.csv file. For both types of representation one can select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Latitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data
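The upload format listed in the abstract above can be read with a few lines of code. The sketch below assumes that the semicolons in the field list are the actual column delimiter and that there is no header row; neither is confirmed by the abstract. The field names in FIELDS, the function parse_upload and the sample record are hypothetical.

```python
import csv
import io
from datetime import date

# Assumed column order, taken from the field list in the abstract.
FIELDS = ["station", "latitude", "longitude", "station_type",
          "parameter", "value", "date"]

def parse_upload(text):
    """Parse semicolon-delimited rows into typed records."""
    records = []
    for raw in csv.reader(io.StringIO(text), delimiter=";"):
        if not raw:
            continue  # skip blank lines
        rec = dict(zip(FIELDS, (cell.strip() for cell in raw)))
        rec["latitude"] = float(rec["latitude"])    # dd.dddddd
        rec["longitude"] = float(rec["longitude"])  # ddd.dddddd
        rec["value"] = float(rec["value"])
        rec["date"] = date.fromisoformat(rec["date"])  # yyyy-mm-dd
        records.append(rec)
    return records

sample = "Lake A; 59.938630; 030.314130; lake; pH; 7.2; 2012-03-15\n"
records = parse_upload(sample)
print(records[0]["station"], records[0]["latitude"], records[0]["date"])
```

A row with missing columns would raise an error here, which is a reasonable default for a sketch; a real importer would presumably report the offending line to the contributor.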
Standard Ship Test and Inspection Plan, Procedures and Database

National Research Council Canada - National Science Library

1999-01-01

... construction schedules and increased cost is the area of test and inspection. This project investigates existing rules and regulations for testing and inspection of commercial ships and identifies differences and similarities within the requirements. The results include comparison matrices, a standard test plan, a set of standard test procedures, and a sample test database developed for a typical commercial ship.
Development of an online database of typical food portion sizes in Irish population groups.

Science.gov (United States)

Lyons, Jacqueline; Walton, Janette; Flynn, Albert

2013-01-01

The Irish Food Portion Sizes Database (available at www.iuna.net) describes typical portion weights for an extensive range of foods and beverages for Irish children, adolescents and adults. The present paper describes the methodologies used to develop the database and some key characteristics of the portion weight data contained therein. The data are derived from three large, cross-sectional food consumption surveys carried out in Ireland over the last decade: the National Children's Food Survey (2003-2004), National Teens' Food Survey (2005-2006) and National Adult Nutrition Survey (2008-2010). Median, 25th and 75th percentile portion weights are described for a total of 545 items across the three survey groups, split by age group or sex as appropriate. The typical (median) portion weights reported for adolescents and adults are similar for many foods, while those reported for children are notably smaller. Adolescent and adult males generally consume larger portions than their female counterparts, though similar portion weights may be consumed where foods are packaged in unit amounts (for example, pots of yoghurt). The inclusion of energy under-reporters makes little difference to the estimation of typical portion weights in adults. The data have wide-ranging applications in dietary assessment and food labelling, and will serve as a useful reference against which to compare future portion size data from the Irish population. The present paper provides a useful context for researchers and others wishing to use the Irish Food Portion Sizes Database, and may guide researchers in other countries in establishing similar databases of their own.
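The summary statistics reported for each food item (median, 25th and 75th percentile portion weights) can be computed as in the following minimal sketch. The portion weights are invented, the function name portion_summary is hypothetical, and the "inclusive" interpolation method is an assumption; the abstract does not state how the percentiles were calculated.

```python
import statistics

def portion_summary(weights_g):
    """Return the 25th percentile, median and 75th percentile of portion weights."""
    q1, median, q3 = statistics.quantiles(weights_g, n=4, method="inclusive")
    return {"p25": q1, "median": median, "p75": q3}

# Hypothetical portion weights in grams for one food item.
summary = portion_summary([10, 20, 30, 40, 50])
print(summary)
```

Splitting the input by age group or sex before calling the function would give the per-group tables the abstract describes.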
Introduction to database systems

NARCIS (Netherlands)

Pels, H.J.; Lans, van der R.F.; Pels, H.J.; Meersman, R.A.

1993-01-01

This article introduces the main concepts that play a role in relation to databases and gives an overview of the objectives, the functions and the components of database systems. Although the function of a database is intuitively fairly clear, it is nevertheless a technologically complex
MultitaskProtDB: a database of multitasking proteins.

Science.gov (United States)

Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, JosepAntoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique

2014-01-01

We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking or moonlighting is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are experimentally revealed by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there is no database of moonlighting proteins. The existence of such a tool would facilitate the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a user-friendly web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins such as frequencies of the different pairs of functions, phylogenetic conservation and so forth.
Integration of Biodiversity Databases in Taiwan and Linkage to Global Databases

Directory of Open Access Journals (Sweden)

Kwang-Tsao Shao

2007-03-01

The biodiversity databases in Taiwan were dispersed among various institutions and colleges, each with a limited amount of data, by 2001. The Natural Resources and Ecology GIS Database sponsored by the Council of Agriculture, which is part of the National Geographic Information System planned by the Ministry of Interior, was the most well-established biodiversity database in Taiwan. This database, however, mainly collected the distribution data of terrestrial animals and plants within the Taiwan area. In 2001, GBIF was formed, and Taiwan joined as an Associate Participant, starting the establishment and integration of animal and plant species databases; therefore, TaiBIF was able to co-operate with GBIF. The information of Catalog of Life, specimens, and alien species was integrated using the Darwin Core standard. These metadata standards allowed the biodiversity information of Taiwan to connect with global databases.
Database Replication

CERN Document Server

Kemme, Bettina

2010-01-01

Database replication is widely used for fault-tolerance, scalability and performance. The failure of one database replica does not stop the system from working as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed, if data copies are located close to clients. Despite its advantages, replication is not a straightforward technique to apply, and
GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

Directory of Open Access Journals (Sweden)

Ficklin Stephen

2004-09-01

Abstract Background Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.
GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research.

Science.gov (United States)

Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

2004-09-09

Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.
Update History of This Database - PLACE | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

PLACE: Update History of This Database. 2016/08/22: The contact address is ... changed. 2014/10/20: The URLs of the database maintenance site and the portal site are changed. 2014/07/17: PLACE English archi
Database Publication Practices

DEFF Research Database (Denmark)

Bernstein, P.A.; DeWitt, D.; Heuer, A.

2005-01-01

There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems.
Super Natural II--a database of natural products.

Science.gov (United States)

Banerjee, Priyanka; Erehman, Jevgeni; Gohlke, Björn-Oliver; Wilhelm, Thomas; Preissner, Robert; Dunkel, Mathias

2015-01-01

Natural products play a significant role in drug discovery and development. Many topological pharmacophore patterns are common between natural products and commercial drugs. A better understanding of the specific physicochemical and structural features of natural products is important for corresponding drug development. Several encyclopedias of natural compounds have been composed, but the information remains scattered or not freely available. The first version of the Supernatural database containing ∼ 50,000 compounds was published in 2006 to face these challenges. Here we present a new, updated and expanded version of natural product database, Super Natural II (http://bioinformatics.charite.de/supernatural), comprising ∼ 326,000 molecules. It provides all corresponding 2D structures, the most important structural and physicochemical properties, the predicted toxicity class for ∼ 170,000 compounds and the vendor information for the vast majority of compounds. The new version allows a template-based search for similar compounds as well as a search for compound names, vendors, specific physical properties or any substructures. Super Natural II also provides information about the pathways associated with synthesis and degradation of the natural products, as well as their mechanism of action with respect to structurally similar drugs and their target proteins. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Developing an A Priori Database for Passive Microwave Snow Water Retrievals Over Ocean

Science.gov (United States)

Yin, Mengtao; Liu, Guosheng

2017-12-01

A physically optimized a priori database is developed for Global Precipitation Measurement Microwave Imager (GMI) snow water retrievals over ocean. The initial snow water content profiles are derived from CloudSat Cloud Profiling Radar (CPR) measurements. A radiative transfer model, in which the single-scattering properties of nonspherical snowflakes are based on discrete dipole approximation results, is employed to simulate brightness temperatures and their gradients. Snow water content profiles are then optimized through a one-dimensional variational (1D-Var) method. After the 1D-Var optimization, the standard deviations of the difference between observed and simulated brightness temperatures are of similar magnitude to the observation errors defined for the observation error covariance matrix, indicating that this variational method is successful. This optimized database is applied in a Bayesian snow water retrieval algorithm. The retrieval results indicate that the 1D-Var approach has a positive impact on the GMI-retrieved snow water content profiles by improving the physical consistency between snow water content profiles and observed brightness temperatures. The global distribution of snow water contents retrieved from the a priori database is compared with CloudSat CPR estimates. Results show that the two estimates have a similar pattern of global distribution, and the difference of their global means is small. In addition, we investigate the impact of using physical parameters to subset the database on snow water retrievals. It is shown that using total precipitable water to subset the database with 1D-Var optimization is beneficial for snow water retrievals.
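The Bayesian retrieval step described in this abstract can be illustrated with a minimal sketch. It assumes Gaussian observation errors and a diagonal error covariance, and it weights each database profile by how well its simulated brightness temperatures match the observation. The function name, the toy database and all numbers are hypothetical and are not taken from the paper.

```python
import math

def bayesian_retrieval(observed_tb, database, obs_error_std):
    """Likelihood-weighted mean of the database profiles.

    observed_tb   : observed brightness temperatures (K), one per channel
    database      : list of (profile, simulated_tb) pairs (hypothetical entries)
    obs_error_std : per-channel observation error standard deviations (K);
                    the diagonal error covariance is an assumption of this sketch
    """
    weights = []
    for _profile, simulated_tb in database:
        # Squared observation-minus-simulation distance, scaled by the error.
        cost = sum(((o - s) / e) ** 2
                   for o, s, e in zip(observed_tb, simulated_tb, obs_error_std))
        weights.append(math.exp(-0.5 * cost))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no database entry is close to the observation")
    n_levels = len(database[0][0])
    return [
        sum(w * profile[k] for w, (profile, _tb) in zip(weights, database)) / total
        for k in range(n_levels)
    ]

# Toy database: two snow water content profiles with simulated temperatures.
toy_database = [
    ([0.10, 0.05], [250.0, 240.0]),
    ([0.30, 0.20], [230.0, 215.0]),
]
retrieved = bayesian_retrieval([250.0, 240.0], toy_database, [2.0, 2.0])
```

Because the observation coincides with the first entry's simulated temperatures, that entry dominates the weighted mean. In a real retrieval many entries would contribute.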
The MAR databases: development and implementation of databases specific for marine metagenomics.

Science.gov (United States)

Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

2018-01-04

We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, and thus represents a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Update History of This Database - DMPD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available DMPD Update History of This Database. Date and update contents: 2010/03/29 DMPD English archive si....jp/macrophage/ ) is released.
DMPD: Are the IKKs and IKK-related kinases TBK1 and IKK-epsilon similarly activated? [Dynamic Macrophage Pathway CSML Database

Lifescience Database Archive (English)

Full Text Available PubmedID: 18353649. Title: Are the IKKs and IKK-related kinases TBK1 and IKK-epsilon similarly activated? Authors: Chau
Role of Database Management Systems in Selected Engineering Institutions of Andhra Pradesh: An Analytical Survey

Directory of Open Access Journals (Sweden)

Kutty Kumar

2016-06-01

Full Text Available This paper aims to analyze the function of database management systems from the perspective of librarians working in engineering institutions in Andhra Pradesh. Ninety-eight librarians from one hundred thirty engineering institutions participated in the study. The paper reveals that training by computer suppliers and software packages are the most significant modes by which librarians acquire DBMS skills; three-fourths of the librarians are postgraduate degree holders. Most colleges use database applications for automation purposes and content value. Electrical problems and untrained staff seem to be the major constraints respondents face in managing library databases.

Generating "fragment-based virtual library" using pocket similarity search of ligand-receptor complexes.

Science.gov (United States)

Khashan, Raed S

2015-01-01

As the number of available ligand-receptor complexes is increasing, researchers are becoming more dedicated to mining these complexes to aid in the drug design and development process. We present free software developed as a tool for performing similarity searches across ligand-receptor complexes to identify binding pockets that are similar to that of a target receptor. The search is based on 3D-geometric and chemical similarity of the atoms forming the binding pocket. For each match identified, the ligand's fragment(s) corresponding to that binding pocket are extracted, thus forming a virtual library of fragments (FragVLib) that is useful for structure-based drug design. The program provides a very useful tool to explore available databases.
Development of a personalized training system using the Lung Image Database Consortium and Image Database resource Initiative Database.

Science.gov (United States)

Lin, Hongli; Wang, Weisheng; Luo, Jiawei; Yang, Xuedong

2014-12-01

The aim of this study was to develop a personalized training system using the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) Database, because collecting, annotating, and marking a large number of appropriate computed tomography (CT) scans, and providing the capability of dynamically selecting suitable training cases based on the performance levels of trainees and the characteristics of cases, are critical for developing an efficient training system. A novel approach is proposed to develop a personalized radiology training system for the interpretation of lung nodules in CT scans using the LIDC/IDRI database. It provides a Content-Boosted Collaborative Filtering (CBCF) algorithm for predicting the difficulty level of each case for each trainee when selecting suitable cases to meet individual needs, and a diagnostic simulation tool that enables trainees to analyze and diagnose lung nodules with the help of an image processing tool and a nodule retrieval tool. Preliminary evaluation of the system shows that developing a personalized training system for interpretation of lung nodules is needed and useful to enhance the professional skills of trainees. The approach of developing personalized training systems using the LIDC/IDRI database is a feasible solution to the challenges of constructing specific training programs in terms of cost and training efficiency. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
Update History of This Database - KOME | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available KOME Update History of This Database. Date and update contents: 2014/10/22 The URL of the whole da...site is opened. 2003/07/18 KOME ( http://cdna01.dna.affrc.go.jp/cDNA/ ) is opened.
Update History of This Database - PSCDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available PSCDB Update History of This Database. Date and update contents: 2016/11/30 PSCDB English archive ...site is opened. 2011/11/13 PSCDB ( http://idp1.force.cs.is.nagoya-u.ac.jp/pscdb/ ) is opened.
Mycobacteriophage genome database.

Science.gov (United States)

Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

2011-01-01

Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
The Astrobiology Habitable Environments Database (AHED)

Science.gov (United States)

Lafuente, B.; Stone, N.; Downs, R. T.; Blake, D. F.; Bristow, T.; Fonda, M.; Pires, A.

2015-12-01

The Astrobiology Habitable Environments Database (AHED) is a central, high quality, long-term searchable repository for archiving and collaborative sharing of astrobiologically relevant data, including morphological, textural and contextual images, and chemical, biochemical, isotopic, sequencing, and mineralogical information. The aim of AHED is to foster long-term innovative research by supporting integration and analysis of diverse datasets in order to: 1) help understand and interpret planetary geology; 2) identify and characterize habitable environments and pre-biotic/biotic processes; 3) interpret returned data from present and past missions; 4) provide a citable database of NASA-funded published and unpublished data (after an agreed-upon embargo period). AHED uses the online open-source software "The Open Data Repository's Data Publisher" (ODR - http://www.opendatarepository.org) [1], which provides a user-friendly interface that research teams or individual scientists can use to design, populate and manage their own database according to the characteristics of their data and the need to share data with collaborators or the broader scientific community. This platform can also be used as a laboratory notebook. The database will have the capability to import and export in a variety of standard formats. Advanced graphics will be implemented including 3D graphing, multi-axis graphs, error bars, and similar scientific data functions together with advanced online tools for data analysis (e.g. the statistical package R). A permissions system will be put in place so that as data are being actively collected and interpreted, they will remain proprietary. A citation system will allow research data to be used and appropriately referenced by other researchers after the data are made public. This project is supported by the Science-Enabling Research Activity (SERA) and NASA NNX11AP82A, Mars Science Laboratory Investigations. [1] Nate et al. (2015) AGU, submitted.
Development of a California commercial building benchmarking database

International Nuclear Information System (INIS)

Kinney, Satkartar; Piette, Mary Ann

2002-01-01

Building energy benchmarking is a useful starting point for commercial building owners and operators to target energy savings opportunities. There are a number of tools and methods for benchmarking energy use. Benchmarking based on regional data can provide more relevant information for California buildings than national tools such as Energy Star. This paper discusses issues related to benchmarking commercial building energy use and the development of Cal-Arch, a building energy benchmarking database for California. Currently Cal-Arch uses existing survey data from California's Commercial End Use Survey (CEUS), a largely underutilized wealth of information collected by California's major utilities. DOE's Commercial Building Energy Consumption Survey (CBECS) is used by a similar tool, Arch, and by a number of other benchmarking tools. Future versions of Arch/Cal-Arch will utilize additional data sources, including modeled data and individual buildings, to expand the database.
Logical database design principles

CERN Document Server

Garmany, John; Clark, Terry

2005-01-01

INTRODUCTION TO LOGICAL DATABASE DESIGN Understanding a Database Database Architectures Relational Databases Creating the Database System Development Life Cycle (SDLC) Systems Planning: Assessment and Feasibility System Analysis: Requirements System Analysis: Requirements Checklist Models Tracking and Schedules Design Modeling Functional Decomposition Diagram Data Flow Diagrams Data Dictionary Logical Structures and Decision Trees System Design: Logical SYSTEM DESIGN AND IMPLEMENTATION The ER Approach Entities and Entity Types Attribute Domains Attributes Set-Valued Attributes Weak Entities Constraint
Analysing and Rationalising Molecular and Materials Databases Using Machine-Learning

Science.gov (United States)

De, Sandip; Ceriotti, Michele

Computational materials design promises to greatly accelerate the process of discovering new or more performant materials. Several collaborative efforts are contributing to this goal by building databases of structures, containing between thousands and millions of distinct hypothetical compounds, whose properties are computed by high-throughput electronic-structure calculations. The complexity and sheer amount of information has made manual exploration, interpretation and maintenance of these databases a formidable challenge, making it necessary to resort to automatic analysis tools. Here we will demonstrate how, starting from a measure of (dis)similarity between database items built from a combination of local environment descriptors, it is possible to apply hierarchical clustering algorithms, as well as dimensionality reduction methods such as sketchmap, to analyse, classify and interpret trends in molecular and materials databases, as well as to detect inconsistencies and errors. Thanks to the agnostic and flexible nature of the underlying metric, we will show how our framework can be applied transparently to different kinds of systems ranging from organic molecules and oligopeptides to inorganic crystal structures as well as molecular crystals. Funded by National Center for Computational Design and Discovery of Novel Materials (MARVEL) and Swiss National Science Foundation.
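Hierarchical clustering driven only by a pairwise (dis)similarity measure, as mentioned in this abstract, can be sketched in a few lines. The example below uses single linkage on a precomputed dissimilarity matrix. The linkage choice, the function name and the toy matrix are assumptions made for illustration and do not reproduce the authors' descriptors or framework.

```python
def single_linkage(dissimilarity, n_clusters):
    """Agglomerative clustering on a precomputed dissimilarity matrix.

    dissimilarity : symmetric list of lists of pairwise dissimilarities
    n_clusters    : number of clusters at which merging stops
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dissimilarity))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dissimilarity[i][j]
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)

# Toy matrix: items 0 and 1 are alike, items 2 and 3 are alike.
toy = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
groups = single_linkage(toy, 2)
```

Since the routine only ever reads the matrix, the same code applies to any kind of database item for which a dissimilarity can be computed, which is the sense in which such a metric is agnostic.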
Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

Directory of Open Access Journals (Sweden)

Ahmad Tamimi

Full Text Available Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest in improving the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for batch query searching, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
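The idea of reducing the search space through cluster representatives can be sketched as a two-stage search. The example below is only an illustration under strong assumptions: profiles are stood in for by small feature sets, the score is a Jaccard index instead of a profile-HMM alignment score, and all names and data are hypothetical.

```python
def similarity(a, b):
    """Toy score: Jaccard index of two feature sets (stand-in for an HMM score)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def clustered_search(query, clusters, top_clusters=1):
    """Score the query against representatives first, then only against
    the members of the best-scoring clusters.

    clusters : list of dicts with keys "representative" and "members";
               every profile is a (name, features) pair
    Returns (best_match_name, number_of_comparisons).
    """
    comparisons = 0
    ranked = []
    for cluster in clusters:
        _name, features = cluster["representative"]
        ranked.append((similarity(query, features), cluster))
        comparisons += 1
    ranked.sort(key=lambda pair: pair[0], reverse=True)

    best_name, best_score = None, -1.0
    for _score, cluster in ranked[:top_clusters]:
        for name, features in cluster["members"]:
            score = similarity(query, features)
            comparisons += 1
            if score > best_score:
                best_name, best_score = name, score
    return best_name, comparisons

toy_clusters = [
    {"representative": ("A1", "abcd"),
     "members": [("A1", "abcd"), ("A2", "abce")]},
    {"representative": ("B1", "wxyz"),
     "members": [("B1", "wxyz"), ("B2", "wxyq"), ("B3", "vxyz")]},
]
match, n_comparisons = clustered_search("abce", toy_clusters)
```

In this toy case the two-stage search needs four comparisons where a sequential scan of all five profiles would need five. Raising `top_clusters` trades some of that saving for sensitivity, which loosely mirrors the role of the overlapping step described above.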
Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches.

Science.gov (United States)

Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd

2017-07-07

Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows easy access to splice variant-specific peptide sequences that match MS data. Furthermore, we combined this database without alternative splice variant-1-specific peptides with human Swiss-Prot. This combined database can be used as a general database for searching LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
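Building a set of variant-specific tryptic peptides can be illustrated with a small in-silico digest. The sketch below applies the usual trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and keeps the peptides that occur in only one isoform. The sequences and names are invented for the example, and the authors' actual pipeline may differ.

```python
def tryptic_digest(sequence, min_length=1):
    """Cleave after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]

def unique_peptides(isoforms):
    """Map each isoform name to the peptides found in no other isoform."""
    digests = {name: set(tryptic_digest(seq)) for name, seq in isoforms.items()}
    unique = {}
    for name, peptides in digests.items():
        others = set().union(*(p for n, p in digests.items() if n != name))
        unique[name] = peptides - others
    return unique

# Hypothetical isoform sequences sharing their first two tryptic peptides.
toy_isoforms = {
    "isoform-1": "MAKTLLRGGESPK",
    "isoform-2": "MAKTLLRAVDPK",
}
specific = unique_peptides(toy_isoforms)
```

Here the shared peptides MAK and TLLR are discarded and only the C-terminal peptide of each isoform remains as variant-specific.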
Specialist Bibliographic Databases.

Science.gov (United States)

Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

2016-05-01

Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485
Directory of IAEA databases

International Nuclear Information System (INIS)

1992-12-01

This second edition of the Directory of IAEA Databases has been prepared within the Division of Scientific and Technical Information (NESI). Its main objective is to describe the computerized information sources available to staff members. This directory contains all databases produced at the IAEA, including databases stored on the mainframe, LANs and PCs. All IAEA Division Directors have been requested to register the existence of their databases with NESI. For the second edition, database owners were requested to review the existing entries for their databases and answer four additional questions. The four additional questions concerned the type of database (e.g. Bibliographic, Text, Statistical etc.), the category of database (e.g. Administrative, Nuclear Data etc.), the available documentation and the type of media used for distribution. In the individual entries on the following pages the answers to the first two questions (type and category) are always listed, but the answers to the second two questions (documentation and media) are only listed when information has been made available.
Understanding, modeling, and improving main-memory database performance

OpenAIRE

Manegold, S.

2002-01-01

During the last two decades, computer hardware has experienced remarkable developments. Especially CPU (clock-)speed has been following Moore's Law, i.e., doubling every 18 months; and there is no indication that this trend will change in the foreseeable future. Recent research has revealed that database performance, even with main-memory based systems, can hardly benefit from the ever increasing CPU power. The reason for this is that the performance of other hardware components h...
Updates on drug-target network; facilitating polypharmacology and data integration by growth of DrugBank database.

Science.gov (United States)

Barneh, Farnaz; Jafari, Mohieddin; Mirzaie, Mehdi

2016-11-01

Network pharmacology elucidates the relationship between drugs and targets. As the number of identified targets for each drug increases, the corresponding drug-target network (DTN) evolves from solely a reflection of the pharmaceutical industry trend to a portrait of polypharmacology. The aim of this study was to evaluate the potential of the DrugBank database in advancing systems pharmacology. We constructed and analyzed the DTN from drug and target associations in the DrugBank 4.0 database. Our results showed that in the bipartite DTN, an increased ratio of identified targets for drugs augmented the density and connectivity of drugs and targets and decreased modular structure. To clear up the details in the network structure, the DTNs were projected into two networks, namely the drug similarity network (DSN) and the target similarity network (TSN). In the DSN, various classes of Food and Drug Administration-approved drugs with distinct therapeutic categories were linked together based on shared targets. The projected TSN also showed complexity because of the promiscuity of the drugs. By including investigational drugs that are currently being tested in clinical trials, the networks manifested more connectivity and pictured the upcoming pharmacological space in the future years. Diverse biological processes and protein-protein interactions were manipulated by new drugs, which can extend possible target combinations. We conclude that network-based organization of DrugBank 4.0 data not only reveals the potential for repurposing of existing drugs, but also allows generating novel predictions about drug off-targets, drug-drug interactions and their side effects. Our results also encourage further effort for high-throughput identification of targets to build networks that can be integrated into disease networks. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
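Projecting a bipartite drug-target network onto a drug similarity network and a target similarity network can be shown with a short sketch. Two drugs are linked when they share at least one target, and two targets are linked when at least one drug binds both; the edge weight counts the shared neighbours. The drug and target identifiers are placeholders, and the weighting scheme is an assumption of this example.

```python
from itertools import combinations

def project(associations):
    """Project a bipartite drug-target network onto both node types.

    associations : dict mapping each drug to the set of its targets
    Returns (drug_edges, target_edges); each maps a sorted node pair to the
    number of neighbours the two nodes share in the bipartite network.
    """
    targets_to_drugs = {}
    for drug, targets in associations.items():
        for target in targets:
            targets_to_drugs.setdefault(target, set()).add(drug)

    drug_edges = {}
    for drugs in targets_to_drugs.values():
        for pair in combinations(sorted(drugs), 2):
            drug_edges[pair] = drug_edges.get(pair, 0) + 1

    target_edges = {}
    for targets in associations.values():
        for pair in combinations(sorted(targets), 2):
            target_edges[pair] = target_edges.get(pair, 0) + 1
    return drug_edges, target_edges

# Placeholder associations, not taken from DrugBank.
toy_associations = {
    "drug_a": {"T1", "T2"},
    "drug_b": {"T2", "T3"},
    "drug_c": {"T4"},
}
dsn, tsn = project(toy_associations)
```

In the toy data only drug_a and drug_b share a target, so the drug projection has a single edge, while drug_c stays isolated in both projections.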
Update History of This Database - SAHG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available SAHG Update History of This Database. Date and update contents: 2016/05/09 SAHG English archive si...te is opened. 2009/10 SAHG ( http://bird.cbrc.jp/sahg ) is opened.
Update History of This Database - RMOS | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available RMOS Update History of This Database. Date and update contents: 2015/10/27 RMOS English archive si...12 RMOS (http://cdna01.dna.affrc.go.jp/RMOS/) is opened.
Comparative analysis of cloud cover databases for CORDEX-AFRICA

Science.gov (United States)

Enríquez, A.; Taima-Hernández, D.; González, A.; Pérez, J. C.; Díaz, J. P.; Expósito, F. J.

2012-04-01

The main objective of the CORDEX program (COordinated Regional climate Downscaling Experiment) [1] is the production of regional climate change scenarios at a global scale, creating a contribution to the IPCC (Intergovernmental Panel on Climate Change) AR5 (5th Assessment Report). Within this project, Africa is the key region due to the current lack of data. In this study, the cloud cover information obtained from five well-known databases (ERA-40, ERA-Interim, ISCCP, NCEP and CRU) over the CORDEX-AFRICA domain is analyzed for the period 1984-2000, in order to determine the similarity between them. To analyze the accuracy and consistency of the climate databases, statistical techniques were applied: the correlation coefficient (r), root mean square (RMS) differences and a defined skill score (SS) based on the difference between areas of the probability density functions (PDFs) associated with the study parameters [2]. In this way, we determine which databases are well related in different regions and which are not, establishing an appropriate framework that could be used to validate the AR5 models in historical simulations.
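The three comparison statistics named in this abstract can be sketched for two series of cloud fraction values. The correlation coefficient and RMS difference follow their standard definitions. For the skill score, the sketch uses one common PDF-based form, the area shared by the two binned PDFs; whether the study used exactly this form is an assumption, as are the bin settings and the sample values.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rms_difference(x, y):
    """Root mean square difference between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def pdf_overlap_score(x, y, n_bins=10, lower=0.0, upper=1.0):
    """Common area of two binned PDFs: 1 if identical, 0 if disjoint."""
    width = (upper - lower) / n_bins

    def histogram(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so that out-of-range values fall in the end bins.
            index = max(0, min(int((v - lower) / width), n_bins - 1))
            counts[index] += 1
        return [c / len(values) for c in counts]

    return sum(min(p, q) for p, q in zip(histogram(x), histogram(y)))

# Hypothetical cloud fraction samples from two databases at the same points.
database_a = [0.22, 0.41, 0.63, 0.87]
database_b = [0.27, 0.46, 0.68, 0.82]
r = correlation(database_a, database_b)
rms = rms_difference(database_a, database_b)
score = pdf_overlap_score(database_a, database_b)
```

The three measures respond to different things: the overlap score ignores pairing and compares only the distributions, whereas r and the RMS difference compare the series point by point.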
Surrogate for oropharyngeal cancer HPV status in cancer database studies.

Science.gov (United States)

Megwalu, Uchechukwu C; Chen, Michelle M; Ma, Yifei; Divi, Vasu

2017-12-01

The utility of cancer databases for oropharyngeal cancer studies is limited by lack of information on human papillomavirus (HPV) status. The purpose of this study was to develop a surrogate that can be used to adjust for the effect of HPV status on survival. The study cohort included 6419 patients diagnosed with oropharyngeal squamous cell carcinoma between 2004 and 2012, identified in the National Cancer Database (NCDB). The HPV surrogate score was developed using a logistic regression model predicting HPV-positive status. The HPV surrogate score was predictive of HPV status (area under the curve [AUC] 0.73; accuracy of 70.4%). Similar to HPV-positive tumors, HPV surrogate positive tumors were associated with improved overall survival (OS; hazard ratio [HR] 0.73; 95% confidence interval [CI] 0.59-0.91; P = .005), after adjusting for important covariates. The HPV surrogate score is useful for adjusting for the effect of HPV status on survival in studies utilizing cancer databases. © 2017 Wiley Periodicals, Inc.

The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database

Directory of Open Access Journals (Sweden)

Okba Selama

2013-01-01

Full Text Available Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly the taxonomy and the geographical area of origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
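Programmatic access to the NCBI Nucleotide database through E-utilities, as mentioned in this abstract, typically starts with an ESearch request. The sketch below only builds such a request URL and makes no network call. The helper name and the query term are hypothetical, and the study's actual queries are not given in the abstract.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(term, database="nuccore", retmax=20):
    """Build an ESearch URL for the given query term.

    database : Entrez database name ("nuccore" is the Nucleotide database)
    retmax   : maximum number of record IDs to return
    """
    params = {"db": database, "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)

# Hypothetical query: all Nucleotide records assigned to the Proteobacteria.
url = build_esearch_url("Proteobacteria[Organism]")
```

Fetching the URL, for example with `urllib.request.urlopen`, would return a JSON document containing a hit count and a list of record IDs. NCBI asks clients to limit their request rate, so a real harvesting script would add pauses between calls.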
An Interoperable Cartographic Database

OpenAIRE

Slobodanka Ključanin; Zdravko Galić

2007-01-01

The concept of producing a prototype of interoperable cartographic database is explored in this paper, including the possibilities of integration of different geospatial data into the database management system and their visualization on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relation database, spatial analysis, definition and visualization of the database content in the form of a map on t...
Software listing: CHEMTOX database

International Nuclear Information System (INIS)

Moskowitz, P.D.

1993-01-01

Initially launched in 1983, the CHEMTOX Database was among the first microcomputer databases containing hazardous chemical information. The database is used in many industries and government agencies in more than 17 countries. Updated quarterly, the CHEMTOX Database provides detailed environmental and safety information on 7500-plus hazardous substances covered by dozens of regulatory and advisory sources. This brief listing describes the method of accessing data and provides ordering information for those wishing to obtain the CHEMTOX Database
Increased aggression during human group contests when competitive ability is more similar

NARCIS (Netherlands)

Stulp, Gert; Kordsmeyer, Tobias; Buunk, Abraham P.; Verhulst, Simon

2012-01-01

Theoretical analyses and empirical studies have revealed that conflict escalation is more likely when individuals are more similar in resource-holding potential (RHP). Conflicts can also occur between groups, but it is unknown whether conflicts also escalate more when groups are more similar in RHP.
Update History of This Database - SSBD | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available Update History of This Database (SSBD): 2016/07/25 SSBD English archive si... ...te is opened. 2013/09/03 SSBD ( http://ssbd.qbic.riken.jp/ ) is opened.
Investigation of romanization of Japanese personal author's names in English databases

International Nuclear Information System (INIS)

Izawa, Michiyo; Kajiro, Tadashi; Narui, Shigeko

1984-01-01

This investigation was made on the INIS database produced in 1981 and the original papers concerned. The analysis revealed a significant difference in how names were described between inputs from the INIS center for Japan and inputs from other INIS national centers. The former center spelled out given names in 92% of items. However, 99.9% of the items from the latter centers had only one initial of the given name, though 45% of the items had fully spelled given names in the original papers. This investigation was supplemented by a check of samples of Japanese names in other databases, i.e., CA Search, NTIS, COMPENDEX and INSPEC. In conclusion, all authors, editors of primary documents and producers of secondary information databases in English are asked to spell out Japanese personal authors' names in Roman characters so that the names can be identified reliably. (author)
An Interoperable Cartographic Database

Directory of Open Access Journals (Sweden)

Slobodanka Ključanin

2007-05-01

Full Text Available The concept of producing a prototype of an interoperable cartographic database is explored in this paper, including the possibilities of integration of different geospatial data into the database management system and their visualization on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relational database, spatial analysis, definition and visualization of the database content in the form of a map on the Internet.
Data integration and knowledge discovery in biomedical databases. Reliable information from unreliable sources

Directory of Open Access Journals (Sweden)

A Mitnitski

2003-01-01

Full Text Available To better understand information about human health from databases we analyzed three datasets collected for different purposes in Canada: a biomedical database of older adults, a large population survey across all adult ages, and vital statistics. Redundancy in the variables was established, and this led us to derive a generalized (macroscopic) state variable, being a fitness/frailty index that reflects both individual and group health status. Evaluation of the relationship between fitness/frailty and the mortality rate revealed that the latter could be expressed in terms of variables generally available from any cross-sectional database. In practical terms, this means that the risk of mortality might readily be assessed from standard biomedical appraisals collected for other purposes.
Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

LENUS (Irish Health Repository)

OhEigeartaigh, Sean S

2011-07-26

Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external
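The retention rule described in this abstract (keep very weak BLAST hits only when their genomic location matches the query's) can be sketched as below. This is not the SearchDOGS implementation: the `Hit` fields, the segment identifiers and both E-value thresholds are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query_gene: str
    segment_id: str  # hypothetical ID of the genomic segment containing the hit
    evalue: float    # BLAST E-value of the hit


def keep_hit(hit: Hit, orthologous_segments: set,
             strict: float = 1e-5, relaxed: float = 10.0) -> bool:
    """Keep strong hits anywhere; keep weak hits only in syntenic segments."""
    if hit.evalue <= strict:
        return True  # convincing on sequence similarity alone
    # Borderline hit: require that it lies in a segment already known to be
    # orthologous to the query's own segment.
    return hit.evalue <= relaxed and hit.segment_id in orthologous_segments


syntenic = {"segA"}
hits = [
    Hit("q1", "segX", 1e-20),  # strong, not syntenic
    Hit("q1", "segA", 0.5),    # weak, syntenic
    Hit("q1", "segX", 0.5),    # weak, not syntenic
]
kept = [h for h in hits if keep_hit(h, syntenic)]
```

Here the first two hits are retained and the third is discarded, which mirrors the abstract's point that syntenic context can rescue a borderline match.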
Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

Directory of Open Access Journals (Sweden)

B. Bennetts

2014-09-01

Full Text Available Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA), in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP), is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.
Self-similar drop-size distributions produced by breakup in chaotic flows

International Nuclear Information System (INIS)

Muzzio, F.J.; Tjahjadi, M.; Ottino, J.M. (Department of Chemical Engineering, University of Massachusetts, Amherst, Massachusetts 01003; Department of Chemical Engineering, Northwestern University, Evanston, Illinois 60208)

1991-01-01

Deformation and breakup of immiscible fluids in deterministic chaotic flows is governed by self-similar distributions of stretching histories and stretching rates and produces populations of droplets of widely distributed sizes. Scaling reveals that distributions of drop sizes collapse into two self-similar families; each family exhibits a different shape, presumably due to changes in the breakup mechanism
A method for rapid similarity analysis of RNA secondary structures

Directory of Open Access Journals (Sweden)

Liu Na

2006-11-01

Full Text Available Abstract Background Owing to the rapid expansion of RNA structure databases in recent years, efficient methods for structure comparison are in demand for function prediction and evolutionary analysis. Usually, the similarity of RNA secondary structures is evaluated based on tree models and dynamic programming algorithms. We present here a new method for the similarity analysis of RNA secondary structures. Results Three sets of real data have been used as input for the example applications. Set I includes the structures from 5S rRNAs. Set II includes the secondary structures from RNase P and RNase MRP. Set III includes the structures from 16S rRNAs. Reasonable phylogenetic trees are derived for these three sets of data by using our method. Moreover, our program runs faster as compared to some existing ones. Conclusion The famous Lempel-Ziv algorithm can efficiently extract the information on repeated patterns encoded in RNA secondary structures and makes our method an alternative to analyze the similarity of RNA secondary structures. This method will also be useful to researchers who are interested in evolutionary analysis.
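To make the Lempel-Ziv idea concrete, the sketch below counts the phrases in a simple LZ78-style parse of a dot-bracket structure string and turns the counts into a normalised distance. Both the parsing variant and the distance formula are assumptions chosen for illustration; the abstract does not say which LZ variant or normalisation the authors use.

```python
def lz_complexity(s: str) -> int:
    """Count the phrases in a simple LZ78-style parse of s."""
    phrases = set()
    current = ""
    count = 0
    for ch in s:
        current += ch
        if current not in phrases:
            # A phrase not seen before ends here; start a new one.
            phrases.add(current)
            count += 1
            current = ""
    if current:
        count += 1  # trailing, already-seen phrase still counts once
    return count


def lz_distance(a: str, b: str) -> float:
    """Normalised distance: how many new phrases each string adds to the other."""
    ca, cb = lz_complexity(a), lz_complexity(b)
    denom = max(ca, cb)
    if denom == 0:
        return 0.0  # both strings empty
    return max(lz_complexity(a + b) - ca, lz_complexity(b + a) - cb) / denom


# Hypothetical dot-bracket strings standing in for RNA secondary structures.
x = "((..))"
y = "(((...)))"
d = lz_distance(x, y)
```

A matrix of such pairwise distances could then be passed to a standard tree-building method (for example neighbour joining) to obtain phylogenies of the kind the abstract reports.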
Vocal caricatures reveal signatures of speaker identity

Science.gov (United States)

López, Sabrina; Riera, Pablo; Assaneo, María Florencia; Eguía, Manuel; Sigman, Mariano; Trevisan, Marcos A.

2013-12-01

What are the features that impersonators select to elicit a speaker's identity? We built a voice database of public figures (targets) and imitations produced by professional impersonators. They produced one imitation based on their memory of the target (caricature) and another one after listening to the target audio (replica). A set of naive participants then judged identity and similarity of pairs of voices. Identity was better evoked by the caricatures and replicas were perceived to be closer to the targets in terms of voice similarity. We used this data to map relevant acoustic dimensions for each task. Our results indicate that speaker identity is mainly associated with vocal tract features, while perception of voice similarity is related to vocal folds parameters. We therefore show the way in which acoustic caricatures emphasize identity features at the cost of losing similarity, which allows drawing an analogy with caricatures in the visual space.
Branch length similarity entropy-based descriptors for shape representation

Science.gov (United States)

Kwon, Ohsung; Lee, Sang-Hee

2017-11-01

In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used for the recognition of shapes such as battle tanks, facial expressions, and butterflies. In the present study, we propose new descriptors for recognition (roundness, symmetry, and surface roughness) that are more accurate and faster to compute than the previous descriptors. The roundness represents how closely a shape resembles a circle, the symmetry characterizes how similar a shape is to its flipped counterpart, and the surface roughness quantifies the degree of vertical deviations of a shape boundary. To evaluate the performance of the descriptors, we used a database of leaf images from 12 species. Each species consisted of 10 - 20 leaf images and the total number of images was 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.
Extending Database Integration Technology

National Research Council Canada - National Science Library

Buneman, Peter

1999-01-01

Formal approaches to the semantics of databases and database languages can have immediate and practical consequences in extending database integration technologies to include a vastly greater range...
Chicken genome analysis reveals novel genes encoding biotin-binding proteins related to avidin family

Directory of Open Access Journals (Sweden)

Nordlund Henri R

2005-03-01

Full Text Available Abstract Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin, suggesting that chicken BBPs are paralogous to the avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with the avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

Science.gov (United States)

Sebastian, Alvaro; Contreras-Moreira, Bruno

2014-01-15

Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with the proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Wavelet optimization for content-based image retrieval in medical databases.

Science.gov (United States)

Quellec, G; Lamard, M; Cazuguel, G; Cochener, B; Roux, C

2010-04-01

We propose in this article a content-based image retrieval (CBIR) method for diagnosis aid in medical fields. In the proposed system, images are indexed in a generic fashion, without extracting domain-specific features: a signature is built for each image from its wavelet transform. These image signatures characterize the distribution of wavelet coefficients in each subband of the decomposition. A distance measure is then defined to compare two image signatures and thus retrieve the most similar images in a database when a query image is submitted by a physician. To retrieve relevant images from a medical database, the signatures and the distance measure must be related to the medical interpretation of images. As a consequence, we introduce several degrees of freedom in the system so that it can be tuned to any pathology and image modality. In particular, we propose to adapt the wavelet basis, within the lifting scheme framework, and to use a custom decomposition scheme. Weights are also introduced between subbands. All these parameters are tuned by an optimization procedure, using the medical grading of each image in the database to define a performance measure. The system is assessed on two medical image databases: one for diabetic retinopathy follow up and one for screening mammography, as well as a general purpose database. Results are promising: a mean precision of 56.50%, 70.91% and 96.10% is achieved for these three databases, when five images are returned by the system. Copyright 2009 Elsevier B.V. All rights reserved.
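The figures quoted at the end of this abstract are mean precision values for the five top-ranked images. A minimal sketch of that metric follows; treating an image as relevant when it carries the same label as the query is an assumption made here, since the abstract only says that the medical grading defines the performance measure.

```python
def precision_at_k(retrieved_labels, query_label, k=5):
    """Fraction of the top-k retrieved images whose label matches the query's."""
    top = retrieved_labels[:k]
    if not top:
        return 0.0
    return sum(1 for label in top if label == query_label) / len(top)


def mean_precision_at_k(queries, k=5):
    """Average precision-at-k over (query_label, ranked_labels) pairs."""
    scores = [precision_at_k(ranked, label, k) for label, ranked in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical grades: the query is grade "a"; the ranked list is what the
# retrieval system returned, best match first.
p = precision_at_k(["a", "a", "b", "a", "c", "a"], "a", k=5)
```

In this toy case three of the five returned images share the query's grade, so the precision is 0.6; averaging over all queries in a database gives a single score comparable in form to the percentages reported above.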
Database specification for the Worldwide Port System (WPS) Regional Integrated Cargo Database (ICDB)

Energy Technology Data Exchange (ETDEWEB)

Faby, E.Z.; Fluker, J.; Hancock, B.R.; Grubb, J.W.; Russell, D.L. [Univ. of Tennessee, Knoxville, TN (United States); Loftis, J.P.; Shipe, P.C.; Truett, L.F. [Oak Ridge National Lab., TN (United States)

1994-03-01

This Database Specification for the Worldwide Port System (WPS) Regional Integrated Cargo Database (ICDB) describes the database organization and storage allocation, provides the detailed data model of the logical and physical designs, and provides information for the construction of parts of the database such as tables, data elements, and associated dictionaries and diagrams.
Specialist Bibliographic Databases

OpenAIRE

Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A.; Trukhachev, Vladimir I.; Kostyukova, Elena I.; Gerasimov, Alexey N.; Kitas, George D.

2016-01-01

Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and d...

Nuclear power economic database

International Nuclear Information System (INIS)

Ding Xiaoming; Li Lin; Zhao Shiping

1996-01-01

Nuclear power economic database (NPEDB), based on ORACLE V6.0, consists of three parts, i.e., the economic database of nuclear power stations, the economic database of the nuclear fuel cycle, and the economic database of nuclear power planning and nuclear environment. The economic database of nuclear power stations includes data on general economics, technology, capital cost and benefit, etc. The economic database of the nuclear fuel cycle includes technical data and nuclear fuel prices. The economic database of nuclear power planning and nuclear environment includes data on energy history, forecasts, energy balance, electric power and energy facilities.
Keyword Search in Databases

CERN Document Server

Yu, Jeffrey Xu; Chang, Lijun

2009-01-01

It has become highly desirable to provide users with flexible ways to query/search information over databases as simple as keyword search like Google search. This book surveys the recent developments on keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. Such structural information to be returned can be either trees or subgraphs representing how the objects, that contain the required keywords, are interconnected in a relational database or in an XML database. The structural keyword search is completely different from
Update History of This Database - AT Atlas | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available Update History of This Database (AT Atlas): 2013/12/16 The email address i... ... ( http://www.tanpaku.org/atatlas/ ) is opened.
600 MW nuclear power database

International Nuclear Information System (INIS)

Cao Ruiding; Chen Guorong; Chen Xianfeng; Zhang Yishu

1996-01-01

The 600 MW nuclear power database, based on ORACLE 6.0, consists of three parts, i.e., the nuclear power plant database, the nuclear power position database and the nuclear power equipment database. The database holds a large amount of technical data and pictures relating to nuclear power, provided by engineering design units and individuals. The database can help designers of nuclear power.
ESIM: Edge Similarity for Screen Content Image Quality Assessment.

Science.gov (United States)

Ni, Zhangkai; Ma, Lin; Zeng, Huanqiang; Chen, Jing; Cai, Canhui; Ma, Kai-Kuang

2017-10-01

In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM), is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges that are often encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA for the SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features, i.e., edge contrast, edge width, and edge direction. The first two attributes are simultaneously generated from the input SCI based on a parametric edge model, while the last one is derived directly from the input SCI. These three features are extracted for the reference SCI and the distorted SCI individually. The degree of similarity for each of these edge attributes is then computed independently, and the results are combined using the proposed edge-width pooling strategy to generate the final ESIM score. To evaluate the performance of the proposed ESIM model, a new SCI database (denoted as SCID), described as the largest, is established in this work and made publicly available for download. The database contains 1800 distorted SCIs that are generated from 40 reference SCIs. For each SCI, nine distortion types are investigated, and five degradation levels are produced for each distortion type. Extensive simulation results clearly show that the proposed ESIM model is more consistent with the perception of the HVS on the evaluation of distorted SCIs than multiple state-of-the-art IQA methods.
Hazard Analysis Database Report

Energy Technology Data Exchange (ETDEWEB)

GAULT, G.W.

1999-10-13

The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for US Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for the Tank Waste Remediation System (TWRS) Final Safety Analysis Report (FSAR). The FSAR is part of the approved TWRS Authorization Basis (AB). This document describes, identifies, and defines the contents and structure of the TWRS FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The TWRS Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The database supports the preparation of Chapters 3, 4, and 5 of the TWRS FSAR and the USQ process and consists of two major, interrelated data sets: (1) Hazard Evaluation Database: data from the results of the hazard evaluations; and (2) Hazard Topography Database: data from the system familiarization and hazard identification.
Collecting Taxes Database

Data.gov (United States)

US Agency for International Development — The Collecting Taxes Database contains performance and structural indicators about national tax systems. The database contains quantitative revenue performance...
Development of a California commercial building benchmarking database

Energy Technology Data Exchange (ETDEWEB)

Kinney, Satkartar; Piette, Mary Ann

2002-05-17

Building energy benchmarking is a useful starting point for commercial building owners and operators to target energy savings opportunities. There are a number of tools and methods for benchmarking energy use. Benchmarking based on regional data can provide more relevant information for California buildings than national tools such as Energy Star. This paper discusses issues related to benchmarking commercial building energy use and the development of Cal-Arch, a building energy benchmarking database for California. Currently Cal-Arch uses existing survey data from California's Commercial End Use Survey (CEUS), a largely underutilized wealth of information collected by California's major utilities. DOE's Commercial Building Energy Consumption Survey (CBECS) is used by a similar tool, Arch, and by a number of other benchmarking tools. Future versions of Arch/Cal-Arch will utilize additional data sources including modeled data and individual buildings to expand the database.
Accessing and using chemical databases

DEFF Research Database (Denmark)

Nikolov, Nikolai Georgiev; Pavlov, Todor; Niemelä, Jay Russell

2013-01-01

Computer-based representation of chemicals makes it possible to organize data in chemical databases: collections of chemical structures and associated properties. Databases are widely used wherever efficient processing of chemical information is needed, including search, storage, retrieval, and dissemination. Structure and functionality of chemical databases are considered. The typical kinds of information found in a chemical database are considered: identification, structural, and associated data. Functionality of chemical databases is presented, with examples of search and access types. More details are included about the OASIS database and platform and the Danish (Q)SAR Database online. Various types of chemical database resources are discussed, together with a list of examples.
Column-oriented database management systems

OpenAIRE

Možina, David

2013-01-01

In this thesis I present column-oriented databases. Among other things, I answer the question of why column-oriented databases are needed. In recent years column-oriented databases have received a lot of attention, even though columnar database management systems date back to the early 1970s. I compare both kinds of database management system, a column-oriented database system and a row-oriented database system ...
Large margin classification with indefinite similarities

KAUST Repository

Alabdulmohsin, Ibrahim

2016-01-07

Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer condition is not satisfied, or the Mercer condition is difficult to verify. Examples of such indefinite similarities in machine learning applications are ample including, for instance, the BLAST similarity score between protein sequences, human-judged similarities between concepts and words, and the tangent distance or the shape matching distance in computer vision. Nevertheless, previous works on classification with indefinite similarities are not fully satisfactory. They have either introduced sources of inconsistency in handling past and future examples using kernel approximation, settled for local-minimum solutions using non-convex optimization, or produced non-sparse solutions by learning in Krein spaces. Despite the large volume of research devoted to this subject lately, we demonstrate in this paper how an old idea, namely the 1-norm support vector machine (SVM) proposed more than 15 years ago, has several advantages over more recent work. In particular, the 1-norm SVM method is conceptually simpler, which makes it easier to implement and maintain. It is competitive with, if not superior to, all other methods in terms of predictive accuracy. Moreover, it produces solutions that are often sparser than those of more recent methods by several orders of magnitude. In addition, we provide various theoretical justifications by relating 1-norm SVM to well-established learning algorithms such as neural networks, SVM, and nearest neighbor classifiers. Finally, we conduct a thorough experimental evaluation, which reveals that the evidence in favor of 1-norm SVM is statistically significant.
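For orientation, a 1-norm SVM trained directly on similarity values is commonly written as the linear program below. This is a standard textbook-style form given here as an assumption; the abstract does not state the paper's exact formulation. Here $s(\cdot,\cdot)$ is the (possibly indefinite) similarity function, $y_i \in \{-1,+1\}$ are the training labels, $\xi_i$ are slack variables, and $C > 0$ is a trade-off parameter.

```latex
\begin{aligned}
\min_{\alpha,\, b,\, \xi} \quad & \|\alpha\|_1 + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i \Big( \sum_{j=1}^{n} \alpha_j \, s(x_i, x_j) + b \Big) \ge 1 - \xi_i, \\
& \xi_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

Because the similarities enter only as fixed features, the problem stays a convex linear program whether or not $s$ satisfies the Mercer condition, and the $\ell_1$ penalty drives many $\alpha_j$ to zero, which is consistent with the sparsity the abstract reports.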
Database Description - Society Catalog | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available ion of the academic societies in Japan (organization name, website URL, contact a...sing a category tree or a society website's thumbnail. This database is useful especially when the users are... External Links: Original website information Database maintenance site National Bioscience Database Center *The original web...site was terminated. URL of the original website - Operation start date 2008/06 Last update
An algorithm of discovering signatures from DNA databases on a computer cluster.

Science.gov (United States)

Lee, Hsiao Ping; Sheu, Tzu-Fang

2014-10-05

Signatures are short sequences that are unique and not similar to any other sequence in a database that can be used as the basis to identify different species. Even though several signature discovery algorithms have been proposed in the past, these algorithms require the entirety of databases to be loaded in the memory, thus restricting the amount of data that they can process. It makes those algorithms unable to process databases with large amounts of data. Also, those algorithms use sequential models and have slower discovery speeds, meaning that the efficiency can be improved. In this research, we are debuting the utilization of a divide-and-conquer strategy in signature discovery and have proposed a parallel signature discovery algorithm on a computer cluster. The algorithm applies the divide-and-conquer strategy to solve the problem posed to the existing algorithms where they are unable to process large databases and uses a parallel computing mechanism to effectively improve the efficiency of signature discovery. Even when run with just the memory of regular personal computers, the algorithm can still process large databases such as the human whole-genome EST database which were previously unable to be processed by the existing algorithms. The algorithm proposed in this research is not limited by the amount of usable memory and can rapidly find signatures in large databases, making it useful in applications such as Next Generation Sequencing and other large database analysis and processing. The implementation of the proposed algorithm is available at http://www.cs.pu.edu.tw/~fang/DDCSDPrograms/DDCSD.htm.
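A much-simplified sketch of signature discovery is shown below: it reports the length-k substrings that occur in exactly one sequence of a small in-memory collection. It is not the proposed algorithm. It uses exact matching rather than a similarity threshold, it is sequential, and it holds everything in memory, which is precisely the limitation the abstract's divide-and-conquer, cluster-based method is designed to remove. The function name and toy sequences are illustrative.

```python
from collections import defaultdict


def find_signatures(sequences: dict, k: int) -> dict:
    """Return, per sequence name, the k-mers that occur in no other sequence."""
    owners = defaultdict(set)  # k-mer -> names of sequences containing it
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    signatures = {name: set() for name in sequences}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one sequence
            signatures[next(iter(names))].add(kmer)
    return signatures


# Toy input: "CGT" is shared, so it is not a signature of either sequence.
sigs = find_signatures({"s1": "ACGT", "s2": "CGTA"}, k=3)
```

A divide-and-conquer variant could, for example, partition the k-mers by prefix and process each partition separately, so that no single step needs the whole database in memory; that partitioning scheme is a guess at the general idea, not a description of the published method.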
Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Directory of Open Access Journals (Sweden)

Alexandra M Schnoes

2009-12-01

Full Text Available Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Science.gov (United States)

Schnoes, Alexandra M; Brown, Shoshana D; Dodevski, Igor; Babbitt, Patricia C

2009-12-01

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Update History of This Database - RMG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available RMG Update History of This Database. Date, Update contents: 2016/08/22 The contact address is c... ...dna.affrc.go.jp/ ) is opened. ... URL of the portal site is changed. 2013/08/07 RMG archive site is opened. 2002/09/25 RMG ( http://rmg.rice.
Update History of This Database - DGBY | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available DGBY Update History of This Database. Date, Update contents: 2014/10/20 The URL of the portal s... ...aro.affrc.go.jp/yakudachi/yeast/index.html ) is opened. ... Expression of attribution in License is updated. 2012/03/08 DGBY English archive site is opened. 2006/10/02
PRIDE and "Database on Demand" as valuable tools for computational proteomics.

Science.gov (United States)

Vizcaíno, Juan Antonio; Reisinger, Florian; Côté, Richard; Martens, Lennart

2011-01-01

The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride ) provides users with the ability to explore and compare mass spectrometry-based proteomics experiments that reveal details of the protein expression found in a broad range of taxonomic groups, tissues, and disease states. A PRIDE experiment typically includes identifications of proteins, peptides, and protein modifications. Additionally, many of the submitted experiments also include the mass spectra that provide the evidence for these identifications. Finally, one of the strongest advantages of PRIDE in comparison with other proteomics repositories is the amount of metadata it contains, a key point to put the above-mentioned data in biological and/or technical context. Several informatics tools have been developed in support of the PRIDE database. The most recent one is called "Database on Demand" (DoD), which allows custom sequence databases to be built in order to optimize the results from search engines. We describe the use of DoD in this chapter. Additionally, in order to show the potential of PRIDE as a source for data mining, we also explore complex queries using federated BioMart queries to integrate PRIDE data with other resources, such as Ensembl, Reactome, or UniProt.
Update History of This Database - KAIKOcDNA | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available KAIKOcDNA Update History of This Database. Date, Update contents: 2014/10/20 The URL of the dat... ... database ( http://sgp.dna.affrc.go.jp/EST/ ) is opened. ... ...abase maintenance site is changed. 2014/10/08 KAIKOcDNA English archive site is opened. 2004/04/12 KAIKOcDNA
Dietary Supplement Ingredient Database

Science.gov (United States)

... and US Department of Agriculture Dietary Supplement Ingredient Database ... values can be saved to build a small database or add to an existing database for national, ...

Sensitivity of human auditory cortex to rapid frequency modulation revealed by multivariate representational similarity analysis.

Science.gov (United States)

Joanisse, Marc F; DeSouza, Diedre D

2014-01-01

Functional Magnetic Resonance Imaging (fMRI) was used to investigate the extent, magnitude, and pattern of brain activity in response to rapid frequency-modulated sounds. We examined this by manipulating the direction (rise vs. fall) and the rate (fast vs. slow) of the apparent pitch of iterated rippled noise (IRN) bursts. Acoustic parameters were selected to capture features used in phoneme contrasts; however, the stimuli themselves were not perceived as speech per se. Participants were scanned as they passively listened to sounds in an event-related paradigm. Univariate analyses revealed a greater level and extent of activation in bilateral auditory cortex in response to frequency-modulated sweeps compared to steady-state sounds. This effect was stronger in the left hemisphere. However, no regions showed selectivity for either rate or direction of frequency modulation. In contrast, multivoxel pattern analysis (MVPA) revealed feature-specific encoding for direction of modulation in auditory cortex bilaterally. Moreover, this effect was strongest when analyses were restricted to anatomical regions lying outside Heschl's gyrus. We found no support for feature-specific encoding of frequency modulation rate. Differential findings of modulation rate and direction of modulation are discussed with respect to their relevance to phonetic discrimination.
MIPS: a database for protein sequences, homology data and yeast genome information.

Science.gov (United States)

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/ ) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
PseudoMLSA: a database for multigenic sequence analysis of Pseudomonas species

Directory of Open Access Journals (Sweden)

Lalucat Jorge

2010-04-01

Full Text Available Abstract Background The genus Pseudomonas comprises more than 100 species of environmental, clinical, agricultural, and biotechnological interest. Although the recommended method for discriminating bacterial species is DNA-DNA hybridisation, alternative techniques based on multigenic sequence analysis are becoming common practice in bacterial species discrimination studies. Since there is no general criterion for determining which genes are more useful for species resolution, the number of strains and genes analysed is increasing continuously. As a result, sequences of different genes are dispersed throughout several databases. This sequence information needs to be collected in a common database in order to be useful for future identification-based projects. Description The PseudoMLSA Database is a comprehensive database of multiple gene sequences from strains of Pseudomonas species. The core of the database is composed of selected gene sequences from all Pseudomonas type strains validly assigned to the genus through 2008. The database is intended to be useful for MultiLocus Sequence Analysis (MLSA) procedures, for the identification and characterisation of any Pseudomonas bacterial isolate. The sequences are available for download via a direct connection to the National Center for Biotechnology Information (NCBI). Additionally, the database includes an online BLAST interface for flexible nucleotide queries and similarity searches with the user's datasets, and provides a user-friendly output for easily parsing, navigating, and analysing BLAST results. Conclusions The PseudoMLSA database amasses strains and sequence information of validly described Pseudomonas species, and allows free querying of the database via a user-friendly, web-based interface available at http://www.uib.es/microbiologiaBD/Welcome.html. The web-based platform enables easy retrieval at strain or gene sequence information level; including references to published peer
Database and Expert Systems Applications

DEFF Research Database (Denmark)

Viborg Andersen, Kim; Debenham, John; Wagner, Roland

This book constitutes the refereed proceedings of the 16th International Conference on Database and Expert Systems Applications, DEXA 2005, held in Copenhagen, Denmark, in August 2005. The 92 revised full papers presented together with 2 invited papers were carefully reviewed and selected from 390 submissions. The papers are organized in topical sections on workflow automation, database queries, data classification and recommendation systems, information retrieval in multimedia databases, Web applications, implementational aspects of databases, multimedia databases, XML processing, security, XML schemata, query evaluation, semantic processing, information retrieval, temporal and spatial databases, querying XML, organisational aspects of databases, natural language processing, ontologies, Web data extraction, semantic Web, data stream management, data extraction, distributed database systems...
Image quality assessment based on inter-patch and intra-patch similarity.

Directory of Open Access Journals (Sweden)

Fei Zhou

Full Text Available In this paper, we propose a full-reference (FR) image quality assessment (IQA) scheme, which evaluates image fidelity from two aspects: the inter-patch similarity and the intra-patch similarity. The scheme is performed in a patch-wise fashion so that a quality map can be obtained. On one hand, we investigate the disparity between one image patch and its adjacent ones. This disparity is visually described by an inter-patch feature, where the hybrid effect of luminance masking and contrast masking is taken into account. The inter-patch similarity is further measured by modifying the normalized correlation coefficient (NCC). On the other hand, we also attach importance to the impact of image contents within one patch on the IQA problem. For the intra-patch feature, we consider image curvature as an important complement of image gradient. According to local image contents, the intra-patch similarity is measured by adaptively comparing image curvature and gradient. Besides, a nonlinear integration of the inter-patch and intra-patch similarity is presented to obtain an overall score of image quality. The experiments conducted on six publicly available image databases show that our scheme achieves better performance in comparison with several state-of-the-art schemes.
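For orientation, the standard (unmodified) normalized correlation coefficient that the abstract takes as its starting point can be sketched as follows. The paper's modification is not described in this record, so the code shows only the textbook NCC on flattened patches; the function name and the zero-variance convention are assumptions made for illustration.

```python
import math


def ncc(patch_a, patch_b):
    """Textbook normalized correlation coefficient of two equal-length,
    flattened image patches. Returns a value in [-1, 1]."""
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(patch_a, patch_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in patch_a)
                    * sum((b - mean_b) ** 2 for b in patch_b))
    # Assumed convention: a constant patch has no defined correlation,
    # so report 0.0 instead of dividing by zero.
    return num / den if den > 0 else 0.0
```

A patch compared with a brightness-scaled copy of itself scores close to 1, and a patch compared with its intensity-reversed copy scores close to -1, which is why NCC is insensitive to uniform gain and offset changes.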
Optic Disc Detection from Fundus Photography via Best-Buddies Similarity

Directory of Open Access Journals (Sweden)

Kangning Hou

2018-05-01

Full Text Available Robust and effective optic disc (OD) detection is a necessary processing step in the automatic analysis of fundus images. In this paper, we propose a novel and robust method for the automated detection of ODs from fundus photographs. It is essentially carried out by performing template matching using the Best-Buddies Similarity (BBS) measure between the hand-marked OD region and small parts of the target images. To better characterize the local spatial information of fundus images, a gradient constraint term was introduced for computing the BBS measurement. The performance of the proposed method is validated with the Digital Retinal Images for Vessel Extraction (DRIVE) and Standard Diabetic Retinopathy Database Calibration Level 1 (DIARETDB1) databases, and quantitative results were obtained. Success rates/error distances of 100%/10.4 pixels and of 97.7%/12.9 pixels, respectively, were achieved. The algorithm has been tested and compared with other commonly used methods, and the results show that the proposed method achieves superior performance.
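The core of the Best-Buddies Similarity measure can be sketched briefly: two points form a best-buddies pair when each is the other's nearest neighbour, and the score is the number of such pairs divided by the size of the smaller set. The sketch below is a simplification under stated assumptions: it uses plain squared Euclidean distance on generic feature tuples and omits the gradient constraint term described in the abstract; the function names are hypothetical.

```python
def _sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def _nearest(point, candidates):
    """Index of the candidate closest to `point`."""
    return min(range(len(candidates)),
               key=lambda j: _sq_dist(point, candidates[j]))


def best_buddies_similarity(P, Q):
    """Fraction of mutual nearest-neighbour pairs between point sets P and Q,
    normalised by the size of the smaller set."""
    if not P or not Q:
        raise ValueError("both point sets must be non-empty")
    pairs = 0
    for i, p in enumerate(P):
        j = _nearest(p, Q)
        if _nearest(Q[j], P) == i:  # q_j points back at p_i: best buddies
            pairs += 1
    return pairs / min(len(P), len(Q))
```

In template matching, `P` would hold features from the template region and `Q` features from a candidate window; the window with the highest score is taken as the match.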
Case Study III: The Construction of a Nanotoxicity Database - The MOD-ENP-TOX Experience.

Science.gov (United States)

Vriens, Hanne; Mertens, Dominik; Regret, Renaud; Lin, Pinpin; Locquet, Jean-Pierre; Hoet, Peter

2017-01-01

The number of experimental studies on the toxicity of nanomaterials is growing fast. Interpretation and comparison of these studies is a complex issue due to the large number of variables that may determine the toxicity of nanomaterials. Qualitative databases providing a structured combination, integration and quality evaluation of the existing data could reveal insights that cannot be seen from different studies alone. A few database initiatives are under development, but in practice very little data is publicly available, and collaboration between physicists, toxicologists, computer scientists and modellers is needed to further develop databases, standards and analysis tools. In this case study the process of building a database on the in vitro toxicity of amorphous silica nanoparticles (NPs) is described in detail. Experimental data were systematically collected from peer reviewed papers, manually curated and stored in a standardised format. The result is a database in ISA-Tab-Nano including 68 peer reviewed papers on the toxicity of 148 amorphous silica NPs. Both the physicochemical characterization of the particles and their biological effect (described in 230 in vitro assays) were stored in the database. A scoring system was elaborated in order to evaluate the reliability of the stored data.
CCDB: a curated database of genes involved in cervix cancer.

Science.gov (United States)

Agarwal, Subhash M; Raghav, Dhwani; Singh, Harinder; Raghava, G P S

2011-01-01

The Cervical Cancer gene DataBase (CCDB, http://crdd.osdd.net/raghava/ccdb) is a manually curated catalog of experimentally validated genes that are thought, or are known, to be involved in the different stages of cervical carcinogenesis. Despite the large number of women presently affected by this malignancy, no database yet exists that catalogs information on genes associated with cervical cancer. Therefore, we have compiled 537 genes in CCDB that are linked with cervical cancer causation processes such as methylation, gene amplification, mutation, polymorphism and change in expression level, as evident from published literature. Each record contains gene-related details such as architecture (exon-intron structure), location, function, sequences (mRNA/CDS/protein), ontology, interacting partners, homology to other eukaryotic genomes, structure and links to other public databases, thus augmenting CCDB with external data. Also, manually curated literature references have been provided to support the inclusion of the gene in the database and establish its association with cervix cancer. In addition, CCDB provides information on microRNA altered in cervical cancer as well as a search facility for querying, several browse options and an online tool for sequence similarity search, thereby providing researchers with easy access to the latest information on genes involved in cervix cancer.
Optimizing top precision performance measure of content-based image retrieval by learning similarity function

KAUST Repository

Liang, Ru-Ze

2017-04-24

In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, up to now, there is no existing similarity learning method proposed to optimize the top precision measure. To fill this gap, in this paper, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem with an objective function as the combination of the losses of the relevant images ranked behind the top-ranked irrelevant image, and the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. The experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when the top precision is used as the performance measure.
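The objective described in this abstract can be made concrete with a small sketch. The abstract does not give the exact loss or similarity form, so the code assumes a bilinear similarity s(x, y) = xᵀWy and a hinge loss with unit margin; these choices, the function names, and the `reg` weight are illustrative assumptions, not the authors' formulation.

```python
def bilinear(x, W, y):
    """Assumed similarity function: s(x, y) = x^T W y."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def top_precision_objective(W, query, relevant, irrelevant, reg=0.1):
    """Hinge losses of relevant images that fail to beat the top-ranked
    irrelevant image by a unit margin, plus reg * squared Frobenius norm of W."""
    top_irrelevant = max(bilinear(query, W, y) for y in irrelevant)
    loss = sum(max(0.0, 1.0 + top_irrelevant - bilinear(query, W, y))
               for y in relevant)
    frobenius_sq = sum(w * w for row in W for w in row)
    return loss + reg * frobenius_sq


def top_precision(W, query, relevant, irrelevant):
    """Number of relevant images ranked above the top-ranked irrelevant one."""
    top_irrelevant = max(bilinear(query, W, y) for y in irrelevant)
    return sum(1 for y in relevant if bilinear(query, W, y) > top_irrelevant)
```

Only the single highest-scoring irrelevant image enters the loss, which is what ties the objective to the top of the ranking instead of to average pairwise ordering.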
Optimizing top precision performance measure of content-based image retrieval by learning similarity function

KAUST Repository

Liang, Ru-Ze; Shi, Lihui; Wang, Haoxiang; Meng, Jiandong; Wang, Jim Jing-Yan; Sun, Qingquan; Gu, Yi

2017-01-01

In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, up to now, there is no existing similarity learning method proposed to optimize the top precision measure. To fill this gap, in this paper, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem with an objective function as the combination of the losses of the relevant images ranked behind the top-ranked irrelevant image, and the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. The experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when the top precision is used as the performance measure.
National Database of Geriatrics

DEFF Research Database (Denmark)

Kannegaard, Pia Nimann; Vinding, Kirsten L; Hare-Bruun, Helle

2016-01-01

AIM OF DATABASE: The aim of the National Database of Geriatrics is to monitor the quality of interdisciplinary diagnostics and treatment of patients admitted to a geriatric hospital unit. STUDY POPULATION: The database population consists of patients who were admitted to a geriatric hospital unit....... Geriatric patients cannot be defined by specific diagnoses. A geriatric patient is typically a frail multimorbid elderly patient with decreasing functional ability and social challenges. The database includes 14-15,000 admissions per year, and the database completeness has been stable at 90% during the past......, percentage of discharges with a rehabilitation plan, and the part of cases where an interdisciplinary conference has taken place. Data are recorded by doctors, nurses, and therapists in a database and linked to the Danish National Patient Register. DESCRIPTIVE DATA: Descriptive patient-related data include...
Insights for Planetarium and Museum Educators Revealed by the iSTAR international Study of Astronomical Reasoning Database

Science.gov (United States)

Slater, T. F.; Tatge, C. B.; Ratcliff, M.; Slater, S. J.

2016-12-01

Dedicated sky watchers through the centuries have long sought to find the best teaching methods to efficiently and effectively transfer vast amounts of accumulated star knowledge to the next generation of sky watchers. Although detailed maps specifying the names and locations of stars have been carefully displayed on spherical globes for thousands of years, it is the 1923 installation of a Zeiss-made, large, mechanical star projector in Munich that is often cited as the first modern projection planetarium for teaching astronomy. In the 1930s, impressive planetariums were installed in Chicago, Los Angeles and New York, which then in turn served as a catalyst for additional planetarium construction. Planetarium construction increased rapidly in the United States due to federal funding to schools and museums through the 1958 US National Defense Education Act, and the US went from one planetarium in 1930, to six in 1940, to about 100 in 1960, increasing to 200 in 1963, 450 by 1967 (even before humans had landed on the Moon), and more than 1,000 by 1975. Today, nearly 3,000 permanent planetarium facilities are available to show the stars and heavenly motions to children and adults alike across the world, with perhaps another thousand portable planetariums adding to the available teaching venues. Simultaneous with their construction, discipline-based astronomy education researchers have been trying to better understand, and ultimately improve, how people learn astronomy in the planetarium. A systematic analysis of planetarium education research articles, dissertations, and theses found in the recently constructed, community-wide, international Study of Astronomical Reasoning iSTAR database at istardatabase.org reveals that many of the systematic studies conducted in the 1960s and 1970s using domes served by servo-mechanical star projectors have been reproduced again in recent decades in theaters using digital video projection showing nearly the same results: student-passive, information
RICD: A rice indica cDNA database resource for rice functional genomics

Directory of Open Access Journals (Sweden)

Zhang Qifa

2008-11-01

Full Text Available Abstract Background The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties, Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Results Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. For each cDNA, it also provides sequences, protein domain annotations, similarity search results, SNP and InDel information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, the expression atlas in RiceGE and the variation report in Gramene. Conclusion The online rice indica cDNA database provides a cDNA resource with comprehensive information to researchers for functional analysis of the indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
A new relational database structure and online interface for the HITRAN database

International Nuclear Information System (INIS)

Hill, Christian; Gordon, Iouli E.; Rothman, Laurence S.; Tennyson, Jonathan

2013-01-01

A new format for the HITRAN database is proposed. By storing the line-transition data in a number of linked tables described by a relational database schema, it is possible to overcome the limitations of the existing format, which have become increasingly apparent over the last few years as new and more varied data are being used by radiative-transfer models. Although the database in the new format can be searched using the well-established Structured Query Language (SQL), a web service, HITRANonline, has been deployed to allow users to make the most common queries of the database using a graphical user interface in a web page. The advantages of the relational form of the database for ensuring data integrity and consistency are explored, and the compatibility of the online interface with the emerging standards of the Virtual Atomic and Molecular Data Centre (VAMDC) project is discussed. In particular, the ability to access HITRAN data using a standard query language from other websites, command line tools and from within computer programs is described. -- Highlights: • A new, interactive version of the HITRAN database is presented. • The data is stored in a structured fashion in a relational database. • The new HITRANonline interface offers increased functionality and easier error correction
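To illustrate what "linked tables described by a relational database schema" buys in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, columns, and numeric values are hypothetical placeholders chosen for illustration; they are not the actual HITRAN schema or real line data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two linked tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity
    conn.executescript("""
        CREATE TABLE molecule (
            id      INTEGER PRIMARY KEY,
            formula TEXT NOT NULL UNIQUE
        );
        CREATE TABLE transition (
            id          INTEGER PRIMARY KEY,
            molecule_id INTEGER NOT NULL REFERENCES molecule(id),
            wavenumber  REAL NOT NULL,
            intensity   REAL NOT NULL
        );
    """)
    # Placeholder rows for illustration only, not real spectroscopic data.
    conn.executemany("INSERT INTO molecule (id, formula) VALUES (?, ?)",
                     [(1, "H2O"), (2, "CO2")])
    conn.executemany(
        "INSERT INTO transition (molecule_id, wavenumber, intensity) "
        "VALUES (?, ?, ?)",
        [(1, 1000.0, 1e-20), (2, 2000.0, 2e-20)])
    conn.commit()
    return conn


def lines_between(conn, lo, hi):
    """Return (formula, wavenumber, intensity) rows within a wavenumber range."""
    cur = conn.execute(
        "SELECT m.formula, t.wavenumber, t.intensity "
        "FROM transition AS t JOIN molecule AS m ON m.id = t.molecule_id "
        "WHERE t.wavenumber BETWEEN ? AND ? ORDER BY t.wavenumber",
        (lo, hi))
    return cur.fetchall()
```

With foreign keys enabled, an attempt to insert a transition that references a non-existent molecule raises `sqlite3.IntegrityError`, which is the kind of consistency guarantee a fixed-width text format cannot provide on its own.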
Update History of This Database - fRNAdb | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available fRNAdb Update History of This Database. Date, Update contents: 2016/03/29 fRNAdb English archiv... ...e site is opened. 2006/12 fRNAdb ( http://www.ncrna.org/ ) is opened.
Suspect filler similarity in eyewitness lineups: a literature review and a novel methodology.

Science.gov (United States)

Fitzgerald, Ryan J; Oriet, Chris; Price, Heather L

2015-02-01

Eyewitness lineups typically contain a suspect (guilty or innocent) and fillers (known innocents). The degree to which fillers should resemble the suspect is a complex issue that has yet to be resolved. Previously, researchers have voiced concern that eyewitnesses would be unable to identify their target from a lineup containing highly similar fillers; however, our literature review suggests highly similar fillers have only rarely been shown to have this effect. To further examine the effect of highly similar fillers on lineup responses, we used morphing software to create fillers of moderately high and very high similarity to the suspect. When the culprit was in the lineup, a higher correct identification rate was observed in moderately high similarity lineups than in very high similarity lineups. When the culprit was absent, similarity did not yield a significant effect on innocent suspect misidentification rates. However, the correct rejection rate in the moderately high similarity lineup was 20% higher than in the very high similarity lineup. When choosing rates were controlled by calculating identification probabilities for only those who made a selection from the lineup, culprit identification rates as well as innocent suspect misidentification rates were significantly higher in the moderately high similarity lineup than in the very high similarity lineup. Thus, very high similarity fillers yielded costs and benefits. Although our research suggests that selecting the most similar fillers available may adversely affect correct identification rates, we recommend additional research using fillers obtained from police databases to corroborate our findings.
Multilevel security for relational databases

CERN Document Server

Faragallah, Osama S; El-Samie, Fathi E Abd

2014-01-01

Concepts of Database Security; Database Concepts; Relational Database Security Concepts; Access Control in Relational Databases; Discretionary Access Control; Mandatory Access Control; Role-Based Access Control; Work Objectives; Book Organization; Basic Concept of Multilevel Database Security; Introduction; Multilevel Database Relations; Polyinstantiation; Invisible Polyinstantiation; Visible Polyinstantiation; Types of Polyinstantiation; Architectural Consideration
SuperNatural: a searchable database of available natural compounds.

Science.gov (United States)

Dunkel, Mathias; Fullbeck, Melanie; Neumann, Stefanie; Preissner, Robert

2006-01-01

Although tremendous effort has been put into synthetic libraries, most drugs on the market are still natural compounds or derivatives thereof. There are encyclopaedias of natural compounds, but the availability of these compounds is often unclear and catalogues from numerous suppliers have to be checked. To overcome these problems we have compiled a database of approximately 50,000 natural compounds from different suppliers. To enable efficient identification of the desired compounds, we have implemented substructure searches with typical templates. Starting points for in silico screenings are about 2500 well-known and classified natural compounds from a compendium that we have added. Possible medical applications can be ascertained via automatic searches for similar drugs in a free conformational drug database containing WHO indications. Furthermore, we have computed about three million conformers, which are deployed to account for the flexibilities of the compounds when the 3D superposition algorithm that we have developed is used. The SuperNatural Database is publicly available at http://bioinformatics.charite.de/supernatural. Viewing requires the free Chime-plugin from MDL (Chime) or Java2 Runtime Environment (MView), which is also necessary for using Marvin application for chemical drawing.
A Case for Database Filesystems

Energy Technology Data Exchange (ETDEWEB)

Adams, P A; Hax, J C

2009-05-13

Data intensive science is offering new challenges and opportunities for Information Technology and traditional relational databases in particular. Database filesystems offer the potential to store Level Zero data and analyze Level 1 and Level 3 data within the same database system [2]. Scientific data is typically composed of both unstructured files and scalar data. Oracle SecureFiles is a new database filesystem feature in Oracle Database 11g that is specifically engineered to deliver high performance and scalability for storing unstructured or file data inside the Oracle database. SecureFiles presents the best of both the filesystem and the database worlds for unstructured content. Data stored inside SecureFiles can be queried or written at performance levels comparable to that of traditional filesystems while retaining the advantages of the Oracle database.
Update History of This Database - TogoTV | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Update History of This Database (TogoTV): 2017/05/12 TogoTV English archiv... ...e site is opened. 2007/07/20 TogoTV ( http://togotv.dbcls.jp/ ) is opened.

Update History of This Database - ConfC | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Update History of This Database (ConfC): 2016/09/20 ConfC English archive ... ...site is opened. 2005/05/01 ConfC (http://mbs.cbrc.jp/ConfC/) is opened.
The effects of similarity of theme and instantiation in analogical reasoning.

Science.gov (United States)

Yanowitz, K L

2001-01-01

The influence of 2 types of structural similarity on analogical reasoning was examined. The theme of a story is a structural component that constrains other relationships in the story. Another structural component is the way in which the theme is implemented. Participants received pairs of stories that varied in the similarity of these two components. Participants in Experiment 1 judged stories containing similar themes as more analogous than stories with dissimilar themes. Likewise, stories with similar implementations were judged as more analogous than stories with dissimilar implementations. Experiment 2 revealed a similar pattern when participants had the opportunity to transfer information from source to target stories. Greater transfer was seen for stories with similar themes than for stories with dissimilar themes. Greater transfer was also seen for stories with similar implementations of different themes than for stories with different implementations.
Aviation Safety Issues Database

Science.gov (United States)

Morello, Samuel A.; Ricks, Wendell R.

2009-01-01

The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis for making the database publicly accessible and writing this report.
Functional similarities between the dictyostelium protein AprA and the human protein dipeptidyl-peptidase IV.

Science.gov (United States)

Herlihy, Sarah E; Tang, Yu; Phillips, Jonathan E; Gomer, Richard H

2017-03-01

Autocrine proliferation repressor protein A (AprA) is a protein secreted by Dictyostelium discoideum cells. Although there is very little sequence similarity between AprA and any human protein, AprA has a predicted structural similarity to the human protein dipeptidyl peptidase IV (DPPIV). AprA is a chemorepellent for Dictyostelium cells, and DPPIV is a chemorepellent for neutrophils. This led us to investigate if AprA and DPPIV have additional functional similarities. We find that like AprA, DPPIV is a chemorepellent for, and inhibits the proliferation of, D. discoideum cells, and that AprA binds some DPPIV binding partners such as fibronectin. Conversely, rAprA has DPPIV-like protease activity. These results indicate a functional similarity between two eukaryotic chemorepellent proteins with very little sequence similarity, and emphasize the usefulness of using a predicted protein structure to search a protein structure database, in addition to searching for proteins with similar sequences. © 2016 The Protein Society.
Functional similarities between the dictyostelium protein AprA and the human protein dipeptidyl‐peptidase IV

Science.gov (United States)

Herlihy, Sarah E.; Tang, Yu; Phillips, Jonathan E.

2017-01-01

Abstract Autocrine proliferation repressor protein A (AprA) is a protein secreted by Dictyostelium discoideum cells. Although there is very little sequence similarity between AprA and any human protein, AprA has a predicted structural similarity to the human protein dipeptidyl peptidase IV (DPPIV). AprA is a chemorepellent for Dictyostelium cells, and DPPIV is a chemorepellent for neutrophils. This led us to investigate if AprA and DPPIV have additional functional similarities. We find that like AprA, DPPIV is a chemorepellent for, and inhibits the proliferation of, D. discoideum cells, and that AprA binds some DPPIV binding partners such as fibronectin. Conversely, rAprA has DPPIV‐like protease activity. These results indicate a functional similarity between two eukaryotic chemorepellent proteins with very little sequence similarity, and emphasize the usefulness of using a predicted protein structure to search a protein structure database, in addition to searching for proteins with similar sequences. PMID:28028841
BDVC (Bimodal Database of Violent Content): A database of violent audio and video

Science.gov (United States)

Rivera Martínez, Jose Luis; Mijes Cruz, Mario Humberto; Rodríguez Vázqu, Manuel Antonio; Rodríguez Espejo, Luis; Montoya Obeso, Abraham; García Vázquez, Mireya Saraí; Ramírez Acosta, Alejandro Álvaro

2017-09-01

Nowadays there is a trend towards the use of unimodal databases for multimedia content description, organization and retrieval applications that handle a single type of content such as text, voice or images. Bimodal databases, by contrast, make it possible to semantically associate two different types of content, such as audio-video or image-text, among others. The generation of a bimodal audio-video database implies creating a connection between the multimedia content through the semantic relation that associates the actions in both types of information. This paper describes in detail the characteristics and methodology used to create the bimodal database of violent content; the semantic relationship is established by the proposed concepts that describe the audiovisual information. The use of bimodal databases in applications related to audiovisual content processing allows an increase in semantic performance if and only if these applications process both types of content. This bimodal database comprises 580 annotated audiovisual segments, with a duration of 28 minutes, divided into 41 classes. Bimodal databases are a tool for building applications for the semantic web.
A Profile-Based Framework for Factorial Similarity and the Congruence Coefficient.

Science.gov (United States)

Hartley, Anselma G; Furr, R Michael

2017-01-01

We present a novel profile-based framework for understanding factorial similarity in the context of exploratory factor analysis in general, and for understanding the congruence coefficient (a commonly used index of factor similarity) specifically. First, we introduce the profile-based framework articulating factorial similarity in terms of 3 intuitive components: general saturation similarity, differential saturation similarity, and configural similarity. We then articulate the congruence coefficient in terms of these components, along with 2 additional profile-based components, and we explain how these components resolve ambiguities that can be, and are, found when using the congruence coefficient. Finally, we present secondary analyses revealing that profile-based components of factorial similarity are indeed linked to experts' actual evaluations of factorial similarity. Overall, the profile-based approach we present offers new insights into the ways in which researchers can examine factor similarity and holds the potential to enhance researchers' ability to understand the congruence coefficient.
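For orientation, the congruence coefficient referred to in this record is conventionally computed (Tucker's formula) as the sum of cross-products of two factors' loadings divided by the square root of the product of their sums of squares. The following is a minimal sketch of that standard formula only; the loading values are invented for illustration, and the profile-based components proposed by the authors are not reproduced here.

```python
import math


def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    # Numerator: sum of cross-products of the paired loadings.
    numerator = sum(a * b for a, b in zip(x, y))
    # Denominator: square root of the product of the sums of squares.
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("a loading vector is all zeros")
    return numerator / denominator


# Hypothetical loadings, for illustration only.
factor_a = [0.8, 0.7, 0.6, 0.1]
factor_b = [0.4, 0.35, 0.3, 0.05]  # factor_a uniformly scaled by 0.5

phi = congruence_coefficient(factor_a, factor_b)
```

Because the coefficient is insensitive to uniform rescaling, two factors whose loadings differ greatly in overall magnitude can still yield a value of 1, which is one reason a single coefficient can be hard to interpret on its own.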
Newly Digitized Database Reveals the Lives and Families of Forced Migrants from Finnish Karelia

Directory of Open Access Journals (Sweden)

John Loehr

2017-12-01

Full Text Available Studies on displaced persons often suffer from a lack of data on the long-term effects of forced migration. A register created during the 1960s and published as the book series ‘Siirtokarjalaisten tie’ in 1970 documented the lives of individuals who fled the southern Karelian district of Finland after its first and second occupation by the Soviet Union in 1940 and 1944. To realize the potential value of these data for scientific research, we have recently scanned the register using optical character recognition (OCR) software, and developed proprietary computer code to extract these data. Here we outline the steps involved in the digitization process, and present an overview of the Migration Karelia (MiKARELIA) database now available to researchers. The digitized register contains over 160,000 adults and a wide range of data on births, marriages, occupations and movements of these forced migrants, likely to be of interest to researchers across disciplines including demographers, anthropologists, evolutionary biologists, historians, economists and sociologists.
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.

Science.gov (United States)

Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de

2006-03-31

Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
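The idea of producing tables of potential homologs from user-defined similarity criteria can be sketched as a simple filter over stored alignment results. This is only an illustration under stated assumptions: the field names (`identity`, `coverage`, `evalue`), the default thresholds, and the protein identifiers are hypothetical and are not taken from GenoMycDB's actual schema or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One pairwise alignment result (hypothetical fields)."""
    query: str
    subject: str
    identity: float  # percent identity over the aligned region
    coverage: float  # percent of the query covered by the alignment
    evalue: float    # expectation value of the alignment


def potential_homologs(pairs, min_identity=30.0, min_coverage=70.0,
                       max_evalue=1e-5):
    """Keep only pairs that satisfy every user-defined similarity criterion."""
    return [
        p for p in pairs
        if p.identity >= min_identity
        and p.coverage >= min_coverage
        and p.evalue <= max_evalue
    ]


# Invented example records, for illustration only.
pairs = [
    AlignedPair("protA_sp1", "protA_sp2", 92.0, 98.0, 1e-120),
    AlignedPair("protB_sp1", "protX_sp2", 24.0, 40.0, 0.5),
    AlignedPair("protC_sp1", "protC_sp2", 45.0, 85.0, 1e-30),
]

hits = potential_homologs(pairs)
```

Tightening a single parameter, for example `potential_homologs(pairs, min_identity=50.0)`, narrows the table without changing the stored alignments, which mirrors the "one or multiple sequence similarity parameters" selection described in the abstract.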
Thoughts toward a clinical database of architecture: evidence, complexity, and impact

Directory of Open Access Journals (Sweden)

Leonard R. Bachman

2012-10-01

Full Text Available This paper examines how architecture is building a clinical database similar to that of law and medicine and is developing this database for the purposes of acquiring complex design insight. This emerging clinical branch of architectural knowledge exceeds the scope of everyday experience of physical form and can thus be shown to enable a more satisfying scale of design thinking. It is argued that significant transformational kinds of professional transparency and accountability are thus intensifying. The tactics and methods of this paper are to connect previously disparate historical and contemporary events that mark the evolution of this database and then to fold those events into an explanatory narrative concerning clinical design practice. Beginning with architecture’s use of precedent (Collins 1971), the formulation of design as complex problems (Rittel and Webber 1973), high performance buildings to meet the crisis of climate change, social mandates of postindustrial society (Bell 1973), and other roots of evidence, the paper then elaborates the themes in which this database is evolving. Such themes include post-occupancy evaluation (Bordass and Leaman 2005), continuous commissioning, performance simulation, digital instrumentation, automation, and other modes of data collection in buildings. Finally, the paper concludes with some anticipated impacts that such a clinical database might have on design practice and how their benefits can be achieved through new interdisciplinary relations between academia and practice.
Experiment Databases

Science.gov (United States)

Vanschoren, Joaquin; Blockeel, Hendrik

Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.
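The core idea, storing the details of many experiment runs centrally and then answering questions by querying them rather than rerunning the algorithms, can be illustrated with a very small sketch. The table layout, column names, and all values below are invented placeholders, not the schema or contents of the authors' experiment database; Python's built-in `sqlite3` module stands in for whatever database system is actually used.

```python
import sqlite3

# In-memory database standing in for a shared experiment repository.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE experiment (
        algorithm  TEXT,  -- name of the learning algorithm (placeholder)
        dataset    TEXT,  -- name of the dataset (placeholder)
        parameters TEXT,  -- parameter settings used for the run
        accuracy   REAL   -- recorded evaluation result (made-up value)
    )
    """
)

# Made-up runs, for illustration only.
runs = [
    ("algo_a", "dataset_1", "depth=3", 0.94),
    ("algo_a", "dataset_2", "depth=3", 0.90),
    ("algo_b", "dataset_1", "", 0.95),
    ("algo_b", "dataset_2", "", 0.97),
]
conn.executemany("INSERT INTO experiment VALUES (?, ?, ?, ?)", runs)


def mean_accuracy_by_algorithm(connection):
    """Summarise stored results per algorithm without rerunning anything."""
    cursor = connection.execute(
        "SELECT algorithm, AVG(accuracy) FROM experiment "
        "GROUP BY algorithm ORDER BY algorithm"
    )
    return cursor.fetchall()


summary = mean_accuracy_by_algorithm(conn)
```

Here `summary` holds one `(algorithm, mean accuracy)` row per algorithm; any other question about the stored runs becomes another query over the same table.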
An XCT image database system

International Nuclear Information System (INIS)

Komori, Masaru; Minato, Kotaro; Koide, Harutoshi; Hirakawa, Akina; Nakano, Yoshihisa; Itoh, Harumi; Torizuka, Kanji; Yamasaki, Tetsuo; Kuwahara, Michiyoshi.

1984-01-01

In this paper, the expansion of an X-ray CT (XCT) examination history database into an XCT image database is discussed. The XCT examination history database has been constructed and used for daily examination and investigation in our hospital. This database consists of alphanumeric information (locations, diagnosis and so on) for more than 15,000 cases, and for some of them we add tree-structured image data, which is flexible enough to accommodate various types of image data. This database system is written in the MUMPS database manipulation language. (author)
The Danish fetal medicine database

DEFF Research Database (Denmark)

Ekelund, Charlotte Kvist; Kopp, Tine Iskov; Tabor, Ann

2016-01-01

trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units’ Astraia databases to the central database via...... analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database...
Sequence tagging reveals unexpected modifications in toxicoproteomics

Science.gov (United States)

Dasari, Surendra; Chambers, Matthew C.; Codreanu, Simona G.; Liebler, Daniel C.; Collins, Ben C.; Pennington, Stephen R.; Gallagher, William M.; Tabb, David L.

2010-01-01

Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications. PMID:21214251
National database

DEFF Research Database (Denmark)

Kristensen, Helen Grundtvig; Stjernø, Henrik

1995-01-01

Article about a national database for nursing research established at the Danish Institute for Health and Nursing Research. The aim of the database is to gather knowledge about research and development activities within nursing.
Update History of This Database - AcEST | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Update History of This Database (AcEST): 2013/01/10 Errors found on AcEST ... ...Contig data have been corrected. 2010/03/29 AcEST English archi...
The Danish Testicular Cancer database.

Science.gov (United States)

Daugaard, Gedske; Kier, Maria Gry Gundgaard; Bandak, Mikkel; Mortensen, Mette Saksø; Larsson, Heidi; Søgaard, Mette; Toft, Birgitte Groenkaer; Engvad, Birte; Agerbæk, Mads; Holm, Niels Vilstrup; Lauritsen, Jakob

2016-01-01

The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, that is, by identifying risk factors for relapse, toxicity related to treatment, and focusing on late effects. All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. The retrospective DaTeCa database contains detailed information with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function, lung function, etc. A questionnaire related to late effects has been conducted, which includes questions regarding social relationships, life situation, general health status, family background, diseases, symptoms, use of medication, marital status, psychosocial issues, fertility, and sexuality. TC survivors alive in October 2014 were invited to fill in this questionnaire, which includes 160 validated questions. Collection of questionnaires is still ongoing. A biobank including blood/sputum samples for future genetic analyses has been established. Samples related to both the DaTeCa and DMCG DaTeCa databases are included. The prospective DMCG DaTeCa database includes variables regarding histology, stage, prognostic group, and treatment. The DMCG DaTeCa database has existed since 2013 and is a young clinical database. It is necessary to extend the data collection in the prospective database in order to answer quality-related questions. Data from the retrospective database will be added to the prospective data. This will result in a large and very comprehensive database for future studies on TC patients.
Update History of This Database - D-HaploDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Update History of This Database (D-HaploDB): 2016/12/13 Description of the... .../orca.gen.kyushu-u.ac.jp/) is released.
Scopus database: a review.

Science.gov (United States)

Burnham, Judy F

2006-03-08

The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but the two complement each other. If a library can afford only one, the choice must be based on institutional needs.
Database principles programming performance

CERN Document Server

O'Neil, Patrick

2014-01-01

Database: Principles Programming Performance provides an introduction to the fundamental principles of database systems. This book focuses on database programming and the relationships between principles, programming, and performance. Organized into 10 chapters, this book begins with an overview of database design principles and presents a comprehensive introduction to the concepts used by a DBA. This text then provides grounding in many abstract concepts of the relational model. Other chapters introduce SQL, describing its capabilities and covering the statements and functions of the programmi

TrED: the Trichophyton rubrum Expression Database

Directory of Open Access Journals (Sweden)

Liu Tao

2007-07-01

Full Text Available Abstract Background Trichophyton rubrum is the most common dermatophyte species and the most frequent cause of fungal skin infections in humans worldwide. It is a major concern because foot and nail infections caused by this organism are extremely difficult to cure. A large set of expression data including expressed sequence tags (ESTs) and transcriptional profiles of this important fungal pathogen are now available. Careful analysis of these data can give valuable information about potential virulence factors, antigens and novel metabolic pathways. We intend to create an integrated database, TrED, to facilitate the study of dermatophytes and enhance the development of effective diagnostic and treatment strategies. Description All publicly available ESTs and expression profiles of T. rubrum during conidial germination in time-course experiments and challenged with antifungal agents are deposited in the database. In addition, comparative genomics hybridization results of 22 dermatophytic fungi strains from three genera, Trichophyton, Microsporum and Epidermophyton, are also included. ESTs are clustered and assembled to elongate the sequence length and abate redundancy. TrED provides functional analysis based on GenBank, Pfam, and KOG databases, along with KEGG pathway and GO vocabulary. It is integrated with a suite of custom web-based tools that facilitate querying and retrieving various EST properties, visualization and comparison of transcriptional profiles, and sequence-similarity searching by BLAST. Conclusion TrED is built upon a relational database, with a web interface offering analytic functions, to provide integrated access to various expression data of T. rubrum and comparative results of dermatophytes. It is intended to be a comprehensive resource and platform to assist functional genomic studies in dermatophytes. TrED is available from URL: http://www.mgc.ac.cn/TrED/.
Network and Database Security: Regulatory Compliance, Network, and Database Security - A Unified Process and Goal

Directory of Open Access Journals (Sweden)

Errol A. Blake

2007-12-01

Full Text Available Database security has evolved; data security professionals have developed numerous techniques and approaches to assure data confidentiality, integrity, and availability. This paper will show that Traditional Database Security, which has focused primarily on creating user accounts and managing user privileges to database objects, is not enough to protect data confidentiality, integrity, and availability. This paper, a compilation of different journals, articles and classroom discussions, will focus on unifying the process of securing data or information, whether it is in use, in storage or being transmitted. Promoting a change in Database Curriculum Development trends may also play a role in helping secure databases. This paper will take the approach that making a conscientious effort to unify the Database Security process, which includes the Database Management System (DBMS) selection process, following regulatory compliance requirements, analyzing and learning from the mistakes of others, implementing Networking Security Technologies, and securing the database, may prevent database breaches.
Hazard Analysis Database Report

CERN Document Server

Grams, W H

2000-01-01

The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U.S. Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3, 4, and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from t...
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

Science.gov (United States)

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene
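The gene-pair orientations named in this record (co-directional, convergent, divergent) and the distance between transcription start sites can be sketched from strand and coordinate information alone. This is a minimal illustration using the conventional definitions; the `Gene` record, its field names, and the example coordinates are hypothetical and do not reflect LCGbase's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Minimal gene record (hypothetical fields)."""
    name: str
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"


def classify_pair(left, right):
    """Classify two adjacent genes; `left` has the smaller coordinates."""
    orientation = (left.strand, right.strand)
    if orientation in (("+", "+"), ("-", "-")):
        return "co-directional"  # both transcribed in the same direction
    if orientation == ("+", "-"):
        return "convergent"      # transcribed towards each other
    if orientation == ("-", "+"):
        return "divergent"       # transcribed away from each other
    raise ValueError("strand must be '+' or '-'")


def tss(gene):
    """Transcription start site: 5' end of the gene on its own strand."""
    return gene.start if gene.strand == "+" else gene.end


def tss_distance(left, right):
    """Absolute distance between the two transcription start sites."""
    return abs(tss(left) - tss(right))


# Invented coordinates, for illustration only.
g1 = Gene("gene_1", 100, 500, "-")
g2 = Gene("gene_2", 800, 1500, "+")

pair_type = classify_pair(g1, g2)
distance = tss_distance(g1, g2)
```

A user-adjustable cutoff on `distance` would correspond to the flexible parameter definitions mentioned in the abstract.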
The Structure-Function Linkage Database.

Science.gov (United States)

Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E; Barber, Alan E; Custer, Ashley F; Hicks, Michael A; Huang, Conrad C; Lauck, Florian; Mashiyama, Susan T; Meng, Elaine C; Mischel, David; Morris, John H; Ojha, Sunil; Schnoes, Alexandra M; Stryke, Doug; Yunes, Jeffrey M; Ferrin, Thomas E; Holliday, Gemma L; Babbitt, Patricia C

2014-01-01

The Structure-Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure-function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies 'look alike', making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.
Animal Detection in Natural Images: Effects of Color and Image Database

Science.gov (United States)

Zhu, Weina; Drewes, Jan; Gegenfurtner, Karl R.

2013-01-01

The visual system has a remarkable ability to extract categorical information from complex natural scenes. In order to elucidate the role of low-level image features for the recognition of objects in natural scenes, we recorded saccadic eye movements and event-related potentials (ERPs) in two experiments, in which human subjects had to detect animals in previously unseen natural images. We used a new natural image database (ANID) that is free of some of the potential artifacts that have plagued the widely used COREL images. Color and grayscale images picked from the ANID and COREL databases were used. In all experiments, color images induced a greater N1 EEG component at earlier time points than grayscale images. We suggest that this influence of color in animal detection may be masked by later processes when measuring reaction times. The ERP results of go/nogo and forced choice tasks were similar to those reported earlier. The non-animal stimuli induced bigger N1 than animal stimuli both in the COREL and ANID databases. This result indicates ultra-fast processing of animal images is possible irrespective of the particular database. With the ANID images, the difference between color and grayscale images is more pronounced than with the COREL images. The earlier use of the COREL images might have led to an underestimation of the contribution of color. Therefore, we conclude that the ANID image database is better suited for the investigation of the processing of natural scenes than other databases commonly used. PMID:24130744
Animal detection in natural images: effects of color and image database.

Directory of Open Access Journals (Sweden)

Weina Zhu

Full Text Available The visual system has a remarkable ability to extract categorical information from complex natural scenes. In order to elucidate the role of low-level image features for the recognition of objects in natural scenes, we recorded saccadic eye movements and event-related potentials (ERPs) in two experiments, in which human subjects had to detect animals in previously unseen natural images. We used a new natural image database (ANID) that is free of some of the potential artifacts that have plagued the widely used COREL images. Color and grayscale images picked from the ANID and COREL databases were used. In all experiments, color images induced a greater N1 EEG component at earlier time points than grayscale images. We suggest that this influence of color in animal detection may be masked by later processes when measuring reaction times. The ERP results of go/nogo and forced choice tasks were similar to those reported earlier. The non-animal stimuli induced bigger N1 than animal stimuli both in the COREL and ANID databases. This result indicates ultra-fast processing of animal images is possible irrespective of the particular database. With the ANID images, the difference between color and grayscale images is more pronounced than with the COREL images. The earlier use of the COREL images might have led to an underestimation of the contribution of color. Therefore, we conclude that the ANID image database is better suited for the investigation of the processing of natural scenes than other databases commonly used.
Database for propagation models

Science.gov (United States)

Kantak, Anil V.

1991-07-01

A propagation researcher or a systems engineer who intends to use the results of a propagation experiment is generally faced with various database tasks such as the selection of the computer software, the hardware, and the writing of the programs to pass the data through the models of interest. This task is repeated every time a new experiment is conducted or the same experiment is carried out at a different location generating different data. Thus the users of this data have to spend a considerable portion of their time learning how to implement the computer hardware and the software towards the desired end. This situation may be eased considerably if an easily accessible propagation database is created that has all the accepted (standardized) propagation phenomena models approved by the propagation research community. Also, the handling of data will become easier for the user. Such a database construction can only stimulate the growth of propagation research if it is available to all researchers, so that the results of an experiment conducted by one researcher can be examined independently by another, without different hardware and software being used. The database may be made flexible so that the researchers need not be confined only to the contents of the database. Another way in which the database may help researchers is that they will not have to document the software and hardware tools used in their research, since the propagation research community will know the database already. The following sections show a possible database construction, as well as properties of the database for propagation research.
QSAR and docking studies of anthraquinone derivatives by similarity cluster prediction.

Science.gov (United States)

Harsa, Alexandra M; Harsa, Teodora E; Diudea, Mircea V

2016-01-01

Forty anthraquinone derivatives were downloaded from the PubChem database and investigated in a quantitative structure-activity relationship (QSAR) study. The models describing log P and LD50 of this set were built on the hypermolecule scheme, which mimics the investigated receptor space; the models were validated by the leave-one-out procedure, on the external test set, and by a new version of prediction using similarity clusters. Molecular docking using the Lamarckian Genetic Algorithm was performed on this class of anthraquinones with respect to the 3Q3B receptor. The best-scoring molecules in the docking assay were used as leaders in the similarity clustering procedure. It is demonstrated that the LD50 data of this set of anthraquinones are related to the binding energies of anthraquinone ligands to the 3Q3B receptor.
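The leave-one-out validation named in this abstract can be sketched generically. The example below is not the authors' hypermolecule model: it substitutes a one-descriptor least-squares line as a stand-in and computes the cross-validated Q² from the predictive residual sum of squares (PRESS). The function names and any data passed in are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumes the x values are not all identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total
```

A Q² close to 1 indicates that each held-out compound is predicted well by a model trained on the others; a multi-descriptor model would replace `fit_line` without changing the loop.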
Update History of This Database - TP Atlas | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available TP Atlas: Update History of This Database. Date and update contents: 2013/12/16 The email address in the contact information is corrected. 2013/11/19 TP Atlas English archive site is opened. 2008/4/1 TP Atlas ( http://www.tanpaku.org/tpatlas/ ) is opened.
Update History of This Database - KEGG MEDICUS | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available KEGG MEDICUS: Update History of This Database. Date and update contents: 2014/05/09 KEGG MEDICUS English archive site is opened. 2010/10/01 KEGG MEDICUS ( http://www.kegg.jp/kegg/medicus/ ) is opened.
Coordinating Mobile Databases: A System Demonstration

OpenAIRE

Zaihrayeu, Ilya; Giunchiglia, Fausto

2004-01-01

In this paper we present the Peer Database Management System (PDBMS). This system runs on top of the standard database management system, and it allows it to connect its database with other (peer) databases on the network. A particularity of our solution is that PDBMS allows for conventional database technology to be effectively operational in mobile settings. We think of database mobility as a database network, where databases appear and disappear spontaneously and their network access point...
Danish Urogynaecological Database

DEFF Research Database (Denmark)

Hansen, Ulla Darling; Gradel, Kim Oren; Larsen, Michael Due

2016-01-01

The Danish Urogynaecological Database is established in order to ensure high quality of treatment for patients undergoing urogynecological surgery. The database contains details of all women in Denmark undergoing incontinence surgery or pelvic organ prolapse surgery amounting to ~5,200 procedures... complications if relevant, implants used if relevant, 3-6-month postoperative recording of symptoms, if any. A set of clinical quality indicators is being maintained by the steering committee for the database and is published in an annual report which also contains extensive descriptive statistics. The database has a completeness of over 90% of all urogynecological surgeries performed in Denmark. Some of the main variables have been validated using medical records as gold standard. The positive predictive value was above 90%. The data are used as a quality monitoring tool by the hospitals and in a number...
RxnFinder: biochemical reaction search engines using molecular structures, molecular fragments and reaction similarity.

Science.gov (United States)

Hu, Qian-Nan; Deng, Zhe; Hu, Huanan; Cao, Dong-Sheng; Liang, Yi-Zeng

2011-09-01

Biochemical reactions play a key role in sustaining life and allowing cells to grow. RxnFinder was developed to search biochemical reactions in the KEGG reaction database using three search criteria: molecular structures, molecular fragments and reaction similarity. RxnFinder helps users find reference reactions for biosynthesis and xenobiotics metabolism. RxnFinder is freely available via: http://sdd.whu.edu.cn/rxnfinder. Contact: qnhu@whu.edu.cn.
LandIT Database

DEFF Research Database (Denmark)

Iftikhar, Nadeem; Pedersen, Torben Bach

2010-01-01

...and reporting purposes. This paper presents the LandIT database, which is the result of the LandIT project, an industrial collaboration project that developed technologies for communication and data integration between farming devices and systems. The LandIT database is in principle based on the ISOBUS standard; however, the standard is extended with additional requirements, such as gradual data aggregation and flexible exchange of farming data. This paper describes the conceptual and logical schemas of the proposed database based on a real-life farming case study.
Birds of a feather sit together: physical similarity predicts seating choice.

Science.gov (United States)

Mackinnon, Sean P; Jordan, Christian H; Wilson, Anne E

2011-07-01

Across four studies, people sat (or reported they would sit) closer to physically similar others. Study 1 revealed significant aggregation in seating patterns on two easily observed characteristics: glasses wearing and sex. Study 2 replicated this finding with a wider variety of physical traits: race, sex, glasses wearing, hair length, and hair color. The overall tendency for people to sit beside physically similar others remained significant when controlling for sex and race, suggesting people aggregate on physical dimensions other than broad social categories. Study 3 conceptually replicated these results in a laboratory setting. The more physically similar participants were to a confederate, the closer they sat before an anticipated interaction when controlling for sex, race, and attractiveness similarity. In Study 4, overall physical similarity and glasses wearing similarity predicted self-reported seating distance. These effects were mediated by perceived attitudinal similarity. Liking and inferred acceptance also received support as mediators for glasses wearing similarity. © 2011 by the Society for Personality and Social Psychology, Inc
Verification of road databases using multiple road models

Science.gov (United States)

Ziems, Marcel; Rottensteiner, Franz; Heipke, Christian

2017-08-01

In this paper a new approach for automatic road database verification based on remote sensing images is presented. In contrast to existing methods, the applicability of the new approach is not restricted to specific road types, context areas or geographic regions. This is achieved by combining several state-of-the-art road detection and road verification approaches that work well under different circumstances. Each one serves as an independent module representing a unique road model and a specific processing strategy. All modules provide independent solutions for the verification problem of each road object stored in the database in the form of two probability distributions, the first one for the state of a database object (correct or incorrect), and a second one for the state of the underlying road model (applicable or not applicable). In accordance with the Dempster-Shafer Theory, both distributions are mapped to a new state space comprising the classes correct, incorrect and unknown. Statistical reasoning is applied to obtain the optimal state of a road object. A comparison with state-of-the-art road detection approaches using benchmark datasets shows that in general the proposed approach provides results with larger completeness. Additional experiments reveal that based on the proposed method a highly reliable semi-automatic approach for road database verification can be designed.
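The mapping to the classes correct, incorrect and unknown, followed by fusion of several modules, can be sketched with Dempster's rule of combination. This is a minimal illustration and not the authors' implementation: the way `to_mass` discounts a module's verdict by its model applicability is an assumption made for this example, and both function names are hypothetical.

```python
def to_mass(p_correct, p_applicable):
    """Map one module's two probabilities to masses over three classes.

    Assumption for this sketch: belief in the module's verdict is scaled
    by the probability that its road model applies, and the remainder is
    assigned to "unknown" (the whole frame {correct, incorrect}).
    """
    return {
        "correct": p_correct * p_applicable,
        "incorrect": (1.0 - p_correct) * p_applicable,
        "unknown": 1.0 - p_applicable,
    }


def combine(m1, m2):
    """Dempster's rule of combination for the frame {correct, incorrect}."""
    # Conflict: one module supports "correct" while the other supports "incorrect".
    conflict = m1["correct"] * m2["incorrect"] + m1["incorrect"] * m2["correct"]
    if conflict >= 1.0:
        raise ValueError("modules are in total conflict")
    norm = 1.0 - conflict
    fused = {}
    for state in ("correct", "incorrect"):
        fused[state] = (
            m1[state] * m2[state]
            + m1[state] * m2["unknown"]
            + m1["unknown"] * m2[state]
        ) / norm
    fused["unknown"] = m1["unknown"] * m2["unknown"] / norm
    return fused
```

A module whose road model does not apply contributes all of its mass to "unknown" and therefore leaves the other modules' evidence unchanged, which matches the motivation for combining modules that work well under different circumstances.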
Network-based Database Course

DEFF Research Database (Denmark)

Nielsen, J.N.; Knudsen, Morten; Nielsen, Jens Frederik Dalsgaard

A course in database design and implementation has been designed, utilizing existing network facilities. The course is an elementary course for students of computer engineering. Its purpose is to give the students theoretical database knowledge as well as practical experience with design and implementation. A tutorial relational database and the students' self-designed databases are implemented on the UNIX system of Aalborg University, thus giving the teacher the possibility of live demonstrations in the lecture room, and the students the possibility of interactive learning in their working rooms...
Visual similarity in short-term recall for where and when.

Science.gov (United States)

Jalbert, Annie; Saint-Aubin, Jean; Tremblay, Sébastien

2008-03-01

Two experiments examined the effects of visual similarity on short-term recall for where and when in the visual spatial domain. A series of squares of similar or dissimilar colours were serially presented at various locations on the screen. At recall, all coloured squares were simultaneously presented in a random order at the bottom of the screen, and the locations used for presentation were indicated by white squares. Participants were asked to place the colours at their appropriate location in their presentation order. Performance for location (where) and order (when) was assessed separately. Results revealed that similarity severely hinders both memory for what was where and memory for what was when, under quiet and articulatory suppression conditions. These results provide further evidence that similarity has a major impact on processing relational information in memory.
Predicting drug-target interaction for new drugs using enhanced similarity measures and super-target clustering.

Science.gov (United States)

Shi, Jian-Yu; Yiu, Siu-Ming; Li, Yiming; Leung, Henry C M; Chin, Francis Y L

2015-07-15

Predicting drug-target interaction using computational approaches is an important step in drug discovery and repositioning. To predict whether there will be an interaction between a drug and a target, most existing methods identify similar drugs and targets in the database. The prediction is then made based on the known interactions of these drugs and targets. This idea is promising. However, there are two shortcomings that have not yet been addressed appropriately. Firstly, most of the methods only use 2D chemical structures and protein sequences to measure the similarity of drugs and targets respectively. However, this information may not fully capture the characteristics determining whether a drug will interact with a target. Secondly, there are very few known interactions, i.e. many interactions are "missing" in the database. Existing approaches are biased towards known interactions and have no good solutions to handle possibly missing interactions which affect the accuracy of the prediction. In this paper, we enhance the similarity measures to include non-structural (and non-sequence-based) information and introduce the concept of a "super-target" to handle the problem of possibly missing interactions. Based on evaluations on real data, we show that our similarity measure is better than the existing measures and our approach is able to achieve higher accuracy than the two best existing algorithms, WNN-GIP and KBMF2K. Our approach is available at http://web.hku.hk/~liym1018/projects/drug/drug.html or http://www.bmlnwpu.org/us/tools/PredictingDTI_S2/METHODS.html. Copyright © 2015 Elsevier Inc. All rights reserved.

Migration Between NoSQL Databases

OpenAIRE

Opačak, Damir

2013-01-01

The thesis discusses the differences and, consequently, potential problems that may arise when migrating between different types of NoSQL databases. The first chapters introduce the reader to the issues of relational databases and present the beginnings of NoSQL databases. The following chapters present different types of NoSQL databases and some of their representatives with the aim to show specific features of NoSQL databases and the fact that each of them was developed to solve specifi...
The volatile compound BinBase mass spectral database.

Science.gov (United States)

Skogerson, Kirsten; Wohlgemuth, Gert; Barupal, Dinesh K; Fiehn, Oliver

2011-08-04

Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind. The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, comprising currently 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as free download under the CC-BY agreement (http://vocbinbase.fiehnlab.ucdavis.edu). The Bin
The volatile compound BinBase mass spectral database

Directory of Open Access Journals (Sweden)

Barupal Dinesh K

2011-08-01

Full Text Available Abstract Background Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind. Description The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, comprising currently 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as free download under the CC-BY agreement (http
Update History of This Database - Q-TARO | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Q-TARO: Update History of This Database. Date and update contents: 2014/10/20 The URL of the portal site is changed. 2013/12/17 The URL of the portal site is changed. 2013/12/13 Q-TARO English archive site is opened. 2009/11/15 Q-TARO ( http://qtaro.abr.affrc.go.jp/ ) is opened.
H-Y Antigen Incompatibility Not Associated with Adverse Immunologic Graft Outcomes: Deceased Donor Pair Analysis of the OPTN Database

Directory of Open Access Journals (Sweden)

Douglas Scott Keith

2011-01-01

Full Text Available Background. H-Y antigen incompatibility adversely impacts bone marrow transplants; however, the relevance of these antigens in kidney transplantation is uncertain. Three previous retrospective studies of kidney transplant databases have produced conflicting results. Methods. This study analyzed the Organ Procurement and Transplantation Network database between 1997 and 2009 using male deceased donor kidney transplant pairs in which the recipient genders were discordant. Death censored graft survival at six months, five years, and ten years, treated acute rejection at six months and one year, and rates of graft failure by cause were the primary endpoints analyzed. Results. Death censored graft survival at six months was significantly worse for female recipients. Analysis of the causes of graft failure at six months revealed that the difference in death censored graft survival was due primarily to nonimmunologic graft failures. The adjusted and unadjusted death censored graft survivals at five and ten years were similar between the two genders as were the rates of immunologic graft failure. No difference in the rates of treated acute rejection at six months and one year was seen between the two genders. Conclusions. Male donor to female recipient discordance had no discernible effect on immunologically mediated kidney graft outcomes in the era of modern immunosuppression.
Overview of Historical Earthquake Document Database in Japan and Future Development

Science.gov (United States)

Nishiyama, A.; Satake, K.

2014-12-01

In Japan, damage and disasters from large historical earthquakes have been documented and preserved. Compilation of historical earthquake documents started in the early 20th century, and 33 volumes of historical document source books (about 27,000 pages) have been published. However, researchers cannot use these source books effectively because they are contaminated with low-reliability historical records and are difficult to search by keyword, such as by characters and dates. To overcome these problems and to promote historical earthquake studies in Japan, construction of a text database started in the 21st century. For historical earthquakes from the beginning of the 7th century to the early 17th century, the "Online Database of Historical Documents in Japanese Earthquakes and Eruptions in the Ancient and Medieval Ages" (Ishibashi, 2009) has already been constructed. Its compilers investigated the source books or original texts of historical literature, emended the descriptions, and assigned a reliability to each historical document on the basis of the age in which it was written. Another database compiled the historical documents for seven damaging earthquakes that occurred along the Sea of Japan coast in Honshu, central Japan, in the Edo period (from the beginning of the 17th century to the middle of the 19th century) and provides a text database and a seismic intensity database. These are now available on the web (written only in Japanese). However, only about 9% of the earthquake source books have been digitized so far. Therefore, we plan to digitize all of the remaining historical documents through a research program that started in 2014. The specification of the database will be similar to that of the previous ones. We also plan to combine this database with a liquefaction traces database, which will be constructed by another research program, by adding the location information described in historical documents. The constructed database would be utilized to estimate the distributions of seismic intensities and tsunami
Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

CERN Document Server

Dykstra, David

2012-01-01

One of the main attractions of non-relational "NoSQL" databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also has high scalability and wide-area distributability for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
Towards cloud-centric distributed database evaluation

OpenAIRE

Seybold, Daniel

2016-01-01

The area of cloud computing has also pushed the evolution of distributed databases, resulting in a variety of distributed database systems, which can be classified into relational, NoSQL and NewSQL database systems. In general, all representatives of these database system classes claim to provide elasticity and "unlimited" horizontal scalability. As these characteristics comply with the cloud, distributed databases seem to be a perfect match for Database-as-a-Service systems (DBaaS).
Unidirectional Replication in Heterogeneous Databases (REPLIKASI UNIDIRECTIONAL PADA HETEROGEN DATABASE)

OpenAIRE

Hendro Nindito; Evaristus Didik Madyatmadja; Albert Verasius Dian Sano

2013-01-01

The use of diverse database technology in enterprises today cannot be avoided. Thus, technology is needed to generate information in real time. The purpose of this research is to discuss a database replication technology that can be applied in heterogeneous database environments. In this study we replicate from a Windows-based MS SQL Server database to a Linux-based Oracle database as the target. The research method used is prototyping, in which development can be done quickly and testing of working models of the...
The Danish Testicular Cancer database

DEFF Research Database (Denmark)

Daugaard, Gedske; Kier, Maria Gry Gundgaard; Bandak, Mikkel

2016-01-01

AIM: The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, that is, by identifying risk factors for relapse, toxicity related to treatment, and focusing on late effects. STUDY POPULATION: All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. MAIN VARIABLES AND DESCRIPTIVE DATA: The retrospective DaTeCa database contains detailed information with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function...
Drug target identification using side-effect similarity

DEFF Research Database (Denmark)

Campillos, Monica; Kuhn, Michael; Gavin, Anne-Claude

2008-01-01

Targets for drugs have so far been predicted on the basis of molecular or cellular features, for example, by exploiting similarity in chemical structure or in activity across cell lines. We used phenotypic side-effect similarities to infer whether two drugs share a target. Applied to 746 marketed drugs, a network of 1018 side effect-driven drug-drug relations became apparent, 261 of which are formed by chemically dissimilar drugs from different therapeutic indications. We experimentally tested 20 of these unexpected drug-drug relations and validated 13 implied drug-target relations by in vitro binding assays, of which 11 reveal inhibition constants equal to less than 10 micromolar. Nine of these were tested and confirmed in cell assays, documenting the feasibility of using phenotypic information to infer molecular interactions and hinting at new uses of marketed drugs.
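The idea of inferring drug-drug relations from shared side effects can be illustrated with a simple set-overlap score. The sketch below uses the plain Jaccard index as a stand-in; it is not the similarity measure of the study, and the function names, drug labels, and threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(side_effects, threshold):
    """Return (drug_1, drug_2, score) for pairs scoring at least `threshold`.

    `side_effects` maps a drug label to the set of its reported side effects.
    Pairs are returned from most to least similar.
    """
    pairs = []
    for d1, d2 in combinations(sorted(side_effects), 2):
        score = jaccard(side_effects[d1], side_effects[d2])
        if score >= threshold:
            pairs.append((d1, d2, score))
    return sorted(pairs, key=lambda pair: -pair[2])
```

Pairs that score highly here but are chemically dissimilar would be the analogue of the unexpected relations that the abstract reports testing experimentally.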
Unidirectional Replication in Heterogeneous Databases (Replikasi Unidirectional pada Heterogen Database)

Directory of Open Access Journals (Sweden)

Hendro Nindito

2013-12-01

Full Text Available The use of diverse database technology in enterprises today cannot be avoided. Thus, technology is needed to generate information in real time. The purpose of this research is to discuss a database replication technology that can be applied in heterogeneous database environments. In this study we replicate from a Windows-based MS SQL Server database to a Linux-based Oracle database as the target. The research method used is prototyping, in which development can be done quickly and working models of the interaction process are tested repeatedly. The research shows that database replication technology using Oracle GoldenGate can be applied in heterogeneous environments in real time as well.
Introduction of the American Academy of Facial Plastic and Reconstructive Surgery FACE TO FACE Database.

Science.gov (United States)

Abraham, Manoj T; Rousso, Joseph J; Hu, Shirley; Brown, Ryan F; Moscatello, Augustine L; Finn, J Charles; Patel, Neha A; Kadakia, Sameep P; Wood-Smith, Donald

2017-07-01

The American Academy of Facial Plastic and Reconstructive Surgery FACE TO FACE database was created to gather and organize patient data primarily from international humanitarian surgical mission trips, as well as local humanitarian initiatives. Similar to cloud-based Electronic Medical Records, this web-based user-generated database allows for more accurate tracking of provider and patient information and outcomes, regardless of site, and is useful when coordinating follow-up care for patients. The database is particularly useful on international mission trips as there are often different surgeons who may provide care to patients on subsequent missions, and patients who may visit more than 1 mission site. Ultimately, by pooling data across multiple sites and over time, the database has the potential to be a useful resource for population-based studies and outcome data analysis. The objective of this paper is to delineate the process involved in creating the AAFPRS FACE TO FACE database, to assess its functional utility, to draw comparisons to electronic medical records systems that are now widely implemented, and to explain the specific benefits and disadvantages of the use of the database as it was implemented on recent international surgical mission trips.
Development of a data entry auditing protocol and quality assurance for a tissue bank database.

Science.gov (United States)

Khushi, Matloob; Carpenter, Jane E; Balleine, Rosemary L; Clarke, Christine L

2012-03-01

Human transcription error is an acknowledged risk when extracting information from paper records for entry into a database. For a tissue bank, it is critical that accurate data are provided to researchers with approved access to tissue bank material. The challenges of tissue bank data collection include manual extraction of data from complex medical reports that are accessed from a number of sources and that differ in style and layout. As a quality assurance measure, the Breast Cancer Tissue Bank (http://www.abctb.org.au) has implemented an auditing protocol and in order to efficiently execute the process, has developed an open source database plug-in tool (eAuditor) to assist in auditing of data held in our tissue bank database. Using eAuditor, we have identified that human entry errors range from 0.01% when entering donor's clinical follow-up details, to 0.53% when entering pathological details, highlighting the importance of an audit protocol tool such as eAuditor in a tissue bank database. eAuditor was developed and tested on the Caisis open source clinical-research database; however, it can be integrated in other databases where similar functionality is required.
Danish clinical databases: An overview

DEFF Research Database (Denmark)

Green, Anders

2011-01-01

Clinical databases contain data related to diagnostic procedures, treatments and outcomes. In 2001, a scheme was introduced for the approval, supervision and support of clinical databases in Denmark.
Dictionary as Database.

Science.gov (United States)

Painter, Derrick

1996-01-01

Discussion of dictionaries as databases focuses on the digitizing of the Oxford English Dictionary (OED) and the use of Standard Generalized Markup Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)
POSTER: Privacy-Preserving Profile Similarity Computation in Online Social Networks

NARCIS (Netherlands)

Jeckmans, Arjan; Tang, Qiang; Hartel, Pieter H.

2011-01-01

Currently, none of the existing online social networks (OSNs) enables its users to make new friends without revealing their private information. This leaves the users in a vulnerable position when searching for new friends. We propose a solution which enables a user to compute her profile similarity
Examining database persistence of ISO/EN 13606 standardized electronic health record extracts: relational vs. NoSQL approaches.

Science.gov (United States)

Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario

2017-08-18

The objective of this research is to compare relational and non-relational (NoSQL) database systems for storing, recovering, querying and persisting standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes were created in order to evaluate and compare the response times (algorithmic complexity) of six queries of increasing complexity. Comparable results available in the literature were also considered. Both relational and NoSQL database systems show almost linear algorithmic complexity in query execution. However, their linear slopes differ greatly, the slope for the relational database being much steeper than those for the two NoSQL databases. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely large (secondary use, research applications). Document-based NoSQL databases in general perform better than native XML NoSQL databases. Visualization and editing of EHR extracts are also document-based tasks better suited to NoSQL database systems. However, the appropriate database solution depends heavily on each particular situation and specific problem.
INIST: databases reorientation

International Nuclear Information System (INIS)

Bidet, J.C.

1995-01-01

INIST is a CNRS (Centre National de la Recherche Scientifique) laboratory devoted to the processing of scientific and technical information and to the management of this information compiled in a database. A reorientation of the database content was proposed in 1994 to increase the transfer of research towards enterprises and services, to develop more automated access to the information, and to create a quality assurance plan. The catalog of publications comprises 5800 periodical titles (1300 for fundamental research and 4500 for applied research). A science and technology multi-thematic database will be created in 1995 for the retrieval of applied and technical information. 'Grey literature' (reports, theses, proceedings, etc.) and human and social sciences data will be added to the database using information selected from the existing GRISELI and Francis databases. Substantial modifications are also planned in the thematic coverage of Earth sciences and will considerably reduce the geological information content. (J.S.). 1 tab

Estimating the annotation error rate of curated GO database sequence annotations

Directory of Open Access Journals (Sweden)

Brown Alfred L

2007-05-01

Full Text Available Abstract Background Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%. Conclusion While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.
The Danish Testicular Cancer database

Directory of Open Access Journals (Sweden)

Daugaard G

2016-10-01

Full Text Available Gedske Daugaard,1 Maria Gry Gundgaard Kier,1 Mikkel Bandak,1 Mette Saksø Mortensen,1 Heidi Larsson,2 Mette Søgaard,2 Birgitte Groenkaer Toft,3 Birte Engvad,4 Mads Agerbæk,5 Niels Vilstrup Holm,6 Jakob Lauritsen1 1Department of Oncology 5073, Copenhagen University Hospital, Rigshospitalet, Copenhagen, 2Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, 3Department of Pathology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, 4Department of Pathology, Odense University Hospital, Odense, 5Department of Oncology, Aarhus University Hospital, Aarhus, 6Department of Oncology, Odense University Hospital, Odense, Denmark Aim: The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, that is, by identifying risk factors for relapse, toxicity related to treatment, and focusing on late effects. Study population: All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. Main variables and descriptive data: The retrospective DaTeCa database contains detailed information with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function, lung function, etc. A questionnaire related to late effects has been conducted, which includes questions regarding social relationships, life situation, general health status, family background, diseases, symptoms, use of medication, marital status, psychosocial issues, fertility, and sexuality. TC survivors alive in October 2014 were invited to fill in this questionnaire including 160 validated questions
Self-similar structure in the distribution and density of the partition function zeros

International Nuclear Information System (INIS)

Huang, M.-C.; Luo, Y.-P.; Liaw, T.-M.

2003-01-01

Based on the knowledge of the partition function zeros for the cell-decorated triangular Ising model, we analyze the similar structures contained in the distribution pattern and density function of the zeros. The two share the same symmetries, and the emergence of the similar structure on the way toward the infinite decoration level is exhibited explicitly. The distinct features of the formation of the self-similar structure revealed by this model may be quite general
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

Science.gov (United States)

Dykstra, Dave

2012-12-01

One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

International Nuclear Information System (INIS)

Dykstra, Dave

2012-01-01

One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

Energy Technology Data Exchange (ETDEWEB)

Dykstra, Dave [Fermilab

2012-07-20

One of the main attractions of non-relational NoSQL databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES

Directory of Open Access Journals (Sweden)

Hugo Leonardo Pereira Rufino

2016-04-01

Full Text Available Most classification tools assume that the class distribution is balanced or that misclassification costs are similar. Nevertheless, in practice, databases with unbalanced classes are commonplace, such as in the diagnosis of diseases, in which confirmed cases are usually rare compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, misclassifying a member of the minority class (for instance, diagnosing a person with cancer as healthy) may have more serious consequences than misclassifying a member of the majority class. Therefore, it is important to treat databases in which unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even if there is a high level of imbalance between classes. To demonstrate its efficiency, it was compared with the main algorithms for classification problems involving unbalanced data. The approach was successful in nearly all tested databases
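For orientation, the oversampling idea behind the SMOTE family can be sketched as follows. This is the generic interpolation step commonly associated with SMOTE, written here as an assumption-laden illustration; it is not the SMOTE_Easy algorithm, whose details the abstract does not give. The sample points and the parameters `k` and `seed` are hypothetical.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority-class samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

The synthetic points would be appended to the minority class before training a classifier, so that the learner sees a less skewed class distribution.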
Development of database on the distribution coefficient. 2. Preparation of database

Energy Technology Data Exchange (ETDEWEB)

Takebe, Shinichi; Abe, Masayoshi [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment

2001-03-01

The distribution coefficient is a very important parameter for environmental impact assessment of the disposal of radioactive waste arising from research institutes. The 'Database on the Distribution Coefficient' was built from information obtained through a domestic literature survey covering items such as the value, measuring method and measurement conditions of the distribution coefficient, in order to support selection of a reasonable distribution coefficient value for use in safety evaluation. This report outlines the preparation of this database and serves as a user guide to the database. (author)
Balkan Vegetation Database

NARCIS (Netherlands)

Vassilev, Kiril; Pedashenko, Hristo; Alexandrova, Alexandra; Tashev, Alexandar; Ganeva, Anna; Gavrilova, Anna; Gradevska, Asya; Assenov, Assen; Vitkova, Antonina; Grigorov, Borislav; Gussev, Chavdar; Filipova, Eva; Aneva, Ina; Knollová, Ilona; Nikolov, Ivaylo; Georgiev, Georgi; Gogushev, Georgi; Tinchev, Georgi; Pachedjieva, Kalina; Koev, Koycho; Lyubenova, Mariyana; Dimitrov, Marius; Apostolova-Stoyanova, Nadezhda; Velev, Nikolay; Zhelev, Petar; Glogov, Plamen; Natcheva, Rayna; Tzonev, Rossen; Boch, Steffen; Hennekens, Stephan M.; Georgiev, Stoyan; Stoyanov, Stoyan; Karakiev, Todor; Kalníková, Veronika; Shivarov, Veselin; Russakova, Veska; Vulchev, Vladimir

2016-01-01

The Balkan Vegetation Database (BVD; GIVD ID: EU-00-019; http://www.givd.info/ID/EU-00-019) is a regional database that consists of phytosociological relevés from different vegetation types from six countries on the Balkan Peninsula (Albania, Bosnia and Herzegovina, Bulgaria, Kosovo, Montenegro
Random vs. systematic sampling from administrative databases involving human subjects.

Science.gov (United States)

Hagino, C; Lo, R J

1998-09-01

Two sampling techniques, simple random sampling (SRS) and systematic sampling (SS), were compared to determine whether they yield similar and accurate distributions for the following four factors: age, gender, geographic location and years in practice. Any point estimate within 7 yr or 7 percentage points of its reference standard (SRS or the entire data set, i.e., the target population) was considered "acceptably similar" to the reference standard. The sampling frame was from the entire membership database of the Canadian Chiropractic Association. The two sampling methods were tested using eight different sample sizes of n (50, 100, 150, 200, 250, 300, 500, 800). From the profile/characteristics, summaries of four known factors [gender, average age, number (%) of chiropractors in each province and years in practice], between- and within-methods chi-square tests and unpaired t tests were performed to determine whether any of the differences [descriptively greater than 7% or 7 yr] were also statistically significant. The strengths of the agreements between the provincial distributions were quantified by calculating the percent agreements for each (provincial pairwise-comparison methods). Any percent agreement less than 70% was judged to be unacceptable. Our assessments of the two sampling methods (SRS and SS) for the different sample sizes tested suggest that SRS and SS yielded acceptably similar results. Both methods started to yield "correct" sample profiles at approximately the same sample size (n > 200). SS is not only convenient, it can be recommended for sampling from large databases in which the data are listed without any inherent order biases other than alphabetical listing by surname.
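The two sampling schemes compared above can be sketched in a few lines. This is a minimal illustration that assumes the sampling frame is an ordered list; the frame of 1000 member IDs is hypothetical, and the fixed-interval rule (k = N // n with a random start) is one common way to implement systematic sampling, not necessarily the procedure used in the study.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """SRS: draw n records without replacement, each equally likely."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """SS: pick a random start, then take every k-th record (k = N // n)."""
    k = len(frame) // n
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame of 1000 member IDs.
frame = list(range(1000))
srs = simple_random_sample(frame, 200)
ss = systematic_sample(frame, 200)
```

Because systematic sampling walks the list at a fixed interval, any periodic ordering in the frame would bias it, which is why the abstract's recommendation is limited to lists without inherent order biases.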
CHEMVAL project. Critical evaluation of the CHEMVAL thermodynamic database with respect to its contents and relevance to radioactive waste disposal at Sellafield and Dounreay

International Nuclear Information System (INIS)

Falck, W.E.

1992-01-01

This report is concerned with assessing the applicability of the CHEMVAL Thermodynamic Database (Version 3.0) to studies of radioactive waste disposal at Sellafield and Dounreay. Comparisons are drawn with similar listings produced elsewhere and suggestions made for database enhancement. The feasibility of extending the database to take into account simulations at elevated temperatures is also addressed. (author)
Update History of This Database - GenLibi | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available Update History of This Database (GenLibi). 2014/03/25: GenLibi English archive site is opened. 2007/03/01: GenLibi (http://gene.biosciencedbc.jp/) is opened.
Update History of This Database - dbQSNP | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available Update History of This Database (dbQSNP). 2017/02/16: dbQSNP English archive site is opened. 2002/10/23: dbQSNP (http://qsnp.gen.kyushu-u.ac.jp/) is opened.
A new relational database structure and online interface for the HITRAN database

Science.gov (United States)

Hill, Christian; Gordon, Iouli E.; Rothman, Laurence S.; Tennyson, Jonathan

2013-11-01

A new format for the HITRAN database is proposed. By storing the line-transition data in a number of linked tables described by a relational database schema, it is possible to overcome the limitations of the existing format, which have become increasingly apparent over the last few years as new and more varied data are being used by radiative-transfer models. Although the database in the new format can be searched using the well-established Structured Query Language (SQL), a web service, HITRANonline, has been deployed to allow users to make most common queries of the database using a graphical user interface in a web page. The advantages of the relational form of the database to ensuring data integrity and consistency are explored, and the compatibility of the online interface with the emerging standards of the Virtual Atomic and Molecular Data Centre (VAMDC) project is discussed. In particular, the ability to access HITRAN data using a standard query language from other websites, command line tools and from within computer programs is described.
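The idea of storing line-transition data in linked tables and querying it with SQL can be sketched with Python's built-in `sqlite3` module. The table names, column names and numeric values below are hypothetical and chosen only for illustration; they are not the actual HITRAN schema or HITRAN data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the link between tables
conn.executescript("""
    CREATE TABLE molecule (
        id      INTEGER PRIMARY KEY,
        formula TEXT NOT NULL UNIQUE
    );
    CREATE TABLE transition (
        id          INTEGER PRIMARY KEY,
        molecule_id INTEGER NOT NULL REFERENCES molecule(id),
        wavenumber  REAL NOT NULL,
        intensity   REAL NOT NULL
    );
""")
conn.executemany("INSERT INTO molecule VALUES (?, ?)", [(1, "H2O"), (2, "CO2")])
conn.executemany(
    "INSERT INTO transition (molecule_id, wavenumber, intensity) VALUES (?, ?, ?)",
    [(1, 100.0, 1.0), (1, 200.0, 2.0), (2, 150.0, 3.0)],  # made-up values
)

def lines_in_range(formula, lo, hi):
    """Return (wavenumber, intensity) rows for one molecule within [lo, hi]."""
    return conn.execute(
        """SELECT t.wavenumber, t.intensity
           FROM transition AS t JOIN molecule AS m ON m.id = t.molecule_id
           WHERE m.formula = ? AND t.wavenumber BETWEEN ? AND ?
           ORDER BY t.wavenumber""",
        (formula, lo, hi),
    ).fetchall()

rows = lines_in_range("H2O", 50.0, 150.0)
```

Keeping molecule metadata in its own table means each fact is stored once, and the foreign key prevents transitions from referring to a molecule that does not exist, which is the kind of integrity benefit the abstract attributes to the relational form.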
Linking the Taiwan Fish Database to the Global Database

Directory of Open Access Journals (Sweden)

Kwang-Tsao Shao

2007-03-01

Full Text Available With the support of the National Digital Archive Program (NDAP), basic species information about most Taiwanese fishes, including their morphology, ecology, distribution, specimens with photos, and literature, has been compiled into the "Fish Database of Taiwan" (http://fishdb.sinica.edu.tw). We expect that the databank of all Taiwanese fish species (RSD), with 2800+ species, and the digital "Fish Fauna of Taiwan" will be completed in 2007. Underwater ecological photos and video images for all 2,800+ fishes are quite difficult to obtain but will continue to be collected in the future. In the last year of NDAP, we successfully integrated all fish specimen data deposited at 7 different institutes in Taiwan as well as their collection maps on Google Maps and Google Earth. Further, the database also provides the pronunciation of Latin scientific names and transliteration of Chinese common names by referring to the Romanization system for all Taiwanese fishes (2,902 species in 292 families so far). The Taiwanese fish species checklist with Chinese common/vernacular names and specimen data has been updated periodically and provided to the global FishBase as well as the Global Biodiversity Information Facility (GBIF) through the national portal of the Taiwan Biodiversity Information Facility (TaiBIF). Thus, Taiwanese fish data can be queried and browsed on the WWW. As a contribution to the "Barcode of Life" and "All Fishes" international projects, alcohol-preserved specimens of more than 1,800 species and cryobanked tissues of 800 species have been accumulated at RCBAS in the past two years. Through this close collaboration between local and global databases, "The Fish Database of Taiwan" now attracts more than 250,000 visitors and achieves 5 million hits per month. We believe that this local database is becoming an important resource for education, research, conservation, and sustainable use of fish in Taiwan.
Tradeoffs in distributed databases

OpenAIRE

Juntunen, R. (Risto)

2016-01-01

Abstract In a distributed database data is spread throughout the network into separated nodes with different DBMS systems (Date, 2000). According to CAP-theorem three database properties — consistency, availability and partition tolerance cannot be achieved simultaneously in distributed database systems. Two of these properties can be achieved but not all three at the same time (Brewer, 2000). Since this theorem there has b...
Automated Oracle database testing

CERN Multimedia

CERN. Geneva

2014-01-01

Ensuring database stability and steady performance in the modern world of agile computing is a major challenge. Changes at any level of the computing infrastructure (OS parameters and packages, kernel versions, database parameters and patches, or even schema changes) can all potentially harm production services. This presentation shows how automatic and regular testing of Oracle databases can be achieved in such an agile environment.
Database Systems - Present and Future

Directory of Open Access Journals (Sweden)

2009-01-01

Full Text Available Database systems nowadays play an increasingly important role in the knowledge-based society, in which computers have penetrated all fields of activity and the Internet continues to develop worldwide. In the current informatics context, the development of database applications is the work of specialists. Using databases, accessing a database from various applications, and some related concepts have become accessible to all categories of IT users. This paper aims to summarize the curricular area covering the fundamental database systems issues that are necessary in order to train specialists in economic informatics in higher education. Database systems integrate and interact with several informatics technologies and are therefore more difficult to understand and use. Thus, students should already know a minimum set of mandatory concepts and their practical implementation: computer systems, programming techniques, programming languages, data structures. The article also presents the current trends in the evolution of database systems, in the context of economic informatics.
Similar prefrontal cortical activities between general fluid intelligence and visuospatial working memory tasks in preschool children as revealed by optical topography.

Science.gov (United States)

Kuwajima, Mariko; Sawaguchi, Toshiyuki

2010-10-01

General fluid intelligence (gF) is a major component of intellect in both adults and children. Whereas its neural substrates have been studied relatively thoroughly in adults, those are poorly understood in children, particularly preschoolers. Here, we hypothesized that gF and visuospatial working memory share a common neural system within the lateral prefrontal cortex (LPFC) during the preschool years (4-6 years). At the behavioral level, we found that gF positively and significantly correlated with abilities (especially accuracy) in visuospatial working memory. Optical topography revealed that the LPFC of preschoolers was activated and deactivated during the visuospatial working memory task and the gF task. We found that the spatio-temporal features of neural activity in the LPFC were similar for both the visuospatial working memory task and the gF task. Further, 2 months of training for the visuospatial working memory task significantly increased gF in the preschoolers. These findings suggest that a common neural system in the LPFC is recruited to improve the visuospatial working memory and gF in preschoolers. Efficient recruitment of this neural system may be important for good performance in these functions in preschoolers, and behavioral training using this system would help to increase gF at these ages.

The Fragment Network: A Chemistry Recommendation Engine Built Using a Graph Database.

Science.gov (United States)

Hall, Richard J; Murray, Christopher W; Verdonk, Marcel L

2017-07-27

The hit validation stage of a fragment-based drug discovery campaign involves probing the SAR around one or more fragment hits. This often requires a search for similar compounds in a corporate collection or from commercial suppliers. The Fragment Network is a graph database that allows a user to efficiently search chemical space around a compound of interest. The result set is chemically intuitive, naturally grouped by substitution pattern and meaningfully sorted according to the number of observations of each transformation in medicinal chemistry databases. This paper describes the algorithms used to construct and search the Fragment Network and provides examples of how it may be used in a drug discovery context.
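Searching chemical space around a compound in a graph can be pictured as a bounded breadth-first traversal. The sketch below assumes a plain adjacency-list graph in which an edge joins two compounds that differ by one transformation; this is a simplification made for illustration, not the Fragment Network's actual data model or algorithm, and the compound names and edges are hypothetical.

```python
from collections import deque

def neighbours_within(graph, start, max_hops):
    """Breadth-first search: map each reachable node to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]  # report neighbours only, not the query itself
    return distance

# Hypothetical network: keys are compounds, values are compounds one edit away.
graph = {
    "benzene": ["toluene", "phenol"],
    "toluene": ["benzene", "xylene"],
    "phenol": ["benzene", "cresol"],
    "xylene": ["toluene"],
    "cresol": ["phenol", "toluene"],
}
hits = neighbours_within(graph, "benzene", max_hops=1)
```

Raising `max_hops` widens the search to compounds that are two or more transformations away, at the cost of a larger and less closely related result set.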
A global multiproxy database for temperature reconstructions of the Common Era

Science.gov (United States)

Emile-Geay, Julian; McKay, Nicholas P.; Kaufman, Darrell S.; von Gunten, Lucien; Wang, Jianghao; Anchukaitis, Kevin J.; Abram, Nerilie J.; Addison, Jason A.; Curran, Mark A.J.; Evans, Michael N.; Henley, Benjamin J.; Hao, Zhixin; Martrat, Belen; McGregor, Helen V.; Neukom, Raphael; Pederson, Gregory T.; Stenni, Barbara; Thirumalai, Kaustubh; Werner, Johannes P.; Xu, Chenxi; Divine, Dmitry V.; Dixon, Bronwyn C.; Gergis, Joelle; Mundo, Ignacio A.; Nakatsuka, T.; Phipps, Steven J.; Routson, Cody C.; Steig, Eric J.; Tierney, Jessica E.; Tyler, Jonathan J.; Allen, Kathryn J.; Bertler, Nancy A. N.; Bjorklund, Jesper; Chase, Brian M.; Chen, Min-Te; Cook, Ed; de Jong, Rixt; DeLong, Kristine L.; Dixon, Daniel A.; Ekaykin, Alexey A.; Ersek, Vasile; Filipsson, Helena L.; Francus, Pierre; Freund, Mandy B.; Frezzotti, M.; Gaire, Narayan P.; Gajewski, Konrad; Ge, Quansheng; Goosse, Hugues; Gornostaeva, Anastasia; Grosjean, Martin; Horiuchi, Kazuho; Hormes, Anne; Husum, Katrine; Isaksson, Elisabeth; Kandasamy, Selvaraj; Kawamura, Kenji; Koc, Nalan; Leduc, Guillaume; Linderholm, Hans W.; Lorrey, Andrew M.; Mikhalenko, Vladimir; Mortyn, P. Graham; Motoyama, Hideaki; Moy, Andrew D.; Mulvaney, Robert; Munz, Philipp M.; Nash, David J.; Oerter, Hans; Opel, Thomas; Orsi, Anais J.; Ovchinnikov, Dmitriy V.; Porter, Trevor J.; Roop, Heidi; Saenger, Casey; Sano, Masaki; Sauchyn, David; Saunders, K.M.; Seidenkrantz, Marit-Solveig; Severi, Mirko; Shao, X.; Sicre, Marie-Alexandrine; Sigl, Michael; Sinclair, Kate; St. George, Scott; St. Jacques, Jeannine-Marie; Thamban, Meloth; Thapa, Udya Kuwar; Thomas, E.; Turney, Chris; Uemura, Ryu; Viau, A.E.; Vladimirova, Diana O.; Wahl, Eugene; White, James W. C.; Yu, Z.; Zinke, Jens

2017-01-01

Reproducible climate reconstructions of the Common Era (1 CE to present) are key to placing industrial-era warming into the context of natural climatic variability. Here we present a community-sourced database of temperature-sensitive proxy records from the PAGES2k initiative. The database gathers 692 records from 648 locations, including all continental regions and major ocean basins. The records are from trees, ice, sediment, corals, speleothems, documentary evidence, and other archives. They range in length from 50 to 2000 years, with a median of 547 years, while temporal resolution ranges from biweekly to centennial. Nearly half of the proxy time series are significantly correlated with HadCRUT4.2 surface temperature over the period 1850–2014. Global temperature composites show a remarkable degree of coherence between high- and low-resolution archives, with broadly similar patterns across archive types, terrestrial versus marine locations, and screening criteria. The database is suited to investigations of global and regional temperature variability over the Common Era, and is shared in the Linked Paleo Data (LiPD) format, including serializations in Matlab, R and Python.
A database and API for variation, dense genotyping and resequencing data

Directory of Open Access Journals (Sweden)

Flicek Paul

2010-05-01

Full Text Available Abstract Background Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data necessitating the redesign of existing genome-wide bioinformatics resources. Results Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments. Conclusions Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.
Database on wind characteristics. Contents of database bank

DEFF Research Database (Denmark)

Larsen, Gunner Chr.; Hansen, K.S.

2001-01-01

for the available data in the established database bank and part three is the Users Manual describing the various ways to access and analyse the data. The present report constitutes the second part of the Annex XVII reporting. Basically, the database bank contains three categories of data, i.e. i) high sampled wind...... field time series; ii) high sampled wind turbine structural response time series; and iii) wind resource data. The main emphasis, however, is on category i). The available data, within each of the three categories, are described in detail. The description embraces site characteristics, terrain type...
An end to end secure CBIR over encrypted medical database.

Science.gov (United States)

Bellafqira, Reda; Coatrieux, Gouenou; Bouslimi, Dalel; Quellec, Gwenole

2016-08-01

In this paper, we propose a new secure content-based image retrieval (SCBIR) system adapted to the cloud framework. This solution allows a physician to retrieve images of similar content from an outsourced and encrypted image database without decrypting them. In contrast to current CBIR approaches in the encrypted domain, the originality of the proposed scheme lies in the fact that the features extracted from the encrypted images are themselves encrypted. This is achieved by means of homomorphic encryption and two non-colluding servers, both of which we nevertheless consider honest but curious. In this way, an end-to-end secure CBIR process is ensured. Experimental results obtained on a diabetic retinopathy database encrypted with the Paillier cryptosystem indicate that our SCBIR achieves retrieval performance as good as if the images were processed in their non-encrypted form.
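The abstract above relies on the additive homomorphism of the Paillier cryptosystem. The following is a minimal sketch of that property only, not the authors' retrieval protocol: the function names are hypothetical, the primes 61 and 53 are toy values with no security, and the generator is assumed to be g = n + 1, a common simplification.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two distinct primes (illustrative sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1  # assumed generator choice
    # L(x) = (x - 1) // n; mu is the modular inverse of L(g^lam mod n^2) modulo n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    """Encrypt an integer 0 <= m < n with fresh randomness r coprime to n."""
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1  # uniform in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c: int) -> int:
    """Recover the plaintext as L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c: int, k: int) -> int:
    """Raising a ciphertext to the power k multiplies the plaintext by k modulo n."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    pub, priv = keygen(61, 53)
    total = add_encrypted(pub, encrypt(pub, 17), encrypt(pub, 25))
    print(decrypt(pub, priv, total))
```

Because sums and scalar multiples can be computed on ciphertexts alone, a server holding only the public key could in principle aggregate encrypted feature values without ever seeing them, which is the kind of operation an encrypted-domain retrieval scheme builds on.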
USAID Anticorruption Projects Database

Data.gov (United States)

US Agency for International Development — The Anticorruption Projects Database (Database) includes information about USAID projects with anticorruption interventions implemented worldwide between 2007 and...
Database design using entity-relationship diagrams

CERN Document Server

Bagui, Sikha

2011-01-01

Data, Databases, and the Software Engineering Process; Data; Building a Database; What is the Software Engineering Process?; Entity Relationship Diagrams and the Software Engineering Life Cycle; Phase 1: Get the Requirements for the Database; Phase 2: Specify the Database; Phase 3: Design the Database; Data and Data Models; Files, Records, and Data Items; Moving from 3 × 5 Cards to Computers; Database Models; The Hierarchical Model; The Network Model; The Relational Model; The Relational Model and Functional Dependencies; Fundamental Relational Database; Relational Database and Sets; Functional
The database for reaching experiments and models.

Directory of Open Access Journals (Sweden)

Ben Walker

Full Text Available Reaching is one of the central experimental paradigms in the field of motor control, and many computational models of reaching have been published. While most of these models try to explain subject data (such as movement kinematics, reaching performance, forces, etc.) from only a single experiment, distinct experiments often share experimental conditions and record similar kinematics. This suggests that reaching models could be applied to (and falsified by) multiple experiments. However, using multiple datasets is difficult because experimental data formats vary widely. Standardizing data formats promises to enable scientists to test model predictions against many experiments and to compare experimental results across labs. Here we report on the development of a new resource available to scientists: a database of reaching called the Database for Reaching Experiments And Models (DREAM). DREAM collects both experimental datasets and models and facilitates their comparison by standardizing formats. The DREAM project promises to be useful for experimentalists who want to understand how their data relates to models, for modelers who want to test their theories, and for educators who want to help students better understand reaching experiments, models, and data analysis.
Signaling pathways in a Citrus EST database

Directory of Open Access Journals (Sweden)

Angela Mehta

2007-01-01

Full Text Available Citrus spp. are economically important crops, which in Brazil are grown mainly in the State of São Paulo. Citrus crops are attacked by several pathogens, causing severe yield losses. To better understand this crop, the Millenium Project (IAC Cordeirópolis) was launched to sequence Citrus ESTs (expressed sequence tags) from different tissues, including leaf, bark, fruit, root and flower. Plants were subjected to biotic and abiotic stresses and investigated at different developmental stages (adult vs. juvenile). Several cDNA libraries were constructed and the sequences obtained formed the Citrus EST database with almost 200,000 sequences. Searches were performed in the Citrus database to investigate the presence of different signaling pathway components. Several of the genes involved in the signaling of sugar, calcium, cytokinin, plant hormones, inositol phosphate, MAPKinase and COP9 were found in the citrus genome and are discussed in this paper. The results obtained may indicate that mechanisms similar to those described in other plants, such as Arabidopsis, occur in citrus. Further experimental studies must be conducted in order to understand the different signaling pathways present.
Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.

Science.gov (United States)

Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar

2017-01-01

Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used for the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases. Data have been integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database consists of the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
ADANS database specification

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-01-16

The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.
Development of thermodynamic databases for geochemical calculations

Energy Technology Data Exchange (ETDEWEB)

Arthur, R.C. [Monitor Scientific, L.L.C., Denver, Colorado (United States); Sasamoto, Hiroshi; Shibata, Masahiro; Yui, Mikazu [Japan Nuclear Cycle Development Inst., Tokai, Ibaraki (Japan); Neyama, Atsushi [Computer Software Development Corp., Tokyo (Japan)

1999-09-01

Two thermodynamic databases for geochemical calculations supporting research and development on geological disposal concepts for high level radioactive waste are described in this report. One, SPRONS.JNC, is compatible with thermodynamic relations comprising the SUPCRT model and software, which permits calculation of the standard molal and partial molal thermodynamic properties of minerals, gases, aqueous species and reactions from 1 to 5000 bars and 0 to 1000 °C. This database includes standard molal Gibbs free energies and enthalpies of formation, standard molal entropies and volumes, and Maier-Kelly heat capacity coefficients at the reference pressure (1 bar) and temperature (25 °C) for 195 minerals and 16 gases. It also includes standard partial molal Gibbs free energies and enthalpies of formation, standard partial molal entropies, and Helgeson, Kirkham and Flowers (HKF) equation-of-state coefficients at the reference pressure and temperature for 1147 inorganic and organic aqueous ions and complexes. SPRONS.JNC extends similar databases described elsewhere by incorporating new and revised data published in the peer-reviewed literature since 1991. The other database, PHREEQE.JNC, is compatible with the PHREEQE series of geochemical modeling codes. It includes equilibrium constants at 25 °C and 1 bar for mineral-dissolution, gas-solubility, aqueous-association and oxidation-reduction reactions. Reaction enthalpies, or coefficients in an empirical log K(T) function, are also included in this database, which permits calculation of equilibrium constants between 0 and 100 °C at 1 bar. All equilibrium constants, reaction enthalpies, and log K(T) coefficients in PHREEQE.JNC are calculated using SUPCRT and SPRONS.JNC, which ensures that these two databases are mutually consistent. They are also internally consistent insofar as all the data are compatible with basic thermodynamic definitions and functional relations in the SUPCRT model, and because primary
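The abstract above notes that a reaction enthalpy allows an equilibrium constant tabulated at 25 °C to be extrapolated to other temperatures. One standard way to do this is the integrated van't Hoff equation with a temperature-independent enthalpy; the sketch below assumes that form, which the abstract does not spell out. The function name and the numerical inputs are hypothetical and are not taken from PHREEQE.JNC.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)
T_REF_K = 298.15            # 25 degrees C reference temperature


def log_k_vant_hoff(log_k_ref: float, delta_h_j_per_mol: float,
                    temp_k: float, temp_ref_k: float = T_REF_K) -> float:
    """Extrapolate log10 K from temp_ref_k to temp_k.

    Assumes the reaction enthalpy is constant over the interval:
    log K(T) = log K(Tr) - dH / (ln(10) * R) * (1/T - 1/Tr)
    """
    slope = delta_h_j_per_mol / (math.log(10) * GAS_CONSTANT)
    return log_k_ref - slope * (1.0 / temp_k - 1.0 / temp_ref_k)


if __name__ == "__main__":
    # Illustrative values only: an endothermic reaction (dH > 0) heated to 60 degrees C.
    print(log_k_vant_hoff(log_k_ref=-8.0, delta_h_j_per_mol=10_000.0, temp_k=333.15))
```

The constant-enthalpy assumption is usually reasonable only over a narrow range, which is consistent with restricting such extrapolations to 0 to 100 °C; an empirical log K(T) polynomial is the usual alternative when heat capacity effects matter.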
The Hanford Site generic component failure-rate database compared with other generic failure-rate databases

International Nuclear Information System (INIS)

Reardon, M.F.; Zentner, M.D.

1992-11-01

The Risk Assessment Technology Group, Westinghouse Hanford Company (WHC), has compiled a component failure rate database to be used during risk and reliability analysis of nonreactor facilities. Because site-specific data for the Hanford Site are generally not kept or not compiled in a usable form, the database was assembled using information from a variety of other established sources. Generally, the most conservative failure rates were chosen from the databases reviewed. The Hanford Site database has since been used extensively in fault tree modeling of many Hanford Site facilities and systems. The purpose of this study was to evaluate the reasonableness of the data chosen for the Hanford Site database by comparing the values chosen with the values from the other databases.
Estimation of daily reference evapotranspiration (ETo) using artificial intelligence methods: Offering a new approach for lagged ETo data-based modeling

Science.gov (United States)

Mehdizadeh, Saeid

2018-04-01

Evapotranspiration (ET) is considered a key factor in hydrological and climatological studies, agricultural water management, irrigation scheduling, etc. It can be directly measured using lysimeters. Moreover, other methods such as empirical equations and artificial intelligence methods can be used to model ET. In recent years, artificial intelligence methods have been widely utilized to estimate reference evapotranspiration (ETo). In the present study, local and external performances of multivariate adaptive regression splines (MARS) and gene expression programming (GEP) were assessed for estimating daily ETo. To this end, daily weather data from six stations with different climates in Iran, namely Urmia and Tabriz (semi-arid), Isfahan and Shiraz (arid), and Yazd and Zahedan (hyper-arid), were employed for the period 2000-2014. Two types of input patterns, consisting of weather data-based and lagged ETo data-based scenarios, were considered to develop the models. Four statistical indicators, namely root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R²), and mean absolute percentage error (MAPE), were used to check the accuracy of the models. The local performance of the models revealed that the MARS and GEP approaches have the capability to estimate daily ETo using the meteorological parameters and the lagged ETo data as inputs. Nevertheless, MARS had the best performance in the weather data-based scenarios. On the other hand, considerable differences were not observed in the models' accuracy for the lagged ETo data-based scenarios. As the innovation of this study, novel hybrid models were proposed for the lagged ETo data-based scenarios by combining the MARS and GEP models with the autoregressive conditional heteroscedasticity (ARCH) time series model. It was concluded that the proposed novel models, named MARS-ARCH and GEP-ARCH, improved the performance of ETo modeling compared to the single MARS and GEP models. In addition, the external
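The four accuracy indicators named in the abstract above can be sketched as follows. The abstract does not give their formulas, so this assumes the usual definitions: R² is taken here as 1 minus the ratio of residual to total sum of squares (some studies instead use the squared Pearson correlation), and MAPE is expressed in percent. The function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; observed values must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


if __name__ == "__main__":
    # Made-up daily ETo values in mm/day, for illustration only.
    observed = [2.0, 4.0, 6.0, 8.0]
    predicted = [2.5, 3.5, 6.5, 7.5]
    for name, fn in (("RMSE", rmse), ("MAE", mae), ("R2", r2), ("MAPE", mape)):
        print(name, round(fn(observed, predicted), 4))
```

RMSE and MAE are in the units of ETo, whereas R² and MAPE are dimensionless, which is why studies of this kind typically report all four together when comparing stations with different climates.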
Method and electronic database search engine for exposing the content of an electronic database

NARCIS (Netherlands)

Stappers, P.J.

2000-01-01

The invention relates to an electronic database search engine comprising an electronic memory device suitable for storing and releasing elements from the database, a display unit, a user interface for selecting and displaying at least one element from the database on the display unit, and control
The LHCb configuration database

CERN Document Server

Abadie, L; Van Herwijnen, Eric; Jacobsson, R; Jost, B; Neufeld, N

2005-01-01

The aim of the LHCb configuration database is to store information about all the controllable devices of the detector. The experiment's control system (which uses PVSS) will configure, start up and monitor the detector from the information in the configuration database. The database will contain devices with their properties, connectivity and hierarchy. The ability to store and rapidly retrieve huge amounts of data and the navigability between devices are important requirements. We have collected use cases to ensure the completeness of the design. Using the entity relationship modelling technique, we describe the use cases as classes with attributes and links. We designed the schema for the tables using relational diagrams. This methodology has been applied to the TFC (switches) and DAQ system. Other parts of the detector will follow later. The database has been implemented using Oracle to benefit from central CERN database support. The project also foresees the creation of tools to populate, maintain, and co...
The CUTLASS database facilities

International Nuclear Information System (INIS)

Jervis, P.; Rutter, P.

1988-09-01

The enhancement of the CUTLASS database management system to provide improved facilities for data handling is seen as a prerequisite to its effective use for future power station data processing and control applications. This particularly applies to the larger projects such as AGR data processing system refurbishments, and the data processing systems required for the new Coal Fired Reference Design stations. In anticipation of the need for improved data handling facilities in CUTLASS, the CEGB established a User Sub-Group in the early 1980s to define the database facilities required by users. Following the endorsement of the resulting specification and a detailed design study, the database facilities have been implemented as an integral part of the CUTLASS system. This paper provides an introduction to the range of CUTLASS Database facilities, and emphasises the role of Database as the central facility around which future Kit 1 and (particularly) Kit 6 CUTLASS based data processing and control systems will be designed and implemented. (author)
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants

Science.gov (United States)

Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

2014-01-01

In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561
HIV Structural Database

Science.gov (United States)

SRD 102 HIV Structural Database (Web, free access) The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.
