public domain database: Topics by WorldWideScience.org

Sample records for public domain database

DATABASES AND THE SUI-GENERIS RIGHT – PROTECTION OUTSIDE THE ORIGINALITY. THE DISREGARD OF THE PUBLIC DOMAIN

Directory of Open Access Journals (Sweden)

Monica LUPAȘCU

2018-05-01

Full Text Available This study focuses on databases as regulated by Directive no. 96/9/EC regarding the protection of databases. There are also several references to Romanian Law no. 8/1996 on copyright and neighbouring rights, which implements the mentioned European Directive. The study analyses certain effects that sui-generis protection has on the public domain and tries to demonstrate that the regulation specific to databases neglects the interests associated with the public domain. The effect of such regulation is the abusive creation of databases in which the public domain (meaning information not protected by copyright, such as news, ideas, procedures, methods, systems, processes, concepts, principles and discoveries) ends up being encapsulated and made available only to certain private interests, so that access to the public domain is regulated indirectly. The study begins by explaining the sui-generis right and its origin. The first mention of databases can be found in the “Green Paper on Copyright” (1998), a document that clearly shows that database protection was intended to cover a sphere of non-protectable information from the scientific and industrial fields. The author makes several arguments, most of them based on the report of the Public Consultation held in 2014 on the necessity of the sui-generis right. There are some references to specific case law, namely British Horseracing Board v William Hill and Fixtures Marketing Ltd. The ECJ’s decision in that case is of great importance for supporting the public interest in accessing information from certain restricted fields that is derived as a result of the maker’s activities, because in the absence of the sui-generis right, all this information can be freely accessed and used.
Public Domain; Public Interest; Public Funding: Focussing on the three Ps in Scientific Research

Directory of Open Access Journals (Sweden)

Mags McGinley

2005-03-01

Full Text Available The purpose of this paper is to discuss the three Ps of scientific research: Public Domain; Public Interest; Public Funding. This is done by examining some of the difficulties faced by scientists engaged in scientific research who may have problems working within the constraints of current copyright and database legislation, where property claims can place obstacles in the way of research, in other words, the public domain. The article then looks at perceptions of the public interest and asks whether copyright and the database right reflect understandings of how this concept should operate. Thirdly, it considers the relevance of public funding for scientific research in the context of both the public domain and of the public interest. Finally, some recent initiatives seeking to change the contours of the legal framework are examined.
Database Concepts in a Domain Ontology

Directory of Open Access Journals (Sweden)

Gorskis Henrihs

2017-12-01

Full Text Available There are multiple approaches for mapping from a domain ontology to a database in the task of ontology-based data access. For that purpose, external mapping documents are most commonly used. These documents describe how the data necessary for the description of ontology individuals and other values are to be obtained from the database. The present paper investigates the use of special database concepts. These concepts are not separated from the domain ontology; they are mixed with domain concepts to form a combined application ontology. By creating natural relationships between database concepts and domain concepts, mapping can be implemented more easily and with a specific purpose. The paper also investigates how the use of such database concepts in addition to domain concepts impacts ontology building and data retrieval.
Towards development of a high quality public domain global roads database

Directory of Open Access Journals (Sweden)

Andrew Nelson

2006-12-01

Full Text Available There is clear demand for a global spatial public domain roads data set with improved geographic and temporal coverage, consistent coding of road types, and clear documentation of sources. The currently best available global public domain product covers only one-quarter to one-third of the existing road networks, and this varies considerably by region. Applications for such a data set span multiple sectors and would be particularly valuable for the international economic development, disaster relief, and biodiversity conservation communities, not to mention national and regional agencies and organizations around the world. The building blocks for such a global product are available for many countries and regions, yet thus far there has been neither strategy nor leadership for developing it. This paper evaluates the best available public domain and commercial data sets, assesses the gaps in global coverage, and proposes a number of strategies for filling them. It also identifies stakeholder organizations with an interest in such a data set that might either provide leadership or funding for its development. It closes with a proposed set of actions to begin the process.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

Science.gov (United States)

Truong, Kevin; Ikura, Mitsuhiko

2003-05-06

Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
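The relational formulation described in this abstract can be illustrated with a small sketch. It is not the authors' query: the table name `hit`, its columns, and the function `find_domain_fusions` are hypothetical, and the sketch assumes domain predictions have already been loaded as (protein, organism, domain) rows. A self-join finds proteins that carry two different domains (the fusion), and further joins find pairs of proteins in the organism of interest that each carry only one of those two domains.

```python
import sqlite3


def find_domain_fusions(rows, organism):
    """Return sorted (protein_a, protein_b, fused_protein) triples.

    rows: iterable of (protein, organism, domain) domain predictions.
    organism: organism in which to look for putatively linked partners.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hit (protein TEXT, organism TEXT, domain TEXT)")
    con.executemany("INSERT INTO hit VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein, b.protein, f1.protein
        FROM hit AS f1
        -- f1/f2: one protein carrying two different domains (the fusion)
        JOIN hit AS f2 ON f1.protein = f2.protein AND f1.domain < f2.domain
        -- a carries the first domain, b carries the second
        JOIN hit AS a ON a.domain = f1.domain AND a.protein <> f1.protein
        JOIN hit AS b ON b.domain = f2.domain AND b.protein <> f1.protein
        WHERE a.organism = ? AND b.organism = ?
          AND a.protein <> b.protein
          -- each partner must lack the other partner's domain
          AND NOT EXISTS (SELECT 1 FROM hit AS x
                          WHERE x.protein = a.protein AND x.domain = f2.domain)
          AND NOT EXISTS (SELECT 1 FROM hit AS x
                          WHERE x.protein = b.protein AND x.domain = f1.domain)
        ORDER BY 1, 2, 3
    """
    result = con.execute(query, (organism, organism)).fetchall()
    con.close()
    return result
```

Because the whole analysis is a single SQL statement, rerunning it after the underlying tables are refreshed is enough to update the predictions, which is the property the abstract highlights.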
A protein domain interaction interface database: InterPare

Directory of Open Access Journals (Sweden)

Lee Jungsul

2005-08-01

Full Text Available Abstract. Background: Most proteins function by interacting with other molecules. Their interaction interfaces are highly conserved throughout evolution to avoid undesirable interactions that lead to fatal disorders in cells. Rational drug discovery includes computational methods to identify the interaction sites of lead compounds to the target molecules. Identifying and classifying protein interaction interfaces on a large scale can help researchers discover drug targets more efficiently. Description: We introduce a large-scale protein domain interaction interface database called InterPare (http://interpare.net). It contains both inter-chain (between chains) interfaces and intra-chain (within chain) interfaces. InterPare uses three methods to detect interfaces: (1) the geometric distance method for checking the distance between atoms that belong to different domains, (2) Accessible Surface Area (ASA), a method for detecting the buried region of a protein that is detached from a solvent when forming multimers or complexes, and (3) the Voronoi diagram, a computational geometry method that uses a mathematical definition of interface regions. InterPare includes visualization tools to display protein interior, surface, and interaction interfaces. It also provides statistics such as the amino acid propensities of a queried protein according to its interior, surface, and interface region. The atom coordinates that belong to interface, surface, and interior regions can be downloaded from the website. Conclusion: InterPare is an open and public database server for protein interaction interface information. It contains the large-scale interface data for proteins whose 3D-structures are known. As of November 2004, there were 10,583 (Geometric distance), 10,431 (ASA), and 11,010 (Voronoi diagram) entries in the Protein Data Bank (PDB) containing interfaces, according to the above three methods. In the case of the geometric distance method, there are 31,620 inter-chain domain-domain
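The first of the three detection methods, the geometric distance check, can be sketched as follows. This is a minimal illustration, not InterPare's implementation: the function name `interface_atoms`, the input layout, and the 5.0 Å default cutoff are all assumptions, since the abstract does not state the threshold used.

```python
import math
from itertools import combinations


def interface_atoms(atoms, cutoff=5.0):
    """Return ids of atoms lying within `cutoff` of an atom in another domain.

    atoms: list of (atom_id, domain_id, (x, y, z)) tuples.
    cutoff: distance threshold in the same units as the coordinates
            (assumed value; not taken from the source).
    """
    interface = set()
    for (id1, dom1, xyz1), (id2, dom2, xyz2) in combinations(atoms, 2):
        # Only pairs that span two different domains can form an interface.
        if dom1 != dom2 and math.dist(xyz1, xyz2) <= cutoff:
            interface.update((id1, id2))
    return interface
```

The all-pairs loop is quadratic in the number of atoms; a spatial index would be the usual choice for whole-PDB scans, but the pairwise form keeps the criterion itself easy to read.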
Database Publication Practices

DEFF Research Database (Denmark)

Bernstein, P.A.; DeWitt, D.; Heuer, A.

2005-01-01

There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems.
Database of ligand-induced domain movements in enzymes

Directory of Open Access Journals (Sweden)

Hayward Steven

2009-03-01

Full Text Available Abstract. Background: Conformational change induced by the binding of a substrate or coenzyme is a poorly understood stage in the process of enzyme-catalysed reactions. For enzymes that exhibit a domain movement, the conformational change can be clearly characterized, and therefore the opportunity exists to gain an understanding of the mechanisms involved. The non-redundant database of protein domain movements contains examples of ligand-induced domain movements in enzymes, but this valuable data has remained unexploited. Description: The domain movements in the non-redundant database of protein domain movements are those found by applying the DynDom program to pairs of crystallographic structures contained in Protein Data Bank files. By cross-checking, for each pair of structures, the ligands in their Protein Data Bank files against the KEGG-LIGAND database, and by using methods that search for ligands that contact the enzyme in one conformation but not the other, the non-redundant database of protein domain movements was refined down to a set of 203 enzymes where a domain movement is apparently triggered by the binding of a functional ligand. For these cases, ligand binding information, including hydrogen bonds and salt-bridges between the ligand and specific residues on the enzyme, is presented in the context of dynamical information such as the regions that form the dynamic domains, the hinge bending residues, and the hinge axes. Conclusion: The presentation at a single website of data on interactions between a ligand and specific residues on the enzyme, alongside data on the movement that these interactions induce, should lead to new insights into the mechanisms of these enzymes in particular, and help in trying to understand the general process of ligand-induced domain closure in enzymes. The website can be found at: http://www.cmp.uea.ac.uk/dyndom/enzymeList.do
PrionScan: an online database of predicted prion domains in complete proteomes.

Science.gov (United States)

Espinosa Angarica, Vladimir; Angulo, Alfonso; Giner, Arturo; Losilla, Guillermo; Ventura, Salvador; Sancho, Javier

2014-02-05

Prions are a particular type of amyloids related to a large variety of important processes in cells, but also responsible for serious diseases in mammals and humans. The number of experimentally characterized prions is still low and corresponds to a handful of examples in microorganisms and mammals. Prion aggregation is mediated by specific protein domains with a remarkable compositional bias towards glutamine/asparagine and against charged residues and prolines. These compositional features have been used to predict new prion proteins in the genomes of different organisms. Despite these efforts, there are only a few available data sources containing prion predictions at a genomic scale. Here we present PrionScan, a new database of predicted prion-like domains in complete proteomes. We have previously developed a predictive methodology to identify and score prionogenic stretches in protein sequences. In the present work, we exploit this approach to scan all the protein sequences in public databases and compile a repository containing relevant information on proteins bearing prion-like domains. The database is updated regularly alongside UniProtKB and in its present version contains approximately 28000 predictions in proteins from different functional categories in more than 3200 organisms from all the taxonomic subdivisions. PrionScan can be used in two different ways: database query and analysis of protein sequences submitted by the users. In the first mode, simple queries allow retrieval of a detailed description of the properties of a defined protein. Queries can also be combined to generate more complex and specific search patterns. In the second mode, users can submit and analyze their own sequences. It is expected that this database will provide relevant insights into prion functions and regulation from a genome-wide perspective, allowing researchers to perform cross-species prion biology studies. Our database might also be useful for guiding experimentalists
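The compositional bias mentioned in this abstract (enrichment in glutamine/asparagine, depletion of charged residues and prolines) can be made concrete with a simple sliding-window scan. This is only an illustrative sketch and not the PrionScan scoring method, which the abstract does not describe; the function name `prion_like_windows`, the window length, and both thresholds are assumed values.

```python
def prion_like_windows(sequence, window=60, min_qn=0.3, max_charged_pro=0.1):
    """Return (start, qn_fraction, charged_pro_fraction) for windows that are
    Q/N-rich and poor in charged residues and prolines.

    All thresholds are illustrative assumptions, not published parameters.
    """
    enriched = set("QN")      # glutamine, asparagine
    depleted = set("DEKRP")   # charged residues and proline
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        qn = sum(res in enriched for res in segment) / window
        charged_pro = sum(res in depleted for res in segment) / window
        if qn >= min_qn and charged_pro <= max_charged_pro:
            hits.append((start, qn, charged_pro))
    return hits
```

A sequence shorter than the window simply yields no hits, so the function can be mapped over a whole proteome without special cases.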
Extracting meronomy relations from domain-specific, textual corporate databases

NARCIS (Netherlands)

Ittoo, R.A.; Bouma, G.; Maruster, L.; Wortmann, J.C.; Hopfe, C.J.; Rezgui, Y.; Métais, E.; Preece, A.; Li, H.

2010-01-01

Various techniques for learning meronymy relationships from open-domain corpora exist. However, extracting meronymy relationships from domain-specific, textual corporate databases has been overlooked, despite numerous application opportunities particularly in domains like product development and/or
24 CFR 81.72 - Public-use database and public information.

Science.gov (United States)

2010-04-01

Title 24, Housing and Urban Development. Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database containing...
TOPDOM: database of conservatively located domains and motifs in proteins.

Science.gov (United States)

Varga, Julia; Dobson, László; Tusnády, Gábor E

2016-09-01

The TOPDOM database, originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins, has been updated and extended by also taking into consideration consistently localized domains and motifs in globular proteins. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in the case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. The TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity.

Science.gov (United States)

Machado, Helena; Silva, Susana

2015-10-01

The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of 'solidarity', traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. Published by the BMJ Publishing Group Limited.
PUBLIC DOMAIN PROTECTION. USES AND REUSES OF PUBLIC DOMAIN WORKS

Directory of Open Access Journals (Sweden)

Monica Adriana LUPAȘCU

2015-07-01

Full Text Available This study tries to highlight the necessity of an awareness of the right of access to the public domain, particularly using the example of works whose protection period has expired, as well as the ones which the law considers to be excluded from protection. Such works are used not only by large libraries from around the world, but also by rights holders, via different means of use, including incorporations into original works or adaptations. However, the reuse that follows these uses often only remains at the level of concept, as the notion of the public’s right of access to public domain works is not substantiated, nor is the notion of the correct or legal use of such works.
Databases and their application

NARCIS (Netherlands)

Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.

2013-01-01

During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The
Database Support for Research in Public Administration

Science.gov (United States)

Tucker, James Cory

2005-01-01

This study examines the extent to which databases support student and faculty research in the area of public administration. A list of journals in public administration, public policy, political science, public budgeting and finance, and other related areas was compared to the journal content list of six business databases. These databases…
iPfam: a database of protein family and domain interactions found in the Protein Data Bank.

Science.gov (United States)

Finn, Robert D; Miller, Benjamin L; Clements, Jody; Bateman, Alex

2014-01-01

The database iPfam, available at http://ipfam.org, catalogues Pfam domain interactions based on known 3D structures that are found in the Protein Data Bank, providing interaction data at the molecular level. Previously, the iPfam domain-domain interaction data was integrated within the Pfam database and website, but it has now been migrated to a separate database. This allows for independent development, improving data access and giving clearer separation between the protein family and interactions datasets. In addition to domain-domain interactions, iPfam has been expanded to include interaction data for domain bound small molecule ligands. Functional annotations are provided from source databases, supplemented by the incorporation of Wikipedia articles where available. iPfam (version 1.0) contains >9500 domain-domain and 15 500 domain-ligand interactions. The new website provides access to this data in a variety of ways, including interactive visualizations of the interaction data.
Publications of Australian LIS Academics in Databases

Science.gov (United States)

Wilson, Concepcion S.; Boell, Sebastian K.; Kennan, Mary Anne; Willard, Patricia

2011-01-01

This paper examines aspects of journal articles published from 1967 to 2008, located in eight databases, and authored or co-authored by academics serving for at least two years in Australian LIS programs from 1959 to 2008. These aspects are: inclusion of publications in databases, publications in journals, authorship characteristics of…
Influencing Database Use in Public Libraries.

Science.gov (United States)

Tenopir, Carol

1999-01-01

Discusses results of a survey of factors influencing database use in public libraries. Highlights the importance of content; ease of use; and importance of instruction. Tabulates importance indications for number and location of workstations, library hours, availability of remote login, usefulness and quality of content, lack of other databases,…
Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

Directory of Open Access Journals (Sweden)

Bányai László

2008-08-01

Full Text Available Abstract. Background: Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Results: Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON
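Two of the five consistency rules listed in this abstract, (ii) and (iii), are simple enough to express directly. The sketch below is a hypothetical illustration rather than the MisPred code: the function name `mispred_flags` and the location labels are assumptions, and it presumes that the domain localizations and transmembrane-segment prediction for each sequence have already been obtained from other tools.

```python
def mispred_flags(domain_locations, has_transmembrane_segment):
    """Return the rule numbers (as strings) that a protein entry violates.

    domain_locations: iterable of labels such as "extracellular",
        "cytoplasmic" or "nuclear" (label names are assumed here).
    has_transmembrane_segment: True if a transmembrane segment was predicted.
    """
    locations = set(domain_locations)
    flags = []
    # Rule (ii): extracellular and cytoplasmic domains in one protein
    # require a membrane-spanning segment between them.
    if {"extracellular", "cytoplasmic"} <= locations and not has_transmembrane_segment:
        flags.append("ii")
    # Rule (iii): extracellular and nuclear domains should not co-occur.
    if {"extracellular", "nuclear"} <= locations:
        flags.append("iii")
    return flags
```

An entry that returns a non-empty list is only a candidate for being abnormal, incomplete or mispredicted; the remaining rules (i), (iv) and (v) need sequence-signal, domain-boundary and chromosome data that this sketch does not model.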

Violence defied? : A review of prevention of violence in public and semi-public domain

NARCIS (Netherlands)

Knaap, L.M. van der; Nijssen, L.T.J.; Bogaerts, S.

2006-01-01

This report provides a synthesis of 48 studies of the effects of the prevention of violence in the public and semi-public domain. The following research questions were stated for this study: What measures for the prevention of violence in the public and semi-public domain are known and have been
Public licenses and public domain as alternatives to copyright

OpenAIRE

Köppel, Petr

2012-01-01

The work first introduces the area of public licenses as a space between the copyright law and public domain. After that, consecutively for proprietary software, free and open source software, open hardware and open content, it maps particular types of public licenses and the accompanying social and cultural movements, puts them in mutual as well as historical context, examines their characteristics and compares them to each other, shows how the public licenses are defined by various accompan...
Preserving the positive functions of the public domain in science

Directory of Open Access Journals (Sweden)

Pamela Samuelson

2003-11-01

Full Text Available Science has advanced in part because data and scientific methodologies have traditionally not been subject to intellectual property protection. In recent years, intellectual property has played a greater role in scientific work. While intellectual property rights may have a positive role to play in some fields of science, so does the public domain. This paper will discuss some of the positive functions of the public domain and ways in which certain legal developments may negatively impact the public domain. It suggests some steps that scientists can take to preserve the positive functions of the public domain for science.
Awareness and use of electronic databases by public library users ...

African Journals Online (AJOL)

The study investigated awareness, access and use of electronic databases by public library users in Ibadan, Oyo State, Nigeria. The purpose of this study was to determine public library users' awareness of electronic databases, find out what these users used electronic databases to do and to identify problems associated ...
Analysis of commercial and public bioactivity databases.

Science.gov (United States)

Tiikkainen, Pekka; Franke, Lutz

2012-02-27

Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.
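The two comparisons reported in this abstract, how much each source contributes uniquely and how often sources disagree on a data point taken from the same article, can be sketched as below. This is an assumed data layout, not the authors' metabase schema: the function name `compare_sources`, the record tuple, and the `tolerance` parameter are hypothetical.

```python
from collections import defaultdict


def compare_sources(records, tolerance=0.0):
    """Summarize overlap and disagreement between bioactivity sources.

    records: iterable of (source, article_id, compound, target, value).
    Returns (unique_counts, shared_count, inconsistent_keys), where a key is
    (article_id, compound, target).
    """
    by_key = defaultdict(dict)
    for source, article, compound, target, value in records:
        by_key[(article, compound, target)][source] = value

    unique_counts = defaultdict(int)
    shared_count = 0
    inconsistent_keys = []
    for key, values in by_key.items():
        if len(values) == 1:
            # Data point reported by a single source only.
            unique_counts[next(iter(values))] += 1
        else:
            shared_count += 1
            # Same article, compound and target, but differing values.
            if max(values.values()) - min(values.values()) > tolerance:
                inconsistent_keys.append(key)
    return dict(unique_counts), shared_count, sorted(inconsistent_keys)
```

Keying on the cited article as well as the compound and target mirrors the abstract's point that inconsistencies were measured only among data points derived from the same publication.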
Documentation for the U.S. Geological Survey Public-Supply Database (PSDB): A database of permitted public-supply wells, surface-water intakes, and systems in the United States

Science.gov (United States)

Price, Curtis V.; Maupin, Molly A.

2014-01-01

The U.S. Geological Survey (USGS) has developed a database containing information about wells, surface-water intakes, and distribution systems that are part of public water systems across the United States, its territories, and possessions. Programs of the USGS such as the National Water Census, the National Water Use Information Program, and the National Water-Quality Assessment Program all require a complete and current inventory of public water systems, the sources of water used by those systems, and the size of populations served by the systems across the Nation. Although the U.S. Environmental Protection Agency’s Safe Drinking Water Information System (SDWIS) database already exists as the primary national Federal database for information on public water systems, the Public-Supply Database (PSDB) was developed to add value to SDWIS data with enhanced location and ancillary information, and to provide links to other databases, including the USGS’s National Water Information System (NWIS) database.
A novel approach: chemical relational databases, and the role of the ISSCAN database on assessing chemical carcinogenicity.

Science.gov (United States)

Benigni, Romualdo; Bossa, Cecilia; Richard, Ann M; Yang, Chihae

2008-01-01

Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemical risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did not contain chemical structures. Concepts and technologies originating from structure-activity relationship science have provided powerful tools to create new types of databases, where the effective linkage of chemical toxicity with chemical structure can facilitate and greatly enhance data gathering and hypothesis generation, by permitting: a) exploration across both chemical and biological domains; and b) structure-searchability through the data. This paper reviews the main public databases, together with the progress in the field of chemical relational databases, and presents the ISSCAN database on experimental chemical carcinogens.
From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF.

Science.gov (United States)

Kılıç, Sefa; Sagitova, Dinara M; Wolfish, Shoshannah; Bely, Benoit; Courtot, Mélanie; Ciufo, Stacy; Tatusova, Tatiana; O'Donovan, Claire; Chibucos, Marcus C; Martin, Maria J; Erill, Ivan

2016-01-01

Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges for their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has progressively also become a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data. Database URL: http://www.collectf.org/. © The Author(s) 2016
The Definition, Dimensions, and Domain of Public Relations.

Science.gov (United States)

Hutton, James G.

1999-01-01

Discusses how the field of public relations has left itself vulnerable to other fields that are making inroads into public relations' traditional domain, and to critics who are filling in their own definitions of public relations. Proposes a definition and a three-dimensional framework to compare competing philosophies of public relations and to…
Initial spatio-temporal domain expansion of the Modelfest database

Science.gov (United States)

Carney, Thom; Mozaffari, Sahar; Sun, Sean; Johnson, Ryan; Shirvastava, Sharona; Shen, Priscilla; Ly, Emma

2013-03-01

The first Modelfest group publication appeared in the SPIE Human Vision and Electronic Imaging conference proceedings in 1999. "One of the group's goals is to develop a public database of test images with threshold data from multiple laboratories for designing and testing HVS (Human Vision Models)." After extended discussions the group selected a set of 45 static images thought to best meet that goal and collected psychophysical detection data which is available on the WEB and presented in the 2000 SPIE conference proceedings. Several groups have used these datasets to test spatial modeling ideas. Further discussions led to the preliminary stimulus specification for extending the database into the temporal domain which was published in the 2002 conference proceeding. After a hiatus of 12 years, some of us have collected spatio-temporal thresholds on an expanded stimulus set of 41 video clips; the original specification included 35 clips. The principal change involved adding one additional spatial pattern beyond the three originally specified. The stimuli consisted of 4 spatial patterns, Gaussian Blob, 4 c/d Gabor patch, 11.3 c/d Gabor patch and a 2D white noise patch. Across conditions the patterns were temporally modulated over a range of approximately 0-25 Hz as well as temporal edge and pulse modulation conditions. The display and data collection specifications were as specified by the Modelfest groups in the 2002 conference proceedings. To date seven subjects have participated in this phase of the data collection effort, one of which also participated in the first phase of Modelfest. Three of the spatio-temporal stimuli were identical to conditions in the original static dataset. Small differences in the thresholds were evident and may point to a stimulus limitation. The temporal CSF peaked between 4 and 8 Hz for the 0 c/d (Gaussian blob) and 4 c/d patterns. The 4 c/d and 11.3 c/d Gabor temporal CSF was low pass while the 0 c/d pattern was band pass. This
Prototype Food and Nutrient Database for Dietary Studies: Branded Food Products Database for Public Health Proof of Concept

Science.gov (United States)

The Prototype Food and Nutrient Database for Dietary Studies (Prototype FNDDS) Branded Food Products Database for Public Health is a proof of concept database. The database contains a small selection of food products which is being used to exhibit the approach for incorporation of the Branded Food ...
Using the structure-function linkage database to characterize functional domains in enzymes.

Science.gov (United States)

Brown, Shoshana; Babbitt, Patricia

2014-12-12

The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases. Copyright © 2014 John Wiley & Sons, Inc.
Academic impact of a public electronic health database: bibliometric analysis of studies using the general practice research database.

Directory of Open Access Journals (Sweden)

Yu-Chun Chen

Full Text Available BACKGROUND: Studies that use electronic health databases as research material are getting popular but the influence of a single electronic health database had not been well investigated yet. The United Kingdom's General Practice Research Database (GPRD) is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact by a single public health database. METHODOLOGY AND FINDINGS: A total of 749 studies published between 1995 and 2009 with 'General Practice Research Database' as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors) contributed nearly half (47.9%) of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of "Pharmacology and Pharmacy", "General and Internal Medicine", and "Public, Environmental and Occupational Health". The UK and United States were the two most active regions of GPRD studies. One-third of GPRD studies were internationally co-authored. CONCLUSIONS: A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research.
Cultural Heritage and the Public Domain

Directory of Open Access Journals (Sweden)

Bas Savenije

2012-09-01

by providing their resources on the Internet” (Berlin Declaration 2003). Therefore, in the spirit of the Berlin Declaration, the ARL encourages its members’ libraries to grant all non-commercial users “a free, irrevocable, worldwide, right of access to, and a license to copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship”. And: “If fees are to be assessed for the use of digitised public domain works, those fees should only apply to commercial uses” (ARL Principles July 2010). In our view, cultural heritage institutions should make public domain material digitised with public funding as widely available as possible for access and reuse. The public sector has the primary responsibility to fund digitisation. The involvement of private partners, however, is encouraged by ARL as well as the Comité des Sages. Private funding for digitisation is a complement to the necessary public investment, especially in times of economic crisis, but should not be seen as a substitute for public funding. As we can see from these reports there are a number of arguments in favour of digitisation and also of providing maximum accessibility to the digitised cultural heritage. In this paper we will investigate the legal aspects of digitisation of cultural heritage, especially public domain material. On the basis of these we will make an inventory of policy considerations regarding reuse. Furthermore, we will describe the conclusions the National Library of the Netherlands (hereafter: KB) has formulated and the arguments that support these. In this context we will review public-private partnerships and also the policy of the KB. We will conclude with recommendations for cultural heritage institutions concerning a reuse policy for digitised public domain material.
A spatial national health facility database for public health sector planning in Kenya in 2008

Directory of Open Access Journals (Sweden)

Gething Peter W

2009-03-01

improving planning. Expansion in public health care in Kenya has resulted in significant increases in geographic access although several areas of the country need further improvements. This information is key to future planning and with this paper we have released the digital spatial database in the public domain to assist the Kenyan Government and its partners in the health sector.
A spatial national health facility database for public health sector planning in Kenya in 2008.

Science.gov (United States)

Noor, Abdisalan M; Alegana, Victor A; Gething, Peter W; Snow, Robert W

2009-03-06

resulted in significant increases in geographic access although several areas of the country need further improvements. This information is key to future planning and with this paper we have released the digital spatial database in the public domain to assist the Kenyan Government and its partners in the health sector.
Academic Impact of a Public Electronic Health Database: Bibliometric Analysis of Studies Using the General Practice Research Database

Science.gov (United States)

Chen, Yu-Chun; Wu, Jau-Ching; Haschler, Ingo; Majeed, Azeem; Chen, Tzeng-Ji; Wetter, Thomas

2011-01-01

Background Studies that use electronic health databases as research material are getting popular but the influence of a single electronic health database had not been well investigated yet. The United Kingdom's General Practice Research Database (GPRD) is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact by a single public health database. Methodology and Findings A total of 749 studies published between 1995 and 2009 with ‘General Practice Research Database’ as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors) contributed nearly half (47.9%) of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of “Pharmacology and Pharmacy”, “General and Internal Medicine”, and “Public, Environmental and Occupational Health”. The UK and United States were the two most active regions of GPRD studies. One-third of GPRD studies were internationally co-authored. Conclusions A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research. PMID:21731733
75 FR 41180 - Notice of Order: Revisions to Enterprise Public Use Database

Science.gov (United States)

2010-07-15

.... This responsibility to maintain a public use database (PUDB) for such mortgage data was transferred to... FEDERAL HOUSING FINANCE AGENCY [No. 2010-N-10] Notice of Order: Revisions to Enterprise Public Use Database AGENCY: Federal Housing Finance Agency. ACTION: Notice of order. SUMMARY: Section 1323(a)(1) of...
Experiences with IR Top N Optimization in a Main Memory DBMS: Applying 'The Database Approach' in New Domains

NARCIS (Netherlands)

Read, B.; Blok, H.E.; de Vries, A.P.; Blanken, Henk; Apers, Peter M.G.

Data abstraction and query processing techniques are usually studied in the domain of administrative applications. We present a case-study in the non-standard domain of (multimedia) information retrieval, mainly intended as a feasibility study in favor of the `database approach' to data management.
TabSQL: a MySQL tool to facilitate mapping user data to public databases.

Science.gov (United States)

Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng

2010-06-23

With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
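The kind of cross-table query described in this abstract can be illustrated with a minimal sketch. It is not TabSQL itself: it uses Python's built-in SQLite module as a stand-in for MySQL, and the table names (`user_data`, `annotation`), the column names and the function `annotate` are hypothetical, chosen only to show a join between a user's rows and an imported annotation table.

```python
import sqlite3


def annotate(user_rows, annotation_rows):
    """Join user measurements to annotation rows on a shared gene identifier.

    Both table layouts are assumptions made for this sketch:
    user_data(gene_id, score) and annotation(gene_id, term).
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE user_data (gene_id TEXT, score REAL)")
    conn.execute("CREATE TABLE annotation (gene_id TEXT, term TEXT)")
    conn.executemany("INSERT INTO user_data VALUES (?, ?)", user_rows)
    conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotation_rows)
    # LEFT JOIN keeps user rows that have no annotation (term is NULL/None).
    rows = conn.execute(
        "SELECT u.gene_id, u.score, a.term "
        "FROM user_data AS u "
        "LEFT JOIN annotation AS a ON u.gene_id = a.gene_id "
        "ORDER BY u.gene_id, a.term"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Toy rows, invented purely for illustration.
    user = [("g1", 2.5), ("g2", 0.1)]
    public = [("g1", "termA"), ("g1", "termB")]
    for row in annotate(user, public):
        print(row)
```

A left join is used here so that unannotated user rows remain visible; an inner join would silently drop them, which may or may not be what an analyst wants.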

Implementing database system for LHCb publications page

CERN Document Server

Abdullayev, Fakhriddin

2017-01-01

LHCb is one of the main detectors of the Large Hadron Collider, where physicists and scientists work together on high-precision measurements of matter-antimatter asymmetries and searches for rare and forbidden decays, with the aim of discovering new and unexpected forces. The work consists not only of analyzing data collected from experiments but also of publishing the results of those analyses. The LHCb publications are gathered on the LHCb publications page to maximize their availability to both LHCb members and the high energy community. In this project a new database system was implemented for the LHCb publications page. This will help to improve access to research papers for scientists and allow better integration with the current CERN library website and others.
A publication database for optical long baseline interferometry

Science.gov (United States)

Malbet, Fabien; Mella, Guillaume; Lawson, Peter; Taillifet, Esther; Lafrasse, Sylvain

2010-07-01

Optical long baseline interferometry is a technique that has generated almost 850 refereed papers to date. The targets span a large variety of objects from planetary systems to extragalactic studies and all branches of stellar physics. We have created a database hosted by the JMMC and connected to the Optical Long Baseline Interferometry Newsletter (OLBIN) web site using MySQL and a collection of XML or PHP scripts in order to store and classify these publications. Each entry is defined by its ADS bibcode and includes basic ADS information and metadata. The metadata are specified by tags sorted in categories: interferometric facilities, instrumentation, wavelength of operation, spectral resolution, type of measurement, target type, and paper category, for example. The whole OLBIN publication list has been processed and we present how the database is organized and can be accessed. We use this tool to generate statistical plots of interest for the community in optical long baseline interferometry.
Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Directory of Open Access Journals (Sweden)

Alexandra M Schnoes

2009-12-01

Full Text Available Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Science.gov (United States)

Schnoes, Alexandra M; Brown, Shoshana D; Dodevski, Igor; Babbitt, Patricia C

2009-12-01

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Error analysis of a public domain pronunciation dictionary

CSIR Research Space (South Africa)

Martirosian, O

2007-11-01

Full Text Available ], a popular public domain resource that is widely used in English speech processing systems. The techniques being investigated are applied to the lexicon and the results of each step are illustrated using sample entries. The authors found that as many...
Conserved Domain Database (CDD)

Data.gov (United States)

U.S. Department of Health & Human Services — CDD is a protein annotation resource that consists of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins.
Assessment of Residential History Generation Using a Public-Record Database

Directory of Open Access Journals (Sweden)

David C. Wheeler

2015-09-01

Full Text Available In studies of disease with potential environmental risk factors, residential location is often used as a surrogate for unknown environmental exposures or as a basis for assigning environmental exposures. These studies most typically use the residential location at the time of diagnosis due to ease of collection. However, previous residential locations may be more useful for risk analysis because of population mobility and disease latency. When residential histories have not been collected in a study, it may be possible to generate them through public-record databases. In this study, we evaluated the ability of a public-records database from LexisNexis to provide residential histories for subjects in a geographically diverse cohort study. We calculated 11 performance metrics comparing study-collected addresses and two address retrieval services from LexisNexis. We found 77% and 90% match rates for city and state and 72% and 87% detailed address match rates with the basic and enhanced services, respectively. The enhanced LexisNexis service covered 86% of the time at residential addresses recorded in the study. The mean match rate for detailed address matches varied spatially over states. The results suggest that public record databases can be useful for reconstructing residential histories for subjects in epidemiologic studies.
Ubiquitin domain proteins in disease

DEFF Research Database (Denmark)

Klausen, Louise Kjær; Schulze, Andrea; Seeger, Michael

2007-01-01

The human genome encodes several ubiquitin-like (UBL) domain proteins (UDPs). Members of this protein family are involved in a variety of cellular functions and many are connected to the ubiquitin proteasome system, an essential pathway for protein degradation in eukaryotic cells. Despite...... and cancer. Publication history: Republished from Current BioData's Targeted Proteins database (TPdb; http://www.targetedproteinsdb.com)....
Power source roadmaps using bibliometrics and database tomography

International Nuclear Information System (INIS)

Kostoff, R.N.; Tshiteya, R.; Pfeil, K.M.; Humenik, J.A.; Karypis, G.

2005-01-01

Database Tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multi-word phrase frequencies and phrase proximities (physical closeness of the multi-word technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT was used to derive technical intelligence from a Power Sources database derived from the Science Citation Index. Phrase frequency analysis by the technical domain experts provided the pervasive technical themes of the Power Sources database, and the phrase proximity analysis provided the relationships among the pervasive technical themes. Bibliometric analysis of the Power Sources literature supplemented the DT results with author/journal/institution/country publication and citation data.
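The two quantities named in this abstract, multi-word phrase frequency and phrase proximity, can be sketched in a few lines. The code below is not the DT system: it substitutes plain n-gram counting and a fixed token window, and the function names, the default window size and the tokenizer are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_frequencies(text, n=2):
    """Count every run of n consecutive tokens (a simple n-gram count)."""
    tokens = tokenize(text)
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def phrase_proximity(text, theme, n=2, window=5):
    """Count n-word phrases starting within `window` tokens of each theme hit."""
    tokens = tokenize(text)
    theme_tokens = tokenize(theme)
    m = len(theme_tokens)
    theme_phrase = " ".join(theme_tokens)
    # Start positions at which the theme phrase occurs.
    hits = [i for i in range(len(tokens) - m + 1) if tokens[i:i + m] == theme_tokens]
    nearby = Counter()
    for h in hits:
        first = max(0, h - window)
        last = min(len(tokens) - n, h + m - 1 + window)  # last valid start index
        for i in range(first, last + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase != theme_phrase:  # do not count the theme against itself
                nearby[phrase] += 1
    return nearby


if __name__ == "__main__":
    sample = "fuel cell stack uses fuel cell membrane"  # toy sentence
    print(phrase_frequencies(sample).most_common(3))
    print(phrase_proximity(sample, "fuel cell", window=1))
```

In this simplified reading, high-frequency phrases stand in for the "pervasive themes" and the proximity counts stand in for the relationships among them; the interpretive step remains with the human analyst.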
Colil: a database and search service for citation contexts in the life sciences domain.

Science.gov (United States)

Fujiwara, Toyofumi; Yamamoto, Yasunori

2015-01-01

To promote research activities in a particular research area, it is important to efficiently identify current research trends, advances, and issues in that area. Although review papers in the research area can suffice for this purpose in general, researchers are not necessarily able to obtain these papers from research aspects of their interests at the time they are required. Therefore, the utilization of the citation contexts of papers in a research area has been considered as another approach. However, there are few search services to retrieve citation contexts in the life sciences domain; furthermore, efficiently obtaining citation contexts is becoming difficult due to the large volume and rapid growth of life sciences papers. Here, we introduce the Colil (Comments on Literature in Literature) database to store citation contexts in the life sciences domain. By using the Resource Description Framework (RDF) and a newly compiled vocabulary, we built the Colil database and made it available through the SPARQL endpoint. In addition, we developed a web-based search service called Colil that searches for a cited paper in the Colil database and then returns a list of citation contexts for it along with papers relevant to it based on co-citations. The citation contexts in the Colil database were extracted from full-text papers of the PubMed Central Open Access Subset (PMC-OAS), which includes 545,147 papers indexed in PubMed. These papers are distributed across 3,171 journals and cite 5,136,741 unique papers that correspond to approximately 25 % of total PubMed entries. By utilizing Colil, researchers can easily refer to a set of citation contexts and relevant papers based on co-citations for a target paper. Colil helps researchers to comprehend life sciences papers in a research area more efficiently and makes their biological research more efficient.
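Ranking related papers by co-citation, as mentioned in this abstract, can be sketched as follows. This is an illustrative reading rather than the Colil implementation: the input format (a mapping from each citing paper to the set of papers it cites) and the function name `co_cited` are assumptions, and the identifiers in the usage example are placeholders.

```python
from collections import Counter


def co_cited(citations, target):
    """Rank papers by how often they are cited together with `target`.

    `citations` maps a citing paper ID to the set of paper IDs it cites
    (a hypothetical input format chosen for this sketch).
    """
    counts = Counter()
    for cited in citations.values():
        if target in cited:
            for other in cited:
                if other != target:
                    counts[other] += 1
    # Sort by descending count, then by ID so that ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Placeholder identifiers, not real papers.
    example = {
        "p1": {"A", "B"},
        "p2": {"A", "B", "C"},
        "p3": {"B", "C"},
    }
    print(co_cited(example, "A"))
```

Only papers that cite the target contribute to the counts, so `p3` in the example has no effect on the ranking for `A`.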
Data publication: towards a database of everything

Directory of Open Access Journals (Sweden)

Smith Vincent S

2009-06-01

Full Text Available Abstract The fabric of science is changing, driven by a revolution in digital technologies that facilitate the acquisition and communication of massive amounts of data. This is changing the nature of collaboration and expanding opportunities to participate in science. If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. The publication of primary data is not a universal or mandatory part of science, and despite policies and proclamations to the contrary, calls to make data publicly available have largely gone unheeded. In this short essay I consider why, and explore some of the challenges that lie ahead, as we work toward a database of everything.
76 FR 60031 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

Science.gov (United States)

2011-09-28

... single-family matrix in FHFA's Public Use Database (PUDB) to include data fields for the high-cost single... Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data... amended, it is necessary to revise the single-family matrix of FHFA's Public Use Database (PUDB) by adding...
Terminology of the public relations field: corpus — automatic term recognition — terminology database

Directory of Open Access Journals (Sweden)

Nataša Logar Berginc

2013-12-01

Full Text Available The article describes an analysis of automatic term recognition results performed for single- and multi-word terms with the LUIZ term extraction system. The target application of the results is a terminology database of Public Relations and the main resource the KoRP Public Relations Corpus. Our analysis is focused on two segments: (a) single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and evaluate termhood on the basis of the judgements of two domain experts, and (b) multi-word term candidates with verb and noun as headword. In order to better assess the performance of the system and the soundness of our approach we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The most productive patterns of multi-word terms with noun as a headword have the following structure: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.
mirPub: a database for searching microRNA publications.

Science.gov (United States)

Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore

2015-05-01

Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. mirPub is freely available at http://www.microrna.gr/mirpub/. Contact: vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
76 FR 77533 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

Science.gov (United States)

2011-12-13

..., regarding FHFA's adoption of an Order revising FHFA's Public Use Database matrices to include certain data... FEDERAL HOUSING FINANCE AGENCY [No. 2011-N-13] Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data Field...
DIMA 3.0: Domain Interaction Map.

Science.gov (United States)

Luo, Qibin; Pagel, Philipp; Vilne, Baiba; Frishman, Dmitrij

2011-01-01

Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46,900 domain interactions predicted by four computational methods: domain phylogenetic profiling, domain pair exclusion algorithm, correlated mutations and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
The CATH database

Directory of Open Access Journals (Sweden)

Knudsen Michael

2010-02-01

Full Text Available Abstract The CATH database provides hierarchical classification of protein domains based on their folding patterns. Domains are obtained from protein structures deposited in the Protein Data Bank and both domain identification and subsequent classification use manual as well as automated procedures. The accompanying website http://www.cathdb.info provides an easy-to-use entry to the classification, allowing for both browsing and downloading of data. Here, we give a brief review of the database, its corresponding website and some related tools.
Accessing the public MIMIC-II intensive care relational database for clinical research.

Science.gov (United States)

Scott, Daniel J; Lee, Joon; Silva, Ikaro; Park, Shinhyuk; Moody, George B; Celi, Leo A; Mark, Roger G

2013-01-10

The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge "Predicting mortality of ICU Patients". QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database.
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.

Science.gov (United States)

Saccone, Scott F; Quan, Jiaxi; Jones, Peter L

2012-04-15

Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators both to visualize data provenance and to explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
A Public Domain Software Library for Reading and Language Arts.

Science.gov (United States)

Balajthy, Ernest

A three-year project carried out by the Microcomputers and Reading Committee of the New Jersey Reading Association involved the collection, improvement, and distribution of free microcomputer software (public domain programs) designed to deal with reading and writing skills. Acknowledging that this free software is not without limitations (poor…

A Public Image Database for Benchmark of Plant Seedling Classification Algorithms

DEFF Research Database (Denmark)

Giselsson, Thomas Mosgaard; Nyholm Jørgensen, Rasmus; Jensen, Peter Kryger

A database of images of approximately 960 unique plants belonging to 12 species at several growth stages is made publicly available. It comprises annotated RGB images with a physical resolution of roughly 10 pixels per mm. To standardise the evaluation of classification results obtained...
Criteria for compilation of a site-specific thermodynamic database for geochemical speciation calculations

International Nuclear Information System (INIS)

Chandratillake, M.; Trivedi, D.P.; Randall, M.G.; Humphreys, P.N.

1998-01-01

A methodology has been developed to establish a site-specific database appropriate for geochemical modelling of the critical components and the wide range of near-field conditions expected at the low-level radioactive waste disposal site at Drigg in the UK. Several databases available in the public domain have been compared to select a foundation database. The foundation database was 'trimmed down' and then customised to suit Drigg applications. The species dominant at Drigg have been identified and the thermodynamic constants of these species have been critically evaluated. The evaluated database has been validated for quality by comparing speciation calculations with plutonium and uranium experimental solubility results. (orig.)
Domain Regeneration for Cross-Database Micro-Expression Recognition

Science.gov (United States)

Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

2018-05-01

In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions, and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG). By using TSRG, we are able to re-generate the samples from the target micro-expression database so that the re-generated target samples share the same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned from the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
37 CFR 201.26 - Recordation of documents pertaining to computer shareware and donation of public domain computer...

Science.gov (United States)

2010-07-01

... pertaining to computer shareware and donation of public domain computer software. 201.26 Section 201.26... GENERAL PROVISIONS § 201.26 Recordation of documents pertaining to computer shareware and donation of public domain computer software. (a) General. This section prescribes the procedures for submission of...
Teaching Case: Adapting the Access Northwind Database to Support a Database Course

Science.gov (United States)

Dyer, John N.; Rogers, Camille

2015-01-01

A common problem encountered when teaching database courses is that few large illustrative databases exist to support teaching and learning. Most database textbooks have small "toy" databases that are chapter objective specific, and thus do not support application over the complete domain of design, implementation and management concepts…
Microcomputer Database Management Systems that Interface with Online Public Access Catalogs.

Science.gov (United States)

Rice, James

1988-01-01

Describes a study that assessed the availability and use of microcomputer database management interfaces to online public access catalogs. The software capabilities needed to effect such an interface are identified, and available software packages are evaluated by these criteria. A directory of software vendors is provided. (4 notes with…
Personal Publications Lists Serve as a Reliable Calibration Parameter to Compare Coverage in Academic Citation Databases with Scientific Social Media

Directory of Open Access Journals (Sweden)

Emma Hughes

2017-03-01

Full Text Available A Review of: Hilbert, F., Barth, J., Gremm, J., Gros, D., Haiter, J., Henkel, M., Reinhardt, W., & Stock, W.G. (2015). Coverage of academic citation databases compared with coverage of scientific social media: personal publication lists as calibration parameters. Online Information Review 39(2): 255-264. http://dx.doi.org/10.1108/OIR-07-2014-0159 Objective – The purpose of this study was to explore coverage rates of information science publications in academic citation databases and scientific social media using a new method of personal publication lists as a calibration parameter. The research questions were: How many publications are covered in different databases, which has the best coverage, and what institutions are represented and how does the language of the publication play a role? Design – Bibliometric analysis. Setting – Academic citation databases (Web of Science, Scopus, Google Scholar) and scientific social media (Mendeley, CiteULike, BibSonomy). Subjects – 1,017 library and information science publications produced by 76 information scientists at 5 German-speaking universities in Germany and Austria. Methods – Only documents which were published between 1 January 2003 and 31 December 2012 were included. In that time the 76 information scientists had produced 1,017 documents. The information scientists confirmed that their publication lists were complete and these served as the calibration parameter for the study. The citations from the publication lists were searched in three academic databases: Google Scholar, Web of Science (WoS), and Scopus; as well as three social media citation sites: Mendeley, CiteULike, and BibSonomy, and the results were compared. The publications were searched for by author name and words from the title. Main results – None of the databases investigated had 100% coverage. In the academic databases, Google Scholar had the highest amount of coverage with an average of 63%, Scopus an average of 31%, and
Characteristics of scientific web publications

DEFF Research Database (Denmark)

Thorlund Jepsen, Erik; Seiden, Piet; Ingwersen, Peter Emil Rerup

2004-01-01

were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality...... of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various...... types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both Alta...
Accumulation of Domain-Specific Physical Inactivity and Presence of Hypertension in Brazilian Public Healthcare System.

Science.gov (United States)

Turi, Bruna Camilo; Codogno, Jamile S; Fernandes, Romulo A; Sui, Xuemei; Lavie, Carl J; Blair, Steven N; Monteiro, Henrique Luiz

2015-11-01

Hypertension is one of the most common noncommunicable diseases worldwide, and physical inactivity is a risk factor predisposing to its occurrence and complications. However, the association between physical inactivity domains and hypertension remains unclear, especially in public healthcare systems. Thus, this study aimed to investigate the association between the aggregation of physical inactivity in different domains and the prevalence of hypertension among users of the Brazilian public health system. The sample comprised 963 participants. Subjects were divided into quartile groups according to 3 different domains of physical activity (occupational; physical exercises; and leisure-time and transportation). Hypertension was based on physician diagnosis. Physical inactivity in the occupational domain was significantly associated with a higher prevalence of hypertension (OR = 1.52 [1.05 to 2.21]). The same pattern occurred for physical inactivity in leisure-time (OR = 1.63 [1.11 to 2.39]) and aggregation of physical inactivity in 3 domains (OR = 2.46 [1.14 to 5.32]). However, the multivariate-adjusted model showed a significant association between hypertension and physical inactivity in 3 domains (OR = 2.57 [1.14 to 5.79]). The results suggest an unequal prevalence of hypertension according to physical inactivity across different domains, and increased promotion of physical activity in the healthcare system is needed.
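
The results above are reported as odds ratios with confidence intervals. As a rough illustration of how such a figure can be read, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. This is a textbook approach, not necessarily the authors' method (the abstract also reports a multivariate-adjusted model), and the counts used are hypothetical, not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval whose lower bound stays above 1, as in each of the intervals quoted in the abstract, is what is conventionally described as a significant positive association.
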
The International Nucleotide Sequence Database Collaboration.

Science.gov (United States)

Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

2011-01-01

Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.
Language Choice and Use of Malaysian Public University Lecturers in the Education Domain

Directory of Open Access Journals (Sweden)

Tam Lee Mei

2016-02-01

Full Text Available It is the norm for people from a multilingual and multicultural country such as Malaysia to speak two or more languages. Thus, the Malaysian multilingual situation has resulted in speakers having to make decisions about which languages are to be used for different purposes in different domains. In order to explain the phenomenon of language choice, Fishman's domain analysis (1964) was adapted for this research. According to Fishman’s domain analysis, language choice and use may depend on the speaker’s experiences situated in different settings, different language repertoires that are available to the speaker, different interlocutors and different topics. Such situations inevitably cause barriers and difficulties to those professionals who work in the education domain. Therefore, the purpose of this research is to explore the language choice and use of Malaysian public university lecturers in the education domain and to investigate whether any significant differences exist between ethnicity and field of study with the English language choice and use of the lecturers. 200 survey questionnaires were distributed to examine the details of the lecturers’ language choice and use. The findings of this research reveal that all of the respondents generally preferred to choose and use the English language in both formal and informal education domains. Besides, all of the respondents claimed that they chose and used more than one language. It is also found that the ethnicity and field of study of the respondents influence language choice and use in the education domain. In addition, this research suggests that language and educational policy makers have been largely successful in raising the role and status of the English language as the medium of instruction in tertiary education while maintaining the Malay language as having an important role in communicative acts, thus characterizing the lecturers’ language choice and use. Keywords: Language
Toward fish and seafood traceability: anchovy species determination in fish products by molecular markers and support through a public domain database.

Science.gov (United States)

Jérôme, Marc; Martinsohn, Jann Thorsten; Ortega, Delphine; Carreau, Philippe; Verrez-Bagnis, Véronique; Mouchel, Olivier

2008-05-28

Traceability in the fish food sector plays an increasingly important role for consumer protection and confidence building. This is reflected by the introduction of legislation and rules covering traceability on national and international levels. Although traceability through labeling is well established and supported by respective regulations, monitoring and enforcement of these rules are still hampered by the lack of efficient diagnostic tools. We describe protocols using a direct sequencing method based on 212-274-bp diagnostic sequences derived from species-specific mitochondrial DNA cytochrome b, 16S rRNA, and cytochrome oxidase subunit I sequences which can efficiently be applied to unambiguously determine even closely related fish species in processed food products labeled "anchovy". Traceability of anchovy-labeled products is supported by the public online database AnchovyID (http://anchovyid.jrc.ec.europa.eu), which provided data obtained during our study and tools for analytical purposes.
Coverage and quality: A comparison of Web of Science and Scopus databases for reporting faculty nursing publication metrics.

Science.gov (United States)

Powell, Kimberly R; Peterson, Shenita R

Web of Science and Scopus are the leading databases of scholarly impact. Recent studies outside the field of nursing report differences in journal coverage and quality. This study presents a comparative analysis of the reported impact of nursing publications. Journal coverage by each database for the field of nursing was compared. Additionally, publications by 2014 nursing faculty were collected in both databases and compared for overall coverage and reported quality, as modeled by SCImago Journal Rank, peer review status, and MEDLINE inclusion. Individual author impact, modeled by the h-index, was calculated by each database for comparison. Scopus offered significantly higher journal coverage. For 2014 faculty publications, 100% of journals were found in Scopus, whereas Web of Science offered 82%. No significant difference was found in the quality of reported journals. Author h-index was found to be higher in Scopus. When reporting faculty publications and scholarly impact, academic nursing programs may be better represented by Scopus, without compromising journal quality. Programs with strong interdisciplinary work should examine all areas of strength to ensure appropriate coverage. Copyright © 2017 Elsevier Inc. All rights reserved.
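
The study models individual author impact with the h-index. For readers unfamiliar with the measure, here is a minimal sketch of the standard definition (the largest h such that h publications each have at least h citations). The citation counts below are hypothetical; the sketch only shows why the same author can receive different values from two databases that index different citing sources.

```python
def h_index(citations):
    """Return the h-index for a list of per-publication citation counts."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # counts only decrease from here, so stop
    return h


# Hypothetical citation counts for one author as seen by two databases.
counts_database_a = [10, 8, 5, 4, 3]
counts_database_b = [25, 8, 5, 3, 3]
h_a = h_index(counts_database_a)
h_b = h_index(counts_database_b)
```

Because the index depends on the whole ranked list rather than the total, a database that records more citations for some papers does not automatically produce a higher value.
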
Blockchain-based Public Key Infrastructure for Inter-Domain Secure Routing

OpenAIRE

de la Rocha Gómez-Arevalillo, Alfonso; Papadimitratos, Panos

2017-01-01

A gamut of secure inter-domain routing protocols has been proposed in the literature. They use traditional PGP-like and centralized Public Key Infrastructures for trust management. In this paper, we propose our alternative approach for managing security associations, Secure Blockchain Trust Management (SBTM), a trust management system that instantiates a blockchain-based PKI for the operation of secure routing protocols. A main motivation for SBTM is to facilitate gradu...
Remotely Piloted Aircraft and War in the Public Relations Domain

Science.gov (United States)

2014-10-01

the terms as they appear in quoted texts. 2. Peter Kreeft, Socratic Logic: A Logic Text Using Socratic Method, Platonic Questions, and Aristotelian...Ronald Brooks.22 This method of refuting an argument reflects option C (above), demonstrating that the conclusion does not follow from the premises...and War in the Public Relations Domain Feature tional Security Assistance Force (ISAF) met to discuss methods of eliminating civilian casualties in
Axiomatic Specification of Database Domain Statics

NARCIS (Netherlands)

Wieringa, Roelf J.

1987-01-01

In the past ten years, much work has been done to add more structure to database models than what is represented by a mere collection of flat relations (Albano & Cardelli [1985], Albano et al. [1986], Borgida et al. [1984], Brodie [1984], Brodie & Ridjanovic [1984], Brodie & Silva [1982], Codd
SoyDB: a knowledge database of soybean transcription factors

Directory of Open Access Journals (Sweden)

Valliyodan Babu

2010-01-01

Full Text Available Abstract Background Transcription factors play the crucial role of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB - a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.
Logical database design principles

CERN Document Server

Garmany, John; Clark, Terry

2005-01-01

INTRODUCTION TO LOGICAL DATABASE DESIGN Understanding a Database Database Architectures Relational Databases Creating the Database System Development Life Cycle (SDLC) Systems Planning: Assessment and Feasibility System Analysis: Requirements System Analysis: Requirements Checklist Models Tracking and Schedules Design Modeling Functional Decomposition Diagram Data Flow Diagrams Data Dictionary Logical Structures and Decision Trees System Design: Logical SYSTEM DESIGN AND IMPLEMENTATION The ER Approach Entities and Entity Types Attribute Domains Attributes Set-Valued Attributes Weak Entities Constraint
A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics.

Directory of Open Access Journals (Sweden)

Qiang Song

Full Text Available DNA methylation is implicated in a surprising diversity of regulatory, evolutionary processes and diseases in eukaryotes. The introduction of whole-genome bisulfite sequencing has enabled the study of DNA methylation at a single-base resolution, revealing many new aspects of DNA methylation and highlighting the usefulness of methylome data in understanding a variety of genomic phenomena. As the number of publicly available whole-genome bisulfite sequencing studies reaches into the hundreds, reliable and convenient tools for comparing and analyzing methylomes become increasingly important. We present MethPipe, a pipeline for both low and high-level methylome analysis, and MethBase, an accompanying database of annotated methylomes from the public domain. Together these resources enable researchers to extract interesting features from methylomes and compare them with those identified in public methylomes in our database.
A public database of macromolecular diffraction experiments.

Science.gov (United States)

Grabowski, Marek; Langner, Karol M; Cymborowski, Marcin; Porebski, Przemyslaw J; Sroka, Piotr; Zheng, Heping; Cooper, David R; Zimmerman, Matthew D; Elsliger, Marc André; Burley, Stephen K; Minor, Wladek

2016-11-01

The low reproducibility of published experimental results in many scientific disciplines has recently garnered negative attention in scientific journals and the general media. Public transparency, including the availability of 'raw' experimental data, will help to address growing concerns regarding scientific integrity. Macromolecular X-ray crystallography has led the way in requiring the public dissemination of atomic coordinates and a wealth of experimental data, making the field one of the most reproducible in the biological sciences. However, there remains no mandate for public disclosure of the original diffraction data. The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) has been developed to archive raw data from diffraction experiments and, equally importantly, to provide related metadata. Currently, the database of our resource contains data from 2920 macromolecular diffraction experiments (5767 data sets), accounting for around 3% of all depositions in the Protein Data Bank (PDB), with their corresponding partially curated metadata. IRRMC utilizes distributed storage implemented using a federated architecture of many independent storage servers, which provides both scalability and sustainability. The resource, which is accessible via the web portal at http://www.proteindiffraction.org, can be searched using various criteria. All data are available for unrestricted access and download. The resource serves as a proof of concept and demonstrates the feasibility of archiving raw diffraction data and associated metadata from X-ray crystallographic studies of biological macromolecules. The goal is to expand this resource and include data sets that failed to yield X-ray structures in order to facilitate collaborative efforts that will improve protein structure-determination methods and to ensure the availability of 'orphan' data left behind for various reasons by individual investigators and/or extinct structural genomics

Quality criteria for electronic publications in medicine.

Science.gov (United States)

Schulz, S; Auhuber, T; Schrader, U; Klar, R

1998-01-01

This paper defines "electronic publications in medicine (EPM)" as computer based training programs, databases, knowledge-based systems, multimedia applications and electronic books running on standard platforms and available by usual distribution channels. A detailed catalogue of quality criteria as a basis for development and evaluation of EPMs is presented. The necessity to raise the quality level of electronic publications is stressed considering aspects of domain knowledge, software engineering, media development, interface design and didactics.
Development of a Publicly Available, Comprehensive Database of Fiber and Health Outcomes: Rationale and Methods.

Directory of Open Access Journals (Sweden)

Kara A Livingston

Full Text Available Dietary fiber is a broad category of compounds historically defined as partially or completely indigestible plant-based carbohydrates and lignin with, more recently, the additional criteria that fibers incorporated into foods as additives should demonstrate functional human health outcomes to receive a fiber classification. Thousands of research studies have been published examining fibers and health outcomes. (1) Develop a database listing studies testing fiber and physiological health outcomes identified by experts at the Ninth Vahouny Conference; (2) Use evidence mapping methodology to summarize this body of literature. This paper summarizes the rationale, methodology, and resulting database. The database will help both scientists and policy-makers to evaluate evidence linking specific fibers with physiological health outcomes, and identify missing information. To build this database, we conducted a systematic literature search for human intervention studies published in English from 1946 to May 2015. Our search strategy included a broad definition of fiber search terms, as well as search terms for nine physiological health outcomes identified at the Ninth Vahouny Fiber Symposium. Abstracts were screened using a priori defined eligibility criteria and a low threshold for inclusion to minimize the likelihood of rejecting articles of interest. Publications then were reviewed in full text, applying additional a priori defined exclusion criteria. The database was built and published on the Systematic Review Data Repository (SRDR™), a web-based, publicly available application. A fiber database was created. This resource will reduce the unnecessary replication of effort in conducting systematic reviews by serving as both a central database archiving PICO (population, intervention, comparator, outcome) data on published studies and as a searchable tool through which this data can be extracted and updated.
A Public Database of Memory and Naive B-Cell Receptor Sequences.

Directory of Open Access Journals (Sweden)

William S DeWitt

Full Text Available The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics.
A Public Database of Memory and Naive B-Cell Receptor Sequences.

Science.gov (United States)

DeWitt, William S; Lindau, Paul; Snyder, Thomas M; Sherwood, Anna M; Vignali, Marissa; Carlson, Christopher S; Greenberg, Philip D; Duerkopp, Natalie; Emerson, Ryan O; Robins, Harlan S

2016-01-01

The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics.
Suburban development – a search for public domains in Danish suburban neighbourhoods

DEFF Research Database (Denmark)

Melgaard, Bente; Bech-Danielsen, Claus

In recent years, some of the post-war Danish suburbs have been facing great challenges – social segregation, demographic changes and challenges in building technology. In particular, segregation prevents social life from unfolding across social, economic and cultural borders. Therefore, in this paper......, potentials for bridge-building across the enclaves of the suburb are looked for through a combined architectural-anthropological mapping of public spaces in a specific suburb in Denmark, the analyses being carried out in the light of Hajer & Reijndorp’s definition of public domains and the term exchange...
The Value of Privacy and Surveillance Drones in the Public Domain : Scrutinizing the Dutch Flexible Deployment of Mobile Cameras Act

NARCIS (Netherlands)

Gerdo Kuiper; Quirine Eijkman

2017-01-01

This article assesses the flexible deployment of drones in the public domain from a legal-philosophical perspective. On the basis of the theories of Dworkin and Moore, the distinction between individual rights and collective security policy goals is discussed. Mobile cameras in the public domain
dBBQs: dataBase of Bacterial Quality scores.

Science.gov (United States)

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA and tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, the results of which can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is freely available for non-commercial purposes.
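
The abstract notes that the quality scores can be used as cut-offs to select a high-quality set of genomes. A minimal sketch of that idea follows. The genome identifiers, the 0-to-1 score scale, and the threshold are all hypothetical; the abstract does not state how the four measurements are combined or which cut-off counts as acceptable.

```python
def filter_by_quality(scores, cutoff):
    """Return the identifiers whose quality score meets the cut-off.

    scores: mapping of genome identifier -> overall quality score
    cutoff: minimum score to keep (scale assumed here to be 0..1)
    """
    return sorted(name for name, score in scores.items() if score >= cutoff)


# Hypothetical identifiers and scores, for illustration only.
example_scores = {
    "genome_A": 0.95,
    "genome_B": 0.40,
    "genome_C": 0.80,
    "genome_D": 0.79,
}
selected = filter_by_quality(example_scores, cutoff=0.80)
```

In practice the threshold would be chosen to suit the downstream analysis, with stricter cut-offs for benchmarking tools than for exploratory surveys.
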
Agents unleashed a public domain look at agent technology

CERN Document Server

Wayner, Peter

1995-01-01

Agents Unleashed: A Public Domain Look at Agent Technology covers details of building a secure agent realm. The book discusses the technology for creating seamlessly integrated networks that allow programs to move from machine to machine without leaving a trail of havoc; as well as the technical details of how an agent will move through the network, prove its identity, and execute its code without endangering the host. The text also describes the organization of the host's work processing an agent; error messages, bad agent expulsion, and errors in XLISP-agents; and the simulators of errors, f
The International River Interface Cooperative: Public Domain Software for River Flow and Morphodynamics (Invited)

Science.gov (United States)

Nelson, J. M.; Shimizu, Y.; McDonald, R.; Takebayashi, H.

2009-12-01

The International River Interface Cooperative is an informal organization made up of academic faculty and government scientists with the goal of developing, distributing and providing education for a public-domain software interface for modeling river flow and morphodynamics. Formed in late 2007, the group released the first version of this interface (iRIC) in late 2009. iRIC includes models for two- and three-dimensional flow, sediment transport, bed evolution, groundwater-surface water interaction, topographic data processing, and habitat assessment, as well as comprehensive data and model output visualization, mapping, and editing tools. All the tools in iRIC are specifically designed for use in river reaches and utilize common river data sets. The models are couched within a single graphical user interface so that a broad spectrum of models is available to users without learning new pre- and post-processing tools. The first version of iRIC was developed by combining the USGS public-domain Multi-Dimensional Surface Water Modeling System (MD_SWMS), developed at the USGS Geomorphology and Sediment Transport Laboratory in Golden, Colorado, with the public-domain river modeling code NAYS developed by the Universities of Hokkaido and Kyoto, Mizuho Corporation, and the Foundation of the River Disaster Prevention Research Institute in Sapporo, Japan. Since this initial effort, other universities and agencies have joined the group, and the interface has been expanded to allow users to integrate their own modeling code using Extensible Markup Language (XML), which provides easy access and expandability to the iRIC software interface. In this presentation, the current components of iRIC are described and results from several practical modeling applications are presented to illustrate the capabilities and flexibility of the software. In addition, some future extensions to iRIC are demonstrated, including software for Lagrangian particle tracking and the prediction of
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

Directory of Open Access Journals (Sweden)

Charles Richard Bradshaw

Full Text Available Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in
Domains of State-Owned, Privately Held, and Publicly Traded Firms in International Competition.

Science.gov (United States)

Mascarenhas, Briance

1989-01-01

Hypotheses relating ownership to domain differences among state-owned, publicly traded, and privately held firms in international competition were examined in a controlled field study of the offshore drilling industry. Ownership explained selected differences in domestic market dominance, international presence, and customer orientation, even…
Public-domain software for root image analysis

Directory of Open Access Journals (Sweden)

Mirian Cristina Gomes Costa

2014-10-01

Full Text Available In the search for high efficiency in root studies, computational systems have been developed to analyze digital images. ImageJ and Safira are public-domain systems that may be used for image analysis of washed roots. However, root properties measured using ImageJ and Safira are expected to differ. This study compared values of root length and surface area obtained with public-domain systems with values obtained by a reference method. Root samples were collected in a banana plantation in an area of a shallower Typic Carbonatic Haplic Cambisol (CXk), and an area of a deeper Typic Haplic Ta Eutrophic Cambisol (CXve), at six depths in five replications. Root images were digitized and the systems ImageJ and Safira were used to determine root length and surface area. The line-intersect method modified by Tennant was used as reference; values of root length and surface area measured with the different systems were analyzed by Pearson's correlation coefficient and compared by the confidence interval and t-test. Both ImageJ and Safira had positive correlation coefficients with the reference method for root length and surface area data in CXk and CXve. The correlation coefficient ranged from 0.54 to 0.80, with the lowest value observed for ImageJ in the measurement of surface area of roots sampled in CXve. The 95 % confidence interval (CI) revealed that root length measurements with Safira did not differ from those with the reference method in CXk (-77.3 to 244.0 mm). Regarding surface area measurements, Safira did not differ from the reference method for samples collected in CXk (-530.6 to 565.8 mm²) as well as in CXve (-4231 to 612.1 mm²). However, measurements with ImageJ were different from those obtained by the reference method, underestimating length and surface area in samples collected in CXk and CXve. Both ImageJ and Safira allow an identification of increases or decreases in root length and surface area. However, Safira results for root length and surface area are
Hmrbase: a database of hormones and their receptors

Science.gov (United States)

Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

2009-01-01

Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis and diabetes. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences and structures. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database to provide facilities such as keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, and sequence similarity search. This database also provides a number of external links to other resources/databases to help retrieve further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, Drug
Assessing the quality of life history information in publicly available databases.

Science.gov (United States)

Thorson, James T; Cope, Jason M; Patrick, Wesley S

2014-01-01

Single-species life history parameters are central to ecological research and management, including the fields of macro-ecology, fisheries science, and ecosystem modeling. However, there has been little independent evaluation of the precision and accuracy of the life history values in global and publicly available databases. We therefore develop a novel method based on a Bayesian errors-in-variables model that compares database entries with estimates from local experts, and we illustrate this process by assessing the accuracy and precision of entries in FishBase, one of the largest and oldest life history databases. This model distinguishes biases among seven life history parameters, two types of information available in FishBase (i.e., published values and those estimated from other parameters), and two taxa (i.e., bony and cartilaginous fishes) relative to values from regional experts in the United States, while accounting for additional variance caused by sex- and region-specific life history traits. For published values in FishBase, the model identifies a small positive bias in natural mortality and negative bias in maximum age, perhaps caused by unacknowledged mortality caused by fishing. For life history values calculated by FishBase, the model identified large and inconsistent biases. The model also demonstrates greatest precision for body size parameters, decreased precision for values derived from geographically distant populations, and greatest between-sex differences in age at maturity. We recommend that our bias and precision estimates be used in future errors-in-variables models as a prior on measurement errors. This approach is broadly applicable to global databases of life history traits and, if used, will encourage further development and improvements in these databases.
The complexity of changes in the domain of managing public expenditures

Directory of Open Access Journals (Sweden)

Dimitrijević Marina

2016-01-01

Full Text Available Public expenditures are a huge problem in contemporary states. In the conditions of a global economic crisis and the circumstances involving a high level of citizen dissatisfaction related to the former methods of funding and managing the public sector (reflected in ruining the funding sources, irrational spending of public expenditure funds, increase in the budget deficit and the level of public debt), the changes in the domain of managing public expenditures have become a priority. By their nature, these changes are complex and long-lasting, and they should bring significant improvements in the field of public expenditure; they have to provide for lawful and purposeful spending of public funds. They are expected to lower the public income needed for financing public expenditure, to improve production and competition in the market economy, and to increase personal consumption, living standard and the quality of life of the population. Regardless of the social, economic, legal or political environment in each state, the topical issue of reforming the management of public expenditures seems to imply a return to a somewhat neglected need for the public sector to function within its own financial possibilities. The state modernisation processes and advancement in the process of managing public expenditures call for a realistic evaluation of the existing condition and circumstances in which these processes occur, as well as the assessment of potential and actual risks that may hinder their effectiveness. Otherwise, it seems that the establishment of a significant level of responsibility in spending the budget funds and a greater transparency of public expenditure may be far-fetched goals.
Combating Identity Fraud in the Public Domain: Information Strategies for Healthcare and Criminal Justice

NARCIS (Netherlands)

Plomp, M.G.A.; Grijpink, J.H.A.M.

2011-01-01

Two trends are present in both the private and public domain: increasing interorganisational co-operation and increasing digitisation. Nowadays, more and more processes within and between organisations take place electronically. These developments are visible on local, national and European scale.
PCAS – a precomputed proteome annotation database resource

Directory of Open Access Journals (Sweden)

Luo Jingchu

2003-11-01

Full Text Available Abstract Background Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif or domain based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources. Results We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptides and transmembrane (TM) regions are predicted using SignalP and TMHMM respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains the human IPI, mouse IPI, rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteomes. PCAS is available at http://pak.cbi.pku.edu.cn/proteome/gca.php Conclusion PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized queries so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year. We will upgrade PCAS when new proteome annotation algorithms
A review of fission gas release data within the NEA/IAEA IFPE database

International Nuclear Information System (INIS)

Turnbull, J.A.; Menut, P.; Sartori, E.

2002-01-01

The paper describes the International Fuel Performance Experimental (IFPE) database on nuclear fuel performance. The aim of the project is to provide a comprehensive and well-qualified database on Zr-clad UO2 fuel for model development and code validation in the public domain. The data encompass both normal and off-normal operation and include prototypic commercial irradiations as well as experiments performed in material testing reactors. To date, the database contains some 380 individual cases, the majority of which provide data on fission gas release (FGR) either from in-pile pressure measurements or post-irradiation examination (PIE) techniques including puncturing, electron probe microanalysis (EPMA) and X-ray fluorescence (XRF) measurements. The paper outlines parameters affecting fission gas release and highlights individual datasets addressing these issues. (authors)
Toward public volume database management: a case study of NOVA, the National Online Volumetric Archive

Science.gov (United States)

Fletcher, Alex; Yoo, Terry S.

2004-04-01

Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.
An Overview of Public Domain Tools for Measuring the Sustainability of Environmental Remediation - 12060

Energy Technology Data Exchange (ETDEWEB)

Claypool, John E.; Rogers, Scott [AECOM, Denver, Colorado, 80202 (United States)

2012-07-01

The application of sustainability principles to the investigation and remediation of contaminated sites is an area of rapid development within the environmental profession, with new business practices, tools, and performance standards for identifying, evaluating, and managing the 'collateral' impacts of cleanup projects to the environment, economy and society coming from many organizations. Guidelines, frameworks, and standards of practice for 'green and sustainable remediation' (GSR) have been released and are under development by the Sustainable Remediation Forum (SURF), the American Society for Testing Materials (ASTM), the Interstate Technology Roundtable Commission (ITRC) and other organizations in the U.S. and internationally. In response to Executive Orders from the President, Federal government agencies have developed policies, procedures and guidelines for evaluating and reporting the sustainability of their environmental restoration projects. Private sector companies in the petroleum, utility, manufacturing, defense, and other sectors are developing their own corporate GSR programs to improve day-to-day management of contaminated sites and to support external reporting as part of their corporate social responsibility (CSR) efforts. The explosion of mandates, policy, procedures and guidance raises the question of how to determine whether a remediation technology or cleanup approach is green and/or sustainable. The environmental profession has responded to this question by designing, developing and deploying a wide array of tools, calculators, and databases that enable regulatory agencies, site managers and environmental professionals to calculate the collateral impacts of their remediation projects in the environmental, social, and economic domains. Many of these tools are proprietary ones developed by environmental engineering/consulting firms for use in their consulting engagements and/or tailored specifically to meet the needs of

USDA food and nutrient databases provide the infrastructure for food and nutrition research, policy, and practice.

Science.gov (United States)

Ahuja, Jaspreet K C; Moshfegh, Alanna J; Holden, Joanne M; Harris, Ellen

2013-02-01

The USDA food and nutrient databases provide the basic infrastructure for food and nutrition research, nutrition monitoring, policy, and dietary practice. They have had a long history that goes back to 1892 and are unique, as they are the only databases available in the public domain that perform these functions. There are 4 major food and nutrient databases released by the Beltsville Human Nutrition Research Center (BHNRC), part of the USDA's Agricultural Research Service. These include the USDA National Nutrient Database for Standard Reference, the Dietary Supplement Ingredient Database, the Food and Nutrient Database for Dietary Studies, and the USDA Food Patterns Equivalents Database. The users of the databases are diverse and include federal agencies, the food industry, health professionals, restaurants, software application developers, academia and research organizations, international organizations, and foreign governments, among others. Many of these users have partnered with BHNRC to leverage funds and/or scientific expertise to work toward common goals. The use of the databases has increased tremendously in the past few years, especially the breadth of uses. These new uses of the data are bound to increase with the increased availability of technology and public health emphasis on diet-related measures such as sodium and energy reduction. Hence, continued improvement of the databases is important, so that they can better address these challenges and provide reliable and accurate data.
Reusable data in public health data-bases-problems encountered in Danish Children's Database.

Science.gov (United States)

Høstgaard, Anna Marie; Pape-Haugaard, Louise

2012-01-01

Denmark has unique health informatics databases, e.g. "The Children's Database", which since 2009 has held data on all Danish children from birth until 17 years of age. In the current set-up a number of potential sources of error exist, both technical and human, which means that the data is flawed. This gives rise to erroneous statistics and makes the data unsuitable for research purposes. In order to make the data usable, it is necessary to develop new methods for validating the data generation process at the municipal/regional/national level. In the present ongoing research project, two research areas are combined: Public Health Informatics and Computer Science, and both ethnographic and system engineering research methods are used. The project is expected to generate new generic methods and knowledge about electronic data collection and transmission in different social contexts and by different social groups and thus to be of international importance, since this is sparsely documented from the Public Health Informatics perspective. This paper presents the preliminary results, which indicate that the health information technology used ought to be redesigned, with a thorough insight into the work practices as the point of departure.
Study of event sequence database for a nuclear power domain

International Nuclear Information System (INIS)

Kusumi, Yoshiaki

1998-01-01

A retrieval engine developed to extract event sequences from an accident information database using a time series retrieval formula expressed with ordered retrieval terms is explored. This engine outputs not only sequences that completely match a time series retrieval formula, but also sequences that approximately match the formula (fuzzy retrieval). An event sequence database, in which records consist of three ordered parameters (the causal event, the process and the result), is then used to assess the feasibility of this engine, and favorable results were obtained. (author)
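The idea of exact versus approximate matching against ordered retrieval terms can be sketched as follows. This is only an illustration, not the engine described in the abstract: the longest-common-subsequence score, the 0.6 threshold, and the record contents are all assumptions chosen to show one plausible form of ordered fuzzy retrieval.

```python
def ordered_match_count(events, query):
    """Number of query terms found in the record in the same order
    (length of the longest common subsequence)."""
    rows, cols = len(events), len(query)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if events[i] == query[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table[rows][cols]


def retrieve(records, query, threshold=0.6):
    """Return (record id, score, is_exact) for records whose ordered
    match score reaches the threshold, best matches first."""
    if not query:
        raise ValueError("query must contain at least one term")
    hits = []
    for record_id, events in records.items():
        score = ordered_match_count(events, query) / len(query)
        if score >= threshold:
            hits.append((record_id, score, score == 1.0))
    hits.sort(key=lambda hit: -hit[1])
    return hits


# Hypothetical records: (causal event, process, result)
records = {
    "r1": ["pump trip", "coolant loss", "scram"],
    "r2": ["pump trip", "valve stuck", "scram"],
    "r3": ["fire", "alarm", "evacuation"],
}
hits = retrieve(records, ["pump trip", "coolant loss", "scram"])
```

Here `r1` matches the query completely, `r2` matches two of the three terms in order and is returned as an approximate hit, and `r3` is dropped.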
High-throughput STR analysis for DNA database using direct PCR.

Science.gov (United States)

Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

2013-07-01

Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of DNA extraction required due to direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.
NREL: U.S. Life Cycle Inventory Database - About the LCI Database Project

Science.gov (United States)

About the LCI Database Project The U.S. Life Cycle Inventory (LCI) Database is a publicly available database that allows users to objectively review and compare analysis results that are based on similar source of critically reviewed LCI data through its LCI Database Project. NREL's High-Performance
Geospatial Database for Strata Objects Based on Land Administration Domain Model (LADM)

Science.gov (United States)

Nasorudin, N. N.; Hassan, M. I.; Zulkifli, N. A.; Rahman, A. Abdul

2016-09-01

Recently in our country, building construction has become more complex, and a strata objects database has become more important for registering the real world as people now own and use multiple levels of space. Furthermore, strata titles are increasingly important and need to be well managed. LADM, also known as ISO 19152, is a standard model for land administration that allows integrated 2D and 3D representation of spatial units. The aim of this paper is to develop a strata objects database using LADM. The paper discusses the current 2D geospatial database and the need for a 3D geospatial database in the future, and it analyzes the strata objects database developed with the LADM data model. The current cadastre system in Malaysia, including the strata title, is also discussed. The problems of the 2D geospatial database are listed. The database is designed in three stages: conceptual, logical and physical database design. The strata objects database will allow users to find both non-spatial and spatial strata title information and thus show the location of a strata unit. This development of a strata objects database may help in handling strata titles and related information.
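To make the physical design stage more concrete, the sketch below builds a heavily simplified strata database in SQLite. The four tables are loosely named after the LADM core classes (LA_Party, LA_RRR, LA_BAUnit, LA_SpatialUnit); the columns, the one-to-many simplification, the WKT footprint and the sample row are assumptions for illustration and do not reproduce the schema developed in the paper.

```python
import sqlite3

# Simplified tables loosely modelled on the LADM core classes.
SCHEMA = """
CREATE TABLE la_party (
    party_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE la_baunit (
    baunit_id INTEGER PRIMARY KEY,
    title_no  TEXT NOT NULL UNIQUE
);
CREATE TABLE la_rrr (
    rrr_id    INTEGER PRIMARY KEY,
    party_id  INTEGER NOT NULL REFERENCES la_party(party_id),
    baunit_id INTEGER NOT NULL REFERENCES la_baunit(baunit_id),
    rrr_type  TEXT NOT NULL
);
CREATE TABLE la_spatialunit (
    su_id         INTEGER PRIMARY KEY,
    baunit_id     INTEGER NOT NULL REFERENCES la_baunit(baunit_id),
    level_no      INTEGER NOT NULL,  -- storey of the strata unit
    footprint_wkt TEXT NOT NULL      -- 2D footprint as WKT text
);
"""


def build_demo_db():
    """Create an in-memory database with one hypothetical strata unit."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO la_party VALUES (1, 'Example Owner')")
    conn.execute("INSERT INTO la_baunit VALUES (1, 'STRATA-0001')")
    conn.execute("INSERT INTO la_rrr VALUES (1, 1, 1, 'ownership')")
    conn.execute(
        "INSERT INTO la_spatialunit VALUES "
        "(1, 1, 5, 'POLYGON((0 0, 10 0, 10 8, 0 8, 0 0))')"
    )
    conn.commit()
    return conn


def find_unit(conn, title_no):
    """Return (owner, right type, level, footprint) for a title number,
    joining the non-spatial and spatial information."""
    return conn.execute(
        """
        SELECT p.name, r.rrr_type, s.level_no, s.footprint_wkt
        FROM la_baunit b
        JOIN la_rrr r ON r.baunit_id = b.baunit_id
        JOIN la_party p ON p.party_id = r.party_id
        JOIN la_spatialunit s ON s.baunit_id = b.baunit_id
        WHERE b.title_no = ?
        """,
        (title_no,),
    ).fetchone()
```

A single query then returns who holds which right on a title together with the level and footprint of the unit, which is the kind of combined spatial and non-spatial lookup the abstract describes.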
About DNA databasing and investigative genetic analysis of externally visible characteristics: A public survey.

Science.gov (United States)

Zieger, Martin; Utz, Silvia

2015-07-01

During the last decade, DNA profiling and the use of DNA databases have become two of the most employed instruments of police investigations. This very rapid establishment of forensic genetics is yet far from being complete. In the last few years novel types of analyses have been presented to describe phenotypically a possible perpetrator. We conducted the present study among German speaking Swiss residents for two main reasons: firstly, we aimed at getting an impression of the public awareness and acceptance of the Swiss DNA database and the perception of a hypothetical DNA database containing all Swiss residents. Secondly, we wanted to get a broader picture of how people that are not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not even know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, the acceptance of a hypothetical universal database would be only as low as about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we discover a high degree of acceptance. The acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would be in favor of the moderate use of physical traits analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study
Computer-aided detection of pulmonary nodules: a comparative study using the public LIDC/IDRI database

International Nuclear Information System (INIS)

Jacobs, Colin; Prokop, Mathias; Rikxoort, Eva M. van; Ginneken, Bram van; Murphy, Keelin; Schaefer-Prokop, Cornelia M.

2016-01-01

To benchmark the performance of state-of-the-art computer-aided detection (CAD) of pulmonary nodules using the largest publicly available annotated CT database (LIDC/IDRI), and to show that CAD finds lesions not identified by the LIDC's four-fold double reading process. The LIDC/IDRI database contains 888 thoracic CT scans with a section thickness of 2.5 mm or lower. We report performance of two commercial and one academic CAD system. The influence of presence of contrast, section thickness, and reconstruction kernel on CAD performance was assessed. Four radiologists independently analyzed the false positive CAD marks of the best CAD system. The updated commercial CAD system showed the best performance with a sensitivity of 82 % at an average of 3.1 false positive detections per scan. Forty-five false positive CAD marks were scored as nodules by all four radiologists in our study. On the largest publicly available reference database for lung nodule detection in chest CT, the updated commercial CAD system locates the vast majority of pulmonary nodules at a low false positive rate. Potential for CAD is substantiated by the fact that it identifies pulmonary nodules that were not marked during the extensive four-fold LIDC annotation process. (orig.)
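The two figures used to summarize CAD performance, sensitivity and false positives per scan, follow directly from detection counts. The helper below is a minimal sketch; the counts in the example are hypothetical round numbers chosen to reproduce an operating point of 82 % at 3.1 false positives per scan and are not the counts from the study.

```python
def cad_operating_point(true_positives, total_nodules, false_positives, n_scans):
    """Return (sensitivity, false positives per scan) for a CAD system."""
    if total_nodules <= 0 or n_scans <= 0:
        raise ValueError("total_nodules and n_scans must be positive")
    sensitivity = true_positives / total_nodules
    fp_per_scan = false_positives / n_scans
    return sensitivity, fp_per_scan


# Hypothetical counts, for illustration only.
sensitivity, fp_per_scan = cad_operating_point(
    true_positives=82, total_nodules=100, false_positives=310, n_scans=100
)
```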
Exploring public databases to characterize urban flood risks in Amsterdam

Science.gov (United States)

Gaitan, Santiago; ten Veldhuis, Marie-claire; van de Giesen, Nick

2015-04-01

Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to decide upon investment to reduce their impacts. Obvious factors affecting flood risk include sewer system performance and urban topography. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors, such as spatially distributed rainfall and socioeconomic characteristics, may help to explain the probability and impacts of urban flooding. Several public databases were analyzed: complaints about flooding made by citizens; rainfall depths (15 min and 100 ha spatio-temporal resolution); grids describing number of inhabitants, income, and housing price (1 ha and 25 ha resolution); and building age. Data analysis was done using Python and GIS programming, and included spatial indexing of data, cluster analysis, and multivariate regression on the complaints. Complaints were used as a proxy to characterize flooding impacts. The cluster analysis, run for all the variables except the complaints, grouped part of the grid cells of central Amsterdam into a highly differentiated group, covering 10% of the analyzed area and accounting for 25% of registered complaints. The configuration of the analyzed variables in central Amsterdam coincides with a high complaint count. The remaining complaints were evenly dispersed across the other groups. An adjusted R² of 0.38 in the multivariate regression suggests that explanatory power could improve if additional variables were considered. While rainfall intensity explained 4% of the incidence of complaints, population density and building age significantly explained around 20% each. Data mining of public databases proved to be a valuable tool to identify factors explaining variability in the occurrence of urban pluvial flooding, though additional variables must be considered to fully explain flood risk variability.
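The adjusted R² reported above corrects the plain coefficient of determination for the number of predictors in the regression. A minimal sketch of the standard formula follows. The sample size and predictor count in the example are hypothetical, since the abstract does not state them.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical values: R^2 = 0.40 from 101 grid cells and 4 predictors.
value = adjusted_r2(0.40, n_samples=101, n_predictors=4)
print(round(value, 3))  # 0.375
```

The adjustment always lowers the score, and more so when many predictors are fitted to few observations, which is why it is the safer figure to quote for a multivariate model.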
Reassessing Domain Architecture Evolution of Metazoan Proteins: The Contribution of Different Evolutionary Mechanisms

Directory of Open Access Journals (Sweden)

Laszlo Patthy

2011-08-01

In the accompanying papers we have shown that sequence errors of public databases and confusion of paralogs and epaktologs (proteins that are related only through the independent acquisition of the same domain types) significantly distort the picture that emerges from comparison of the domain architecture (DA) of multidomain Metazoan proteins, since they introduce a strong bias in favor of terminal over internal DA change. The issue of whether terminal or internal DA changes occur with greater probability has very important implications for the DA evolution of multidomain proteins, since gene fusion can add domains only at terminal positions, whereas domain-shuffling is capable of inserting domains both at internal and terminal positions. As a corollary, overestimation of terminal DA changes may be misinterpreted as evidence for a dominant role of gene fusion in DA evolution. In this manuscript we show that in several recent studies of DA evolution of Metazoa the authors used databases that are significantly contaminated with incomplete, abnormal and mispredicted sequences (e.g., UniProtKB/TrEMBL, EnsEMBL) and/or the authors failed to separate paralogs and epaktologs, explaining why these studies concluded that the major mechanism for gains of new domains in metazoan proteins is gene fusion. In contrast with the latter conclusion, our studies on high quality orthologous and paralogous Swiss-Prot sequences confirm that shuffling of mobile domains had a major role in the evolution of multidomain proteins of Metazoa and especially those formed in early vertebrates.
Private and Efficient Query Processing on Outsourced Genomic Databases.

Science.gov (United States)

Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

2017-09-01

Applications of genomic studies are spreading rapidly in many domains of science and technology, such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic applications. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing a genome is a time-consuming and expensive process. Second, processing genomic sequences requires large-scale computation and storage systems. Third, genomic databases are often owned by different organizations and are thus not available for public use. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to them. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records take around 100 and 150 s, respectively.
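To make the two query types concrete, the sketch below evaluates them on a tiny plaintext table. It deliberately leaves out the privacy layer (permutation and fake records), so it is not the authors' protocol. The record layout, the SNP identifiers, and the reading of "top-k" as "the k most frequent genotypes at one SNP" are all assumptions made for illustration.

```python
from collections import Counter

# Toy plaintext database: each record maps a SNP id to a genotype.
# All identifiers and values are hypothetical.
RECORDS = [
    {"rs1": "AG", "rs2": "CC", "rs3": "TT"},
    {"rs1": "AG", "rs2": "CT", "rs3": "TT"},
    {"rs1": "AA", "rs2": "CC", "rs3": "TT"},
    {"rs1": "AG", "rs2": "CC", "rs3": "GT"},
]


def count_query(records, condition):
    """Count records matching every (snp, genotype) pair in `condition`."""
    return sum(
        all(rec.get(snp) == genotype for snp, genotype in condition.items())
        for rec in records
    )


def top_k_query(records, snp, k):
    """Return the k most frequent genotypes at one SNP, with their counts."""
    counts = Counter(rec[snp] for rec in records if snp in rec)
    return counts.most_common(k)


print(count_query(RECORDS, {"rs1": "AG", "rs2": "CC"}))  # 2
print(top_k_query(RECORDS, "rs1", 1))                    # [('AG', 3)]
```

In an outsourced setting the server would have to produce the same answers without learning which records are real, which is the part the paper addresses and this sketch does not.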
Variations in clinicopathologic characteristics of thyroid cancer among racial ethnic groups: analysis of a large public city hospital and the SEER database.

Science.gov (United States)

Moo-Young, Tricia A; Panergo, Jessel; Wang, Chih E; Patel, Subhash; Duh, Hong Yan; Winchester, David J; Prinz, Richard A; Fogelfeld, Leon

2013-11-01

Clinicopathologic variables influence the treatment and prognosis of patients with thyroid cancer. A retrospective analysis of a public hospital thyroid cancer database and the Surveillance, Epidemiology and End Results 17 database was conducted. Demographic, clinical, and pathologic data were compared across ethnic groups. Within the public hospital database, Hispanics versus non-Hispanic whites were younger and had more lymph node involvement (34% vs 17%, P […] ethnic groups. Similar findings were demonstrated within the Surveillance, Epidemiology and End Results database. African Americans aged […] ethnic groups. Such disparities persist within an equal-access health care system. These findings suggest that factors beyond socioeconomics may contribute to such differences.
Toward a public analysis database for LHC new physics searches using MadAnalysis 5

Science.gov (United States)

Dumont, B.; Fuks, B.; Kraml, S.; Bein, S.; Chalons, G.; Conte, E.; Kulkarni, S.; Sengupta, D.; Wymant, C.

2015-02-01

We present the implementation, in the MadAnalysis 5 framework, of several ATLAS and CMS searches for supersymmetry in data recorded during the first run of the LHC. We provide extensive details on the validation of our implementations and propose to create a public analysis database within this framework.
TMC-SNPdb: an Indian germline variant database derived from whole exome sequences.

Science.gov (United States)

Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit

2016-01-01

Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from the paired normal genome and from public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP DataBase (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed with a command line option or using an easy-to-use graphical user interface, with the ability to deplete additional Indian population specific SNPs over and above the dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% of false positive somatic events post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain. Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html
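The subtraction idea described above can be illustrated with plain set operations: candidate somatic calls are dropped if they appear in any germline reference set. This is a minimal sketch of the principle only, not the TMC-SNPdb companion tool. The variant tuples, the function name, and the tiny reference sets are hypothetical.

```python
def subtract_known_germline(candidates, *germline_sets):
    """Keep only candidate variants absent from every germline reference set.

    A variant is a (chromosome, position, ref_allele, alt_allele) tuple.
    """
    known = set().union(*germline_sets)
    return [variant for variant in candidates if variant not in known]


# Hypothetical candidate somatic calls from one tumour sample.
candidates = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
    ("chr4", 4000, "T", "C"),
]
# Hypothetical reference sets standing in for a global and a population database.
global_db = {("chr1", 1000, "A", "G")}
population_db = {("chr2", 2000, "C", "T"), ("chr9", 9000, "A", "T")}

after_global = subtract_known_germline(candidates, global_db)
after_both = subtract_known_germline(candidates, global_db, population_db)

# Share of calls surviving the global filter that the population set removes.
extra_depletion = 1 - len(after_both) / len(after_global)
print(len(after_global), len(after_both), round(extra_depletion, 2))  # 3 2 0.33
```

The last ratio mirrors how the abstract phrases its result: depletion measured after the dbSNP filter has already been applied.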
Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador.

Science.gov (United States)

Kosseim, Patricia; Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

2013-01-01

To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research.
Security aspects of database systems implementation

OpenAIRE

Pokorný, Tomáš

2009-01-01

The aim of this thesis is to provide a comprehensive overview of database systems security. The reader is introduced to the basics of information security and its development. The following chapter defines the concept of database system security using the ISO/IEC 27000 standard. The findings from this chapter form a comprehensive list of requirements on database security. One chapter also deals with legal aspects of this domain. The second part of this thesis offers a comparison of four object-relational database s...
Toxico-Cheminformatics: New and Expanding Public ...

Science.gov (United States)

High-throughput screening (HTS) technologies, along with efforts to improve public access to chemical toxicity information resources and to systematize older toxicity studies, have the potential to significantly improve information gathering efforts for chemical assessments and predictive capabilities in toxicology. Important developments include: 1) large and growing public resources that link chemical structures to biological activity and toxicity data in searchable format, and that offer more nuanced and varied representations of activity; 2) standardized relational data models that capture relevant details of chemical treatment and effects of published in vivo experiments; and 3) the generation of large amounts of new data from public efforts that are employing HTS technologies to probe a wide range of bioactivity and cellular processes across large swaths of chemical space. By annotating toxicity data with associated chemical structure information, these efforts link data across diverse study domains (e.g., ‘omics’, HTS, traditional toxicity studies), toxicity domains (carcinogenicity, developmental toxicity, neurotoxicity, immunotoxicity, etc) and database sources (EPA, FDA, NCI, DSSTox, PubChem, GEO, ArrayExpress, etc.). Public initiatives are developing systematized data models of toxicity study areas and introducing standardized templates, controlled vocabularies, hierarchical organization, and powerful relational searching capability across capt
Depiction of global trends in publications on mobile health

Directory of Open Access Journals (Sweden)

Shahla Foozonkhah

2017-07-01

Background: A variety of mobile health initiatives at different levels have been undertaken across many countries. Trends in these initiatives can be reflected in the research published in the m-health domain. Aim: This paper aims to depict global trends in the published works on the m-health topic. Materials and Methods: The Web of Science database was used to identify all relevant published papers in the mobile health domain worldwide. The search was conducted on documents published from January 1898 to December 2014. The search criteria were set to “mHealth” or “Mobile health” or “m health” or “m_health” or “m-health” in topics. Results: Findings revealed an increasing trend of citations and publications on m-health research since 2012. English was the predominant language of publication. The US had the highest number of publications, with 649 papers; however, the Netherlands ranked first when the number of publications was considered relative to country population. “Studies in Health Technology and Informatics” was the source title with the highest number of publications on mobile health topics. Conclusion: The research trend observed in this study indicates continuing growth in the mobile health domain. This may imply that a new model of health-care delivery is emerging. Further research is needed to specify directions of mobile health research. It is necessary to identify and prioritize the research gaps in this domain.
A comparative gene expression database for invertebrates

Directory of Open Access Journals (Sweden)

Ormestad Mattias

2011-08-01

Background: As whole genome and transcriptome sequencing gets cheaper and faster, a great number of 'exotic' animal models are emerging, rapidly adding valuable data to the ever-expanding Evo-Devo field. All these new organisms serve as a fantastic resource for the research community, but the sheer amount of data, some published, some not, makes detailed comparison of gene expression patterns very difficult to summarize, a problem sometimes noticeable even within a single lab. The need to merge existing data with new information in an organized manner that is publicly available to the research community is now greater than ever. Description: In order to offer a homogeneous way of storing and handling gene expression patterns from a variety of organisms, we have developed the first web-based comparative gene expression database for invertebrates that allows species-specific as well as cross-species gene expression comparisons. The database can be queried by gene name, developmental stage and/or expression domains. Conclusions: This database provides a unique tool for the Evo-Devo research community that allows the retrieval, analysis and comparison of gene expression patterns within or among species. In addition, this database enables quick identification of putative syn-expression groups that can be used to initiate, among other things, gene regulatory network (GRN) projects.
Dynamics of domain coverage of the protein sequence universe

Science.gov (United States)

2012-01-01

Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data. PMID: 23157439

Dynamics of domain coverage of the protein sequence universe

Directory of Open Access Journals (Sweden)

Rekapalli Bhanu

2012-11-01

Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data.
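One way to picture the relative size of “dark matter” is as the fraction of residues in a sequence that no domain assignment covers. The sketch below computes that fraction for a single sequence. Measuring coverage per residue, the 1-based inclusive interval convention, and the example coordinates are all assumptions for illustration and not the authors' definition.

```python
def uncovered_fraction(seq_len, domains):
    """Fraction of residues not covered by any domain interval.

    `domains` holds (start, end) pairs, 1-based and inclusive, assumed to lie
    within the sequence. Overlapping intervals are counted only once.
    """
    covered = 0
    last_end = 0  # highest residue index already counted as covered
    for start, end in sorted(domains):
        start = max(start, last_end + 1)  # skip residues already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return (seq_len - covered) / seq_len


# Hypothetical 100-residue protein with two overlapping domains and one more.
print(uncovered_fraction(100, [(1, 30), (20, 50), (80, 90)]))  # 0.39
```

Summing the uncovered residues over a whole database gives the absolute size, which can keep growing with new sequence data even while the fraction shrinks.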
On the level of coverage and citation of publications by mechanicians of the National Academy of Sciences of Ukraine in the Scopus database

Science.gov (United States)

Guz, A. N.; Rushchitsky, J. J.

2009-11-01

The paper analyzes the level of coverage and citation of publications by mechanicians of the National Academy of Sciences of Ukraine (NASU) in the Scopus database. Two groups of mechanicians are considered. One group includes 66 doctors of sciences of the S. P. Timoshenko Institute of Mechanics as representatives of the oldest institute of the NASU. The other group includes 34 members (academicians and corresponding members) of the Division of Mechanics of the NASU as representatives of the authoritative community of mechanicians in Ukraine. The results are presented for each scientist in the form of two indices: the total number of publications accessible in the database, as the level of coverage of the scientist's publications in this database, and the h-index, as the citation level of these publications. This paper may be considered a continuation of the papers [6-12] published in Prikladnaya Mekhanika (International Applied Mechanics) in 2005-2009.
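For readers unfamiliar with the second index, the h-index is the largest number h such that the scientist has h publications with at least h citations each. A minimal sketch of the computation from a list of per-paper citation counts follows. The citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for five papers indexed in a database.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Because the index depends on which papers a database covers, the same scientist can have different h-index values in different databases, which is why the paper reports it alongside the coverage count.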
FISH REPRODUCTION: BIBLIOMETRIC ANALYSIS OF WORLDWIDE AND BRAZILIAN PUBLICATIONS IN SCOPUS DATABASE

Directory of Open Access Journals (Sweden)

Marcella Costa RADAEL

2015-12-01

Reproduction is a fundamental part of life, and studies related to fish reproduction are widely consulted. The aim of this study was to perform a bibliometric analysis in order to identify trends in this kind of publication. In June 2013, searches were performed on the Scopus database using the term “fish reproduction”, and information was compiled and presented on the number of publications per year, number of publications by country, publications by author, by journal and by institution, and the most used keywords. The study yielded the following results: Brazil occupies a prominent position in number of papers, and the Brazilian share of worldwide publishing output is increasing exponentially; in Brazil, articles are highly concentrated among the top 10 authors and institutions. The present study shows that the term “fish reproduction” has been the focus of many scientific papers and that in Brazil there is a special research effort related to this subject, especially in the last few years. The main contribution is the use of bibliometric methods to describe the growth and concentration of research in the area of fish farming and reproduction.
Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador

Science.gov (United States)

Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

2013-01-01

Objective: To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. Materials and methods: This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. Discussion: A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. Conclusion: The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research.
Multiple graph regularized protein domain ranking.

Science.gov (United States)

Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

2012-11-19

Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
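The idea of regularizing ranking scores with several graphs at once can be sketched with a standard manifold-ranking iteration run on a weighted combination of normalized graphs. This is not MultiG-Rank itself: the graph weights below are fixed by hand, whereas the abstract says they are learned jointly with the scores. The function names, the toy graphs, the weights and `alpha` are all assumptions.

```python
def normalize(W):
    """Symmetric normalization S = D^(-1/2) W D^(-1/2) of an affinity matrix."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[W[i][j] / (deg[i] * deg[j]) ** 0.5 if deg[i] > 0 and deg[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]


def multi_graph_rank(graphs, weights, query, alpha=0.5, iters=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y on a combined graph S.

    `weights` should be non-negative and sum to 1 so the iteration converges.
    """
    n = len(graphs[0])
    mats = [normalize(W) for W in graphs]
    S = [[sum(w * M[i][j] for w, M in zip(weights, mats)) for j in range(n)]
         for i in range(n)]
    y = [1.0 if i == query else 0.0 for i in range(n)]  # query indicator
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Two hypothetical similarity graphs over four protein domains (0..3).
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
pairs = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
scores = multi_graph_rank([path, pairs], weights=[0.5, 0.5], query=0)
ranking = sorted(range(4), key=lambda i: -scores[i])
print(ranking[0], [round(s, 3) for s in scores])
```

Combining graphs this way means no single similarity measure has to be chosen in advance. Learning the weights, as the paper proposes, would replace the fixed `weights` argument with an alternating update.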
Fast resolution of the neutron diffusion equation through public domain ODE codes

Energy Technology Data Exchange (ETDEWEB)

Garcia, V.M.; Vidal, V.; Garayoa, J. [Universidad Politecnica de Valencia, Departamento de Sistemas Informaticos, Valencia (Spain); Verdu, G. [Universidad Politecnica de Valencia, Departamento de Ingenieria Quimica y Nuclear, Valencia (Spain); Gomez, R. [I.E.S. de Tavernes Blanques, Valencia (Spain)

2003-07-01

The time-dependent neutron diffusion equation is a partial differential equation with source terms. The resolution method usually includes discretizing the spatial domain, obtaining a large system of linear, stiff ordinary differential equations (ODEs), whose resolution is computationally very expensive. Some standard techniques use a fixed time step to solve the ODE system. This can result in errors (if the time step is too large) or in long computing times (if the time step is too small). To speed up the resolution method, two well-known public domain codes have been selected: DASPK and FCVODE, which are powerful codes for the resolution of large systems of stiff ODEs. These codes can estimate the error after each time step and, depending on this estimate, can decide the new time step and, possibly, the integration method to be used in the next step. With these mechanisms, it is possible to keep the overall error below the chosen tolerances and, when the system behaves smoothly, to take large time steps, increasing the execution speed. In this paper we address the use of the public domain codes DASPK and FCVODE for the resolution of the time-dependent neutron diffusion equation. The efficiency of these codes depends largely on the preconditioning of the big systems of linear equations that must be solved. Several pre-conditioners have been programmed and tested; it was found that the multigrid method is the best of the pre-conditioners tested. Also, it has been found that DASPK has performed better than FCVODE, being more robust for our problem. We can conclude that the use of specialized codes for solving large systems of ODEs can reduce drastically the computational work needed for the solution; and combining them with appropriate pre-conditioners, the reduction can be still more important. It has other crucial advantages, since it allows the user to specify the allowed error, which cannot be done in fixed step implementations; this, of course
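The mechanism described above (estimate the error after each step, then shrink or grow the step) can be shown on a single stiff test equation. The sketch below uses backward Euler with step doubling on y' = -lam * y. It is a toy stand-in and does not reproduce DASPK or FCVODE, which use higher-order variable-order methods. The function names, tolerance and growth rule are assumptions.

```python
def backward_euler_step(y, h, lam):
    """One implicit (backward) Euler step for the test equation y' = -lam * y."""
    return y / (1.0 + lam * h)


def integrate_adaptive(y0, lam, t_end, tol=1e-4, h0=1e-3):
    """Integrate from t = 0 to t_end, adapting the step to a local error estimate."""
    t, y, h = 0.0, y0, h0
    steps = 0
    while t < t_end:
        h = min(h, t_end - t)  # never step past the end point
        full = backward_euler_step(y, h, lam)
        half = backward_euler_step(backward_euler_step(y, h / 2, lam), h / 2, lam)
        err = abs(half - full)  # step-doubling estimate of the local error
        if err <= tol:
            t += h
            y = half  # accept the more accurate two-half-step value
            steps += 1
            if err < tol / 4:
                h *= 2.0  # solution is smooth here: try a larger step
        else:
            h *= 0.5  # reject the step and retry with a smaller one
    return y, steps


# Stiff decay: a fixed step of 1e-3 would need 1000 steps to reach t = 1.
y_end, n_steps = integrate_adaptive(y0=1.0, lam=1000.0, t_end=1.0)
print(y_end, n_steps)
```

The controller takes small steps only during the fast initial transient and then lets the step grow, which is the source of the speed-up over a fixed-step scheme that the abstract describes.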
Toward An Unstructured Mesh Database

Science.gov (United States)

Rezaei Mahdiraji, Alireza; Baumann, Peter

2014-05-01

Unstructured meshes are used in several application domains, such as earth sciences (e.g., seismology), medicine, oceanography, climate modeling, and GIS, as approximate representations of physical objects. Meshes subdivide a domain into smaller geometric elements (called cells) which are glued together by incidence relationships. The subdivision of a domain allows computational manipulation of complicated physical structures. For instance, seismologists model earthquakes using elastic wave propagation solvers on hexahedral meshes. The hexahedral mesh contains several hundred million grid points and millions of hexahedral cells. Each vertex node in the hexahedra stores a multitude of data fields. To run a simulation on such meshes, one needs to iterate over all the cells, iterate over the cells incident to a given cell, retrieve coordinates of cells, assign data values to cells, etc. Although meshes are used in many application domains, to the best of our knowledge there is no database vendor that supports unstructured mesh features. Currently, the main tools for querying and manipulating unstructured meshes are mesh libraries, e.g., CGAL and GRAL. Mesh libraries are dedicated libraries which include mesh algorithms and can be run on mesh representations. The libraries do not scale with dataset size, do not have a declarative query language, and need deep C++ knowledge for query implementations. Furthermore, due to high coupling between the implementations and input file structure, the implementations are less reusable and costly to maintain. A dedicated mesh database offers the following advantages: 1) declarative querying, 2) ease of maintenance, 3) hiding mesh storage structure from applications, and 4) transparent query optimization. To design a mesh database, the first challenge is to define a suitable generic data model for unstructured meshes. We proposed the ImG-Complexes data model as a generic topological mesh data model which extends the incidence graph model to multi
Multiple graph regularized protein domain ranking

KAUST Repository

Wang, Jim Jing-Yan

2012-11-19

Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
Multiple graph regularized protein domain ranking

KAUST Repository

Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

2012-01-01

Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.
Multiple graph regularized protein domain ranking

Directory of Open Access Journals (Sweden)

Wang Jim

2012-11-01

Full Text Available Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
SIMS: addressing the problem of heterogeneity in databases

Science.gov (United States)

Arens, Yigal

1997-02-01

The heterogeneity of remotely accessible databases -- with respect to contents, query language, semantics, organization, etc. -- presents serious obstacles to convenient querying. The SIMS (single interface to multiple sources) system addresses this global integration problem. It does so by defining a single language for describing the domain about which information is stored in the databases and using this language as the query language. Each database to which SIMS is to provide access is modeled using this language. The model describes a database's contents, organization, and other relevant features. SIMS uses these models, together with a planning system drawing on techniques from artificial intelligence, to decompose a given user's high-level query into a series of queries against the databases and other data manipulation steps. The retrieval plan is constructed so as to minimize data movement over the network and maximize parallelism to increase execution speed. SIMS can recover from network failures during plan execution by obtaining data from alternate sources, when possible. SIMS has been demonstrated in the domains of medical informatics and logistics, using real databases.
A database of immunoglobulins with integrated tools: DIGIT.

KAUST Repository

Chailyan, Anna; Tramontano, Anna; Marcatili, Paolo

2011-01-01

The DIGIT (Database of ImmunoGlobulins with Integrated Tools) database (http://biocomputing.it/digit) is an integrated resource storing sequences of annotated immunoglobulin variable domains and enriched with tools for searching and analyzing them. The annotations in the database include information on the type of antigen, the respective germline sequences and on pairing information between light and heavy chains. Other annotations, such as the identification of the complementarity determining regions, assignment of their structural class and identification of mutations with respect to the germline, are computed on the fly and can also be obtained for user-submitted sequences. The system allows customized BLAST searches and automatic building of 3D models of the domains to be performed.
A database of immunoglobulins with integrated tools: DIGIT.

KAUST Repository

Chailyan, Anna

2011-11-10

The DIGIT (Database of ImmunoGlobulins with Integrated Tools) database (http://biocomputing.it/digit) is an integrated resource storing sequences of annotated immunoglobulin variable domains and enriched with tools for searching and analyzing them. The annotations in the database include information on the type of antigen, the respective germline sequences and on pairing information between light and heavy chains. Other annotations, such as the identification of the complementarity determining regions, assignment of their structural class and identification of mutations with respect to the germline, are computed on the fly and can also be obtained for user-submitted sequences. The system allows customized BLAST searches and automatic building of 3D models of the domains to be performed.
Allie: a database and a search service of abbreviations and long forms

Science.gov (United States)

Yamamoto, Yasunori; Yamaguchi, Atsuko; Bono, Hidemasa; Takagi, Toshihisa

2011-01-01

Many abbreviations are used in the literature especially in the life sciences, and polysemous abbreviations appear frequently, making it difficult to read and understand scientific papers that are outside of a reader’s expertise. Thus, we have developed Allie, a database and a search service of abbreviations and their long forms (a.k.a. full forms or definitions). Allie searches for abbreviations and their corresponding long forms in a database that we have generated based on all titles and abstracts in MEDLINE. When a user query matches an abbreviation, Allie returns all potential long forms of the query along with their bibliographic data (i.e. title and publication year). In addition, for each candidate, co-occurring abbreviations and a research field in which it frequently appears in the MEDLINE data are displayed. This function helps users learn about the context in which an abbreviation appears. To deal with synonymous long forms, we use a dictionary called GENA that contains domain-specific terms such as gene, protein or disease names along with their synonymic information. Conceptually identical domain-specific terms are regarded as one term, and then conceptually identical abbreviation-long form pairs are grouped taking into account their appearance in MEDLINE. To keep up with new abbreviations that are continuously introduced, Allie has an automatic update system. In addition, the database of abbreviations and their long forms with their corresponding PubMed IDs is constructed and updated weekly. Database URL: The Allie service is available at http://allie.dbcls.jp/. PMID:21498548
The MAR databases: development and implementation of databases specific for marine metagenomics.

Science.gov (United States)

Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

2018-01-04

We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, which represents a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
21SSD: a new public 21-cm EoR database

Science.gov (United States)

Eames, Evan; Semelin, Benoît

2018-05-01

With current efforts inching closer to detecting the 21-cm signal from the Epoch of Reionization (EoR), proper preparation will require publicly available simulated models of the various forms the signal could take. In this work we present a database of such models, available at 21ssd.obspm.fr. The models are created with a fully-coupled radiative hydrodynamic simulation (LICORICE), and are created at high resolution (1024^3). We also begin to analyse and explore the possible 21-cm EoR signals (with Power Spectra and Pixel Distribution Functions), and study the effects of thermal noise on our ability to recover the signal out to high redshifts. Finally, we begin to explore the concepts of 'distance' between different models, which represents a crucial step towards optimising parameter space sampling, training neural networks, and finally extracting parameter values from observations.
Healthcare Databases in Thailand and Japan: Potential Sources for Health Technology Assessment Research.

Directory of Open Access Journals (Sweden)

Surasak Saokaew

Full Text Available Health technology assessment (HTA) has been continuously used for value-based healthcare decisions over the last decade. Healthcare databases represent an important source of information for HTA, which has seen a surge in use in Western countries. Although HTA agencies have been established in the Asia-Pacific region, application and understanding of healthcare databases for HTA are rather limited. Thus, we reviewed existing databases to assess their potential for HTA in Thailand, where HTA has been used officially, and Japan, where HTA is going to be officially introduced. Existing healthcare databases in Thailand and Japan were compiled and reviewed. Databases' characteristics, e.g. name of database, host, scope/objective, time/sample size, design, data collection method, population/sample, and variables, were described. Databases were assessed for their potential HTA use in terms of safety/efficacy/effectiveness, social/ethical, organization/professional, economic, and epidemiological domains. Request route for each database was also provided. Forty databases (20 from Thailand and 20 from Japan) were included. These comprised national censuses, surveys, registries, administrative data, and claims databases. All databases were potentially usable for epidemiological studies. In addition, data on mortality, morbidity, disability, adverse events, quality of life, service/technology utilization, length of stay, and economics were also found in some databases. However, access to patient-level data was limited since information about the databases was not available on public sources. Our findings have shown that existing databases provided valuable information for HTA research with limitation on accessibility. Mutual dialogue on healthcare database development and usage for HTA across the Asia-Pacific region is needed.
Healthcare Databases in Thailand and Japan: Potential Sources for Health Technology Assessment Research.

Science.gov (United States)

Saokaew, Surasak; Sugimoto, Takashi; Kamae, Isao; Pratoomsoot, Chayanin; Chaiyakunapruk, Nathorn

2015-01-01

Health technology assessment (HTA) has been continuously used for value-based healthcare decisions over the last decade. Healthcare databases represent an important source of information for HTA, which has seen a surge in use in Western countries. Although HTA agencies have been established in the Asia-Pacific region, application and understanding of healthcare databases for HTA are rather limited. Thus, we reviewed existing databases to assess their potential for HTA in Thailand, where HTA has been used officially, and Japan, where HTA is going to be officially introduced. Existing healthcare databases in Thailand and Japan were compiled and reviewed. Databases' characteristics, e.g. name of database, host, scope/objective, time/sample size, design, data collection method, population/sample, and variables, were described. Databases were assessed for their potential HTA use in terms of safety/efficacy/effectiveness, social/ethical, organization/professional, economic, and epidemiological domains. Request route for each database was also provided. Forty databases (20 from Thailand and 20 from Japan) were included. These comprised national censuses, surveys, registries, administrative data, and claims databases. All databases were potentially usable for epidemiological studies. In addition, data on mortality, morbidity, disability, adverse events, quality of life, service/technology utilization, length of stay, and economics were also found in some databases. However, access to patient-level data was limited since information about the databases was not available on public sources. Our findings have shown that existing databases provided valuable information for HTA research with limitation on accessibility. Mutual dialogue on healthcare database development and usage for HTA across the Asia-Pacific region is needed.
The public understanding of nanotechnology in the food domain: the hidden role of views on science, technology, and nature.

Science.gov (United States)

Vandermoere, Frederic; Blanchemanche, Sandrine; Bieberstein, Andrea; Marette, Stephan; Roosen, Jutta

2011-03-01

In spite of great expectations about the potential of nanotechnology, this study shows that people are rather ambiguous and pessimistic about nanotechnology applications in the food domain. Our findings are drawn from a survey of public perceptions about nanotechnology food and nanotechnology food packaging (N = 752). Multinomial logistic regression analyses further reveal that knowledge about food risks and nanotechnology significantly influences people's views about nanotechnology food packaging. However, knowledge variables were unrelated to support for nanofood, suggesting that an increase in people's knowledge might not be sufficient to bridge the gap between the excitement some business leaders in the food sector have and the restraint of the public. Additionally, opposition to nanofood was not related to the use of heuristics but to trust in governmental agencies. Furthermore, the results indicate that public perceptions of nanoscience in the food domain significantly relate to views on science, technology, and nature.
Segmentation of anatomical structures in chest radiographs using supervised methods: a comparative study on a public database

DEFF Research Database (Denmark)

van Ginneken, Bram; Stegmann, Mikkel Bille; Loog, Marco

2006-01-01

classification method that employs a multi-scale filter bank of Gaussian derivatives and a k-nearest-neighbors classifier. The methods have been tested on a publicly available database of 247 chest radiographs, in which all objects have been manually segmented by two human observers. A parameter optimization...

Development of Human Face Literature Database Using Text Mining Approach: Phase I.

Science.gov (United States)

Kaur, Paramjit; Krishan, Kewal; Sharma, Suresh K

2018-06-01

The face is an important part of the human body by which an individual communicates in society. Its importance can be highlighted by the fact that a person deprived of a face cannot sustain in the living world. The number of experiments being performed and the number of research papers being published in the domain of the human face have surged in the past few decades. Several scientific disciplines conduct research on the human face, including Medical Science, Anthropology, Information Technology (Biometrics, Robotics, Artificial Intelligence, etc.), Psychology, Forensic Science, and Neuroscience. This highlights the need to collect and manage data concerning the human face so that public and free access to it can be provided to the scientific community. This can be attained by developing databases and tools on the human face using a bioinformatics approach. The current research emphasizes creating a database of literature data on the human face. The database can be accessed on the basis of specific keywords, journal name, date of publication, author's name, etc. The collected research papers will be stored in the form of a database. Hence, the database will be beneficial to the research community, as comprehensive information dedicated to the human face can be found in one place. Information related to facial morphologic features, facial disorders, facial asymmetry, facial abnormalities, and many other parameters can be extracted from this database. The front end has been developed using HyperText Markup Language (HTML) and Cascading Style Sheets (CSS). The back end has been developed using PHP (Hypertext Preprocessor). JavaScript has been used as the scripting language. MySQL (Structured Query Language) is used for database development, as it is the most widely used Relational Database Management System. XAMPP (X (cross-platform), Apache, MySQL, PHP, Perl), an open source web application software stack, has been used as the server. The database is still under the
There Is a Significant Discrepancy Between "Big Data" Database and Original Research Publications on Hip Arthroscopy Outcomes: A Systematic Review.

Science.gov (United States)

Sochacki, Kyle R; Jack, Robert A; Safran, Marc R; Nho, Shane J; Harris, Joshua D

2018-06-01

The purpose of this study was to compare (1) major complication, (2) revision, and (3) conversion to arthroplasty rates following hip arthroscopy between database studies and original research peer-reviewed publications. A systematic review was performed using PRISMA guidelines. PubMed, SCOPUS, SportDiscus, and Cochrane Central Register of Controlled Trials were searched for studies that investigated major complication (dislocation, femoral neck fracture, avascular necrosis, fluid extravasation, septic arthritis, death), revision, and hip arthroplasty conversion rates following hip arthroscopy. Major complication, revision, and conversion to hip arthroplasty rates were compared between original research (single- or multicenter therapeutic studies) and database (insurance database using ICD-9/10 and/or current procedural terminology coding terminology) publishing studies. Two hundred seven studies (201 original research publications [15,780 subjects; 54% female] and 6 database studies [20,825 subjects; 60% female]) were analyzed (mean age, 38.2 ± 11.6 years old; mean follow-up, 2.7 ± 2.9 years). The database studies had a significantly higher age (40.6 + 2.8 vs 35.4 ± 11.6), body mass index (27.4 ± 5.6 vs 24.9 ± 3.1), percentage of females (60.1% vs 53.8%), and longer follow-up (3.1 ± 1.6 vs 2.7 ± 3.0) compared with original research (P database studies (P = .029; relative risk [RR], 1.3). There was a significantly higher rate of femoral neck fracture (0.24% vs 0.03%; P database studies. Reoperations occurred at a significantly higher rate in the database studies (11.1% vs 7.3%; P database studies (8.0% vs 3.7%; P Database studies report significantly increased major complication, revision, and conversion to hip arthroplasty rates compared with original research investigations of hip arthroscopy outcomes. Level IV, systematic review of Level I-IV studies. Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights
Hospice palliative care article publications: An analysis of the Web of Science database from 1993 to 2013.

Science.gov (United States)

Chang, Hsiao-Ting; Lin, Ming-Hwai; Chen, Chun-Ku; Hwang, Shinn-Jang; Hwang, I-Hsuan; Chen, Yu-Chun

2016-01-01

Academic publications are important for developing a medical specialty or discipline and improvements of quality of care. As hospice palliative care medicine is a rapidly growing medical specialty in Taiwan, this study aimed to analyze the hospice palliative care-related publications from 1993 through 2013 both worldwide and in Taiwan, by using the Web of Science database. Academic articles published with topics including "hospice", "palliative care", "end of life care", and "terminal care" were retrieved and analyzed from the Web of Science database, which includes documents published in Science Citation Index-Expanded and Social Science Citation Indexed journals from 1993 to 2013. Compound annual growth rates (CAGRs) were calculated to evaluate the trends of publications. There were a total of 27,788 documents published worldwide during the years 1993 to 2013. The top five most prolific countries/areas with published documents were the United States (11,419 documents, 41.09%), England (3620 documents, 13.03%), Canada (2428 documents, 8.74%), Germany (1598 documents, 5.75%), and Australia (1580 documents, 5.69%). Three hundred and ten documents (1.12%) were published from Taiwan, which ranks second among Asian countries (after Japan, with 594 documents, 2.14%) and 16th in the world. During this 21-year period, the number of hospice palliative care-related article publications increased rapidly. The worldwide CAGR for hospice palliative care publications during 1993 through 2013 was 12.9%. As for Taiwan, the CAGR for publications during 1999 through 2013 was 19.4%. The majority of these documents were submitted from universities or hospitals affiliated to universities. The number of hospice palliative care-related publications increased rapidly from 1993 to 2013 in the world and in Taiwan; however, the number of publications from Taiwan is still far below those published in several other countries. Further research is needed to identify and try to reduce the
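For readers unfamiliar with the compound annual growth rate (CAGR) used in this abstract, the standard definition is CAGR = (end / start)^(1 / periods) - 1. The short Python sketch below shows the calculation. The abstract does not report the yearly publication counts behind its 12.9% and 19.4% figures, so the counts in the example are hypothetical placeholders, and treating `periods` as the number of year-to-year intervals (for example, 20 intervals between 1993 and 2013) is an assumption about how the authors counted.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1.

    `periods` is the number of year-to-year intervals between the two
    observations, e.g. 20 for counts observed in 1993 and 2013.
    """
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Hypothetical yearly publication counts, for illustration only.
first_year_count = 100
last_year_count = 200
growth = cagr(first_year_count, last_year_count, periods=10)
print(f"CAGR: {growth:.2%}")
```

Because the rate compounds, a series that doubles over ten intervals grows by a little over 7% per year, not 10%.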
The Danish Intensive Care Database

DEFF Research Database (Denmark)

Christiansen, Christian Fynbo; Møller, Morten Hylander; Nielsen, Henrik

2016-01-01

AIM OF DATABASE: The aim of this database is to improve the quality of care in Danish intensive care units (ICUs) by monitoring key domains of intensive care and to compare these with predefined standards. STUDY POPULATION: The Danish Intensive Care Database (DID) was established in 2007...... and standardized mortality ratios for death within 30 days after admission using case-mix adjustment (initially using age, sex, and comorbidity level, and, since 2013, using SAPS II) for all patients and for patients with septic shock. DESCRIPTIVE DATA: The DID currently includes 335,564 ICU admissions during 2005...
Mapping small molecule binding data to structural domains.

Science.gov (United States)

Kruger, Felix A; Rostom, Raghd; Overington, John P

2012-01-01

Large-scale bioactivity/SAR Open Data has recently become available, and this has allowed new analyses and approaches to be developed to help address the productivity and translational gaps of current drug discovery. One of the current limitations of these data is the relative sparsity of reported interactions per protein target, and complexities in establishing clear relationships between bioactivity and targets using bioinformatics tools. We detail in this paper the indexing of targets by the structural domains that bind (or are likely to bind) the ligand within a full-length protein. Specifically, we present a simple heuristic to map small molecule binding to Pfam domains. This profiling can be applied to all proteins within a genome to give some indications of the potential pharmacological modulation and regulation of all proteins. In this implementation of our heuristic, ligand binding to protein targets from the ChEMBL database was mapped to structural domains as defined by profiles contained within the Pfam-A database. Our mapping suggests that the majority of assay targets within the current version of the ChEMBL database bind ligands through a small number of highly prevalent domains, and conversely the majority of Pfam domains sampled by our data play no currently established role in ligand binding. Validation studies, carried out firstly against Uniprot entries with expert binding-site annotation and secondly against entries in the wwPDB repository of crystallographic protein structures, demonstrate that our simple heuristic maps ligand binding to the correct domain in about 90 percent of all assessed cases. Using the mappings obtained with our heuristic, we have assembled ligand sets associated with each Pfam domain. Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactivity database. The result of this mapping is an enriched annotation of small molecule bioactivity data and a grouping of activity classes
ADLIB: A simple database framework for beamline codes

International Nuclear Information System (INIS)

Mottershead, C.T.

1993-01-01

There are many well developed codes available for beamline design and analysis. A significant fraction of each of these codes is devoted to processing its own unique input language for describing the problem. None of these large, complex, and powerful codes does everything. Adding a new bit of specialized physics can be a difficult task whose successful completion makes the code even larger and more complex. This paper describes an attempt to move in the opposite direction, toward a family of small, simple, single purpose physics and utility modules, linked by an open, portable, public domain database framework. These small specialized physics codes begin with the beamline parameters already loaded in the database, and accessible via the handful of subroutines that constitute ADLIB. Such codes are easier to write, and inherently organized in a manner suitable for incorporation in model based control system algorithms. Examples include programs for analyzing beamline misalignment sensitivities, for simulating and fitting beam steering data, and for translating among MARYLIE, TRANSPORT, and TRACE3D formats
Global Tsunami Database: Adding Geologic Deposits, Proxies, and Tools

Science.gov (United States)

Brocko, V. R.; Varner, J.

2007-12-01

A result of collaboration between NOAA's National Geophysical Data Center (NGDC) and the Cooperative Institute for Research in the Environmental Sciences (CIRES), the Global Tsunami Database includes instrumental records, human observations, and now, information inferred from the geologic record. Deep Ocean Assessment and Reporting of Tsunamis (DART) data, historical reports, and information gleaned from published tsunami deposit research build a multi-faceted view of tsunami hazards and their history around the world. Tsunami history provides clues to what might happen in the future, including frequency of occurrence and maximum wave heights. However, instrumental and written records commonly span too little time to reveal the full range of a region's tsunami hazard. The sedimentary deposits of tsunamis, identified with the aid of modern analogs, increasingly complement instrumental and human observations. By adding the component of tsunamis inferred from the geologic record, the Global Tsunami Database extends the record of tsunamis backward in time. Deposit locations, their estimated age and descriptions of the deposits themselves fill in the tsunami record. Tsunamis inferred from proxies, such as evidence for coseismic subsidence, are included to estimate recurrence intervals, but are flagged to highlight the absence of a physical deposit. Authors may submit their own descriptions and upload digital versions of publications. Users may sort by any populated field, including event, location, region, age of deposit, author, publication type (extract information from peer reviewed publications only, if you wish), grain size, composition, presence/absence of plant material. Users may find tsunami deposit references for a given location, event or author; search for particular properties of tsunami deposits; and even identify potential collaborators. Users may also download public-domain documents. Data and information may be viewed using tools designed to extract and
Directory of IAEA databases. 4. ed.

International Nuclear Information System (INIS)

1997-06-01

This fourth edition of the Directory of IAEA Databases has been prepared within the Division of NESI. Its main objective is to describe the computerized information sources available to the public. This directory contains all publicly available databases which are produced at the IAEA. This includes databases stored on the mainframe, LAN servers and user PCs. All IAEA Division Directors have been requested to register the existence of their databases with NESI. At the date of printing, some of the information in the directory will already be obsolete. For the most up-to-date information please see the IAEA's World Wide Web site at URL: http:/www.iaea.or.at/databases/dbdir/. Refs, figs, tabs
Databases of Publications and Observations as a Part of the Crimean Astronomical Virtual Observatory

Directory of Open Access Journals (Sweden)

Shlyapnikov A.

2015-12-01

Full Text Available We describe the main principles of formation of databases (DBs with information about astronomical objects and their physical characteristics derived from observations obtained at the Crimean Astrophysical Observatory (CrAO and published in the “Izvestiya of the CrAO” and elsewhere. Emphasis is placed on the DBs missing from the most complete global library of catalogs and data tables, VizieR (supported by the Center of Astronomical Data, Strasbourg. We specially consider the problem of forming a digital archive of observational data obtained at the CrAO as an interactive DB related to database objects and publications. We present examples of all our DBs as elements integrated into the Crimean Astronomical Virtual Observatory. We illustrate the work with the CrAO DBs using tools of the International Virtual Observatory: Aladin, VOPlot, VOSpec, in conjunction with the VizieR and Simbad DBs.
Aviation Safety Issues Database

Science.gov (United States)

Morello, Samuel A.; Ricks, Wendell R.

2009-01-01

The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis for making the database publicly accessible and writing this report.
Database Resources of the BIG Data Center in 2018.

Science.gov (United States)

2018-01-04

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Evolution of Industry Knowledge in the Public Domain: Prior Art Searching for Software Patents

Directory of Open Access Journals (Sweden)

Jinseok Park

2005-03-01

Searching prior art is a key part of the patent application and examination processes. A comprehensive prior art search gives the inventor ideas as to how he can improve or circumvent existing technology by providing up-to-date knowledge on the state of the art. It also enables the patent applicant to minimise the likelihood of an objection from the patent office. This article explores the characteristics of prior art associated with software patents, dealing with difficulties in searching prior art due to the lack of resources, and considers public contribution to the formation of prior art databases. It addresses the evolution of electronic prior art in line with technological development, and discusses laws and practices in the EPO, USPTO, and the JPO in relation to the validity of prior art resources on the Internet. This article also investigates the main features of searching sources and tools in the three patent offices as well as non-patent literature databases. Based on the analysis of various searching databases, it provides some strategies for efficient prior art searching that should be considered for software-related inventions.
Methodology for Automatic Ontology Generation Using Database Schema Information

Directory of Open Access Journals (Sweden)

JungHyen An

2018-01-01

An ontology is a model language that supports the functions to integrate conceptually distributed domain knowledge and infer relationships among the concepts. Ontologies are developed based on the target domain knowledge. As a result, methodologies to automatically generate an ontology from metadata that characterize the domain knowledge are becoming important. However, existing methodologies to automatically generate an ontology using metadata require the domain metadata to be generated in a predetermined template, and it is difficult to manage the growing data in the ontology itself as the number of domain OWL (Web Ontology Language) individuals continuously increases. The database schema captures features of the domain knowledge and provides structural functions to efficiently process the knowledge-based data. In this paper, we propose a methodology to automatically generate ontologies and manage the OWL individuals through an interaction of the database and the ontology. We describe the automatic ontology generation process with an example schema and demonstrate the effectiveness of the automatically generated ontology by comparing it with existing ontologies using the ontology quality score.
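The idea of deriving an ontology from a database schema can be illustrated with a minimal sketch. It assumes one common mapping convention (table to owl:Class, plain column to owl:DatatypeProperty, foreign-key column to owl:ObjectProperty) and is not the procedure of the paper above, whose details the abstract does not give. The schema, the `http://example.org/onto#` namespace, and the function name are hypothetical.

```python
# Hypothetical example schema: table and column names are invented for illustration.
SCHEMA = {
    "Author": {
        "columns": {"id": "integer", "name": "string"},
        "foreign_keys": {},
    },
    "Paper": {
        "columns": {"id": "integer", "title": "string", "author_id": "integer"},
        "foreign_keys": {"author_id": "Author"},  # column -> referenced table
    },
}

# Assumed namespace; any IRI under the author's control would do.
PREFIXES = (
    "@prefix : <http://example.org/onto#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def schema_to_turtle(schema):
    """Emit a Turtle document describing the schema as OWL classes and properties."""
    lines = [PREFIXES]
    for table, spec in schema.items():
        # Each table becomes a class.
        lines.append(f":{table} a owl:Class .")
        for column, column_type in spec["columns"].items():
            target = spec["foreign_keys"].get(column)
            if target is not None:
                # A foreign key links two classes, so it becomes an object property.
                lines.append(
                    f":{table}_{column} a owl:ObjectProperty ; "
                    f"rdfs:domain :{table} ; rdfs:range :{target} ."
                )
            else:
                # Any other column becomes a datatype property with an XSD range.
                lines.append(
                    f":{table}_{column} a owl:DatatypeProperty ; "
                    f"rdfs:domain :{table} ; rdfs:range xsd:{column_type} ."
                )
    return "\n".join(lines)


if __name__ == "__main__":
    print(schema_to_turtle(SCHEMA))
```

Rows of each table could then be emitted as OWL individuals of the corresponding class; that step is omitted here.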
Towards an information strategy for combating identity fraud in the public domain: Cases from healthcare and criminal justice

NARCIS (Netherlands)

Plomp, M.G.A.; Grijpink, J.H.A.M.

2011-01-01

Two trends are present in both the private and public domain: increasing interorganisational co-operation and increasing digitisation. More and more processes within and between organisations take place electronically, on local, national and European scale. The technological and organisational
The Danish fetal medicine database

DEFF Research Database (Denmark)

Ekelund, Charlotte Kvist; Kopp, Tine Iskov; Tabor, Ann

2016-01-01

trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units’ Astraia databases to the central database via...... analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database...
A Partnership for Public Health: USDA Branded Food Products Database

Science.gov (United States)

The importance of comprehensive food composition databases is more critical than ever in helping to address global food security. The USDA National Nutrient Database for Standard Reference is the “gold standard” for food composition databases. The presentation will include new developments in stren...
The GED4GEM project: development of a Global Exposure Database for the Global Earthquake Model initiative

Science.gov (United States)

Gamba, P.; Cavalca, D.; Jaiswal, K.S.; Huyck, C.; Crowley, H.

2012-01-01

In order to quantify earthquake risk of any selected region or a country of the world within the Global Earthquake Model (GEM) framework (www.globalquakemodel.org/), a systematic compilation of building inventory and population exposure is indispensable. Through the consortium of leading institutions and by engaging the domain-experts from multiple countries, the GED4GEM project has been working towards the development of a first comprehensive publicly available Global Exposure Database (GED). This geospatial exposure database will eventually facilitate global earthquake risk and loss estimation through GEM’s OpenQuake platform. This paper provides an overview of the GED concepts, aims, datasets, and inference methodology, as well as the current implementation scheme, status and way forward.
Benchmarking Ligand-Based Virtual High-Throughput Screening with the PubChem Database

Directory of Open Access Journals (Sweden)

Mariusz Butkiewicz

2013-01-01

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure-activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a true positive rate (TPR) cutoff of 25% are observed.
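The abstract above does not define how enrichment at a TPR cutoff is computed. The sketch below assumes one common definition: rank compounds by predicted score, take the shortest top-ranked prefix that recovers the required fraction of actives, and divide the fraction of actives in that prefix by the fraction of actives in the whole set. The function name and the toy data are hypothetical.

```python
def enrichment_at_tpr(scores, labels, tpr_cutoff=0.25):
    """Enrichment of actives in the top-ranked prefix that first reaches tpr_cutoff.

    scores: predicted activity scores, higher meaning more likely active.
    labels: 1 for a confirmed active compound, 0 for an inactive one.
    Assumed definition: (precision in the prefix) / (fraction of actives overall).
    """
    if not 0 < tpr_cutoff <= 1:
        raise ValueError("tpr_cutoff must be in (0, 1]")
    total = len(labels)
    actives = sum(labels)
    if actives == 0:
        raise ValueError("need at least one active compound")

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    found = 0
    for n_selected, (_, label) in enumerate(ranked, start=1):
        found += label
        if found / actives >= tpr_cutoff:
            precision = found / n_selected
            baseline = actives / total
            return precision / baseline
    # Not reachable: the full ranking always recovers every active.
    raise RuntimeError("TPR cutoff never reached")


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
    toy_labels = [1, 0, 1, 0, 0, 0, 1, 1]
    print(enrichment_at_tpr(toy_scores, toy_labels, tpr_cutoff=0.25))
```

Under this definition the maximum possible enrichment is the reciprocal of the overall active fraction, so large values are only attainable on data sets where actives are rare.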
Knowledge-based public health situation awareness

Science.gov (United States)

Mirhaji, Parsa; Zhang, Jiajie; Srinivasan, Arunkumar; Richesson, Rachel L.; Smith, Jack W.

2004-09-01

There have been numerous efforts to create comprehensive databases from multiple sources to monitor the dynamics of public health and, most specifically, to detect potential threats of bioterrorism before widespread dissemination. But there is little evidence for the assertion that these systems are timely and dependable, or that they can reliably distinguish man-made from natural incidents. One must evaluate the value of so-called 'syndromic surveillance systems' along with the costs involved in design, development, implementation and maintenance of such systems and the costs involved in investigation of the inevitable false alarms. In this article we will introduce a new perspective to the problem domain with a shift in paradigm from 'surveillance' toward 'awareness'. As we conceptualize a rather different approach to tackle the problem, we will introduce a different methodology in application of information science, computer science, cognitive science and human-computer interaction concepts in design and development of so-called 'public health situation awareness systems'. We will share some of our design and implementation concepts for the prototype system that is under development in the Center for Biosecurity and Public Health Informatics Research, at the University of Texas Health Science Center at Houston. The system is based on a knowledgebase containing ontologies with different layers of abstraction, from multiple domains, that provide the context for information integration, knowledge discovery, interactive data mining, information visualization, information sharing and communications. The modular design of the knowledgebase and its knowledge representation formalism enable incremental evolution of the system from a partial system to a comprehensive knowledgebase of 'public health situation awareness' as it acquires new knowledge through interactions with domain experts or automatic discovery of new knowledge.
Managing Multiuser Database Buffers Using Data Mining Techniques

NARCIS (Netherlands)

Feng, L.; Lu, H.J.

2004-01-01

In this paper, we propose a data-mining-based approach to public buffer management for a multiuser database system, where database buffers are organized into two areas – public and private. While the private buffer areas contain pages to be updated by particular users, the public

Stackfile Database

Science.gov (United States)

deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

2013-01-01

This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., Gravity Recovery And Climate Experiment, GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.
Combining Public Domain and Professional Panoramic Imagery for the Accurate and Dense 3d Reconstruction of the Destroyed Bel Temple in Palmyra

Science.gov (United States)

Wahbeh, W.; Nebiker, S.; Fangi, G.

2016-06-01

This paper exploits the potential of dense multi-image 3d reconstruction of destroyed cultural heritage monuments by either using public domain touristic imagery only or by combining the public domain imagery with professional panoramic imagery. The focus of our work is placed on the reconstruction of the temple of Bel, one of the Syrian heritage monuments, which was destroyed in September 2015 by the so-called "Islamic State". The great temple of Bel is considered one of the most important religious buildings of the 1st century AD in the East, with a unique design. The investigations and the reconstruction were carried out using two types of imagery. The first consists of freely available generic touristic photos collected from the web. The second consists of panoramic images captured in 2010 for documenting those monuments. In the paper we present a 3d reconstruction workflow for both types of imagery using state-of-the-art dense image matching software, addressing the non-trivial challenges of combining uncalibrated public domain imagery with panoramic images with very wide base-lines. We subsequently investigate the aspects of accuracy and completeness obtainable from the public domain touristic images alone and from the combination with spherical panoramas. We furthermore discuss the challenges of co-registering the weakly connected 3d point cloud fragments resulting from the limited coverage of the touristic photos. We then describe an approach using spherical photogrammetry as a virtual topographic survey allowing the co-registration of a detailed and accurate single 3d model of the temple interior and exterior.
The Government Finance Database: A Common Resource for Quantitative Research in Public Financial Analysis.

Science.gov (United States)

Pierson, Kawika; Hand, Michael L; Thompson, Fred

2015-01-01

Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has made its adoption prohibitive and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967-2012, uses easy-to-understand natural-language variable names, and will be extended when new data is available.
SeedStor: A Germplasm Information Management System and Public Database.

Science.gov (United States)

Horler, R S P; Turner, A S; Fretter, P; Ambrose, M

2018-01-01

SeedStor (https://www.seedstor.ac.uk) acts as the publicly available database for the seed collections held by the Germplasm Resources Unit (GRU) based at the John Innes Centre, Norwich, UK. The GRU is a national capability supported by the Biotechnology and Biological Sciences Research Council (BBSRC). The GRU curates germplasm collections of a range of temperate cereal, legume and Brassica crops and their associated wild relatives, as well as precise genetic stocks, near-isogenic lines and mapping populations. With >35,000 accessions, the GRU forms part of the UK's plant conservation contribution to the Multilateral System (MLS) of the International Treaty for Plant Genetic Resources for Food and Agriculture (ITPGRFA) for wheat, barley, oat and pea. SeedStor is a fully searchable system that allows our various collections to be browsed species by species through to complicated multipart phenotype criteria-driven queries. The results from these searches can be downloaded for later analysis or used to order germplasm via our shopping cart. The user community for SeedStor is the plant science research community, plant breeders, specialist growers, hobby farmers and amateur gardeners, and educationalists. Furthermore, SeedStor is much more than a database; it has been developed to act internally as a Germplasm Information Management System that allows team members to track and process germplasm requests, determine regeneration priorities, handle cost recovery and Material Transfer Agreement paperwork, manage the Seed Store holdings and easily report on a wide range of the aforementioned tasks. © The Author(s) 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Development and application of a database of food ingredient fraud and economically motivated adulteration from 1980 to 2010.

Science.gov (United States)

Moore, Jeffrey C; Spink, John; Lipp, Markus

2012-04-01

Food ingredient fraud and economically motivated adulteration are emerging risks, but a comprehensive compilation of information about known problematic ingredients and detection methods does not currently exist. The objectives of this research were to collect such information from publicly available articles in scholarly journals and general media, organize it into a database, and review and analyze the data to identify trends. The result is a database that will be published in the US Pharmacopeial Convention's Food Chemicals Codex, 8th edition, and includes 1305 records, including 1000 records with analytical methods collected from 677 references. Olive oil, milk, honey, and saffron were the most common targets for adulteration reported in scholarly journals, and potentially harmful issues identified include spices diluted with lead chromate and lead tetraoxide, substitution of Chinese star anise with toxic Japanese star anise, and melamine adulteration of high protein content foods. High-performance liquid chromatography and infrared spectroscopy were the most common analytical detection procedures, and chemometrics data analysis was used in a large number of reports. Future expansion of this database will include additional publicly available articles published before 1980 and in other languages, as well as data outside the public domain. The authors recommend in-depth analyses of individual incidents. This report describes the development and application of a database of food ingredient fraud issues from publicly available references. The database provides baseline information and data useful to governments, agencies, and individual companies assessing the risks of specific products produced in specific regions as well as products distributed and sold in other regions. In addition, the report describes current analytical technologies for detecting food fraud and identifies trends and developments. © 2012 US Pharmacopeia Journal of Food Science
Mycobacteriophage genome database.

Science.gov (United States)

Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

2011-01-01

The Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
Assessment of current cybersecurity practices in the public domain : cyber indications and warnings domain.

Energy Technology Data Exchange (ETDEWEB)

Hamlet, Jason R.; Keliiaa, Curtis M.

2010-09-01

This report assesses current public domain cybersecurity practices with respect to cyber indications and warnings. It describes cybersecurity industry and government activities, including cybersecurity tools, methods, practices, and international and government-wide initiatives known to be impacting current practice. Of particular note are the U.S. Government's Trusted Internet Connection (TIC) and 'Einstein' programs, which are serving to consolidate the Government's internet access points and to provide some capability to monitor and mitigate cyber attacks. Next, this report catalogs activities undertaken by various industry and government entities. In addition, it assesses the benchmarks of HPC capability and other HPC attributes that may lend themselves to assist in the solution of this problem. This report draws few conclusions, as it is intended to assess current practice in preparation for future work; however, no explicit references to HPC usage for the purpose of analyzing cyber infrastructure in near-real-time were found in the current practice. This report and a related SAND2010-4766 National Cyber Defense High Performance Computing and Analysis: Concepts, Planning and Roadmap report are intended to provoke discussion throughout a broad audience about developing a cohesive HPC-centric solution to wide-area cybersecurity problems.
Sports medicine clinical trial research publications in academic medical journals between 1996 and 2005: an audit of the PubMed MEDLINE database.

Science.gov (United States)

Nichols, A W

2008-11-01

To identify sports medicine-related clinical trial research articles in the PubMed MEDLINE database published between 1996 and 2005 and conduct a review and analysis of topics of research, experimental designs, journals of publication and the internationality of authorships. Sports medicine research is international in scope with improving study methodology and an evolution of topics. Structured review of articles identified in a search of a large electronic medical database. PubMed MEDLINE database. Sports medicine-related clinical research trials published between 1996 and 2005. Review and analysis of articles that meet inclusion criteria. Articles were examined for study topics, research methods, experimental subject characteristics, journal of publication, lead authors and journal countries of origin and language of publication. The search retrieved 414 articles, of which 379 (345 English language and 34 non-English language) met the inclusion criteria. The number of publications increased steadily during the study period. Randomised clinical trials were the most common study type and the "diagnosis, management and treatment of sports-related injuries and conditions" was the most popular study topic. The knee, ankle/foot and shoulder were the most frequent anatomical sites of study. Soccer players and runners were the favourite study subjects. The American Journal of Sports Medicine had the highest number of publications and shared the greatest international diversity of authorships with the British Journal of Sports Medicine. The USA, Australia, Germany and the UK produced a good number of the lead authorships. In all, 91% of articles and 88% of journals were published in English. Sports medicine-related research is internationally diverse, clinical trial publications are increasing and the sophistication of research design may be improving.
Trends in global acupuncture publications: An analysis of the Web of Science database from 1988 to 2015.

Science.gov (United States)

Kung, Yen-Ying; Hwang, Shinn-Jang; Li, Tsai-Feng; Ko, Seong-Gyu; Huang, Ching-Wen; Chen, Fang-Pey

2017-08-01

Acupuncture is a rapidly growing medical specialty worldwide. This study aimed to analyze the acupuncture publications from 1988 to 2015 by using the Web of Science (WoS) database. Familiarity with the trend of acupuncture publications will facilitate a better understanding of existing academic research in acupuncture and its applications. Published academic articles focusing on acupuncture were retrieved and analyzed from the WoS database, which included articles published in Science Citation Index-Expanded and Social Sciences Citation Index journals from 1988 to 2015. A total of 7450 articles were published in the field of acupuncture during the period of 1988-2015. Annual article publications increased from 109 in 1988 to 670 in 2015. The People's Republic of China (published 2076 articles, 27.9%), USA (published 1638 articles, 22.0%) and South Korea (published 707 articles, 9.5%) were the most prolific countries. According to the WoS subject categories, 2591 articles (34.8%) were published in the category of Integrative and Complementary Medicine, followed by Neurosciences (1147 articles, 15.4%), and General Internal Medicine (918 articles, 12.3%). Kyung Hee University (South Korea) was the most prolific source organization for acupuncture publications (365 articles, 4.9%). Fields within acupuncture with the most cited articles included mechanism, clinical trials, epidemiology, and a new research method of acupuncture. Publications associated with acupuncture increased rapidly from 1988 to 2015. The different applications of acupuncture were extensive in multiple fields of medicine. It is important to maintain and even nourish a certain quantity and quality of published acupuncture papers, which can play an important role in developing a medical discipline for acupuncture. Copyright © 2017. Published by Elsevier Taiwan LLC.
Relational Databases and Biomedical Big Data.

Science.gov (United States)

de Silva, N H Nisansa D

2017-01-01

In various biomedical applications that collect, handle, and manipulate data, the amounts of data tend to build up and venture into the range identified as big data. In such cases, a design decision has to be made as to what type of database will be used to handle this data. More often than not, the default and classical solution to this in the biomedical domain, according to past research, is relational databases. While this was the norm for a long while, it is evident that there is a trend to move away from relational databases in favor of other types and paradigms of databases. However, it is still of paramount importance to understand the interrelation that exists between biomedical big data and relational databases. This chapter will review the pros and cons of using relational databases to store biomedical big data that previous research has discussed and used.
Enhancing public access to legal information : A proposal for a new official legal information generic top-level domain

NARCIS (Netherlands)

Mitee, Leesi Ebenezer

2017-01-01

Abstract: This article examines the use of a new legal information generic Top-Level Domain (gTLD) as a viable tool for easy identification of official legal information websites (OLIWs) and enhancing global public access to their resources. This intervention is necessary because of the existence of
Into the Dark Domain: The UK Web Archive as a Source for the Contemporary History of Public Health

Science.gov (United States)

Gorsky, Martin

2015-01-01

With the migration of the written record from paper to digital format, archivists and historians must urgently consider how web content should be conserved, retrieved and analysed. The British Library has recently acquired a large number of UK domain websites, captured 1996–2010, which is colloquially termed the Dark Domain Archive while technical issues surrounding user access are resolved. This article reports the results of an invited pilot project that explores methodological issues surrounding use of this archive. It asks how the relationship between UK public health and local government was represented on the web, drawing on the ‘declinist’ historiography to frame its questions. It points up some difficulties in developing an aggregate picture of web content due to duplication of sites. It also highlights their potential for thematic and discourse analysis, using both text and image, illustrated through an argument about the contradictory rationale for public health policy under New Labour. PMID:26217072
Artist Material BRDF Database for Computer Graphics Rendering

Science.gov (United States)

Ashbaugh, Justin C.

The primary goal of this thesis was to create a physical library of artist material samples. This collection provides necessary data for the development of a gonio-imaging system for use in museums to more accurately document their collections. A sample set was produced consisting of 25 panels and containing nearly 600 unique samples. Selected materials are representative of those commonly used by artists both past and present. These take into account the variability in visual appearance resulting from the materials and application techniques used. Five attributes of variability were identified including medium, color, substrate, application technique and overcoat. Combinations of these attributes were selected based on those commonly observed in museum collections and suggested by surveying experts in the field. For each sample material, image data is collected and used to measure an average bi-directional reflectance distribution function (BRDF). The results are available as a public-domain image and optical database of artist materials at art-si.org. Additionally, the database includes specifications for each sample along with other information useful for computer graphics rendering such as the rectified sample images and normal maps.
Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

Science.gov (United States)

Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

2017-06-26

Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in the detection of novel mutation profiles and the assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
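The grouping step described above (entries collapsed into distinct sequences, then split into previously assigned haplotypes and new candidates) can be sketched as follows. This is a simplified illustration that assumes exact string matching of already trimmed sequences; the CHD's actual pipeline is not described in the abstract and presumably works on aligned mutation profiles. The accession labels, sequences, and haplotype names below are invented.

```python
from collections import defaultdict


def group_sequences(entries, known_haplotypes):
    """Collapse entries into distinct sequences and classify each group.

    entries: iterable of (accession, sequence) pairs.
    known_haplotypes: dict mapping a sequence to its assigned haplotype name.
    Returns (assigned, candidates):
      assigned   maps haplotype name -> list of accessions,
      candidates maps unmatched sequence -> list of accessions.
    """
    groups = defaultdict(list)
    for accession, sequence in entries:
        # Normalise case so identical sequences always land in the same group.
        groups[sequence.upper()].append(accession)

    assigned, candidates = {}, {}
    for sequence, accessions in groups.items():
        name = known_haplotypes.get(sequence)
        if name is not None:
            assigned[name] = accessions
        else:
            candidates[sequence] = accessions
    return assigned, candidates


if __name__ == "__main__":
    # Invented toy data; real HV1 fragments are 582 bp long.
    toy_entries = [
        ("ACC1", "ACGTAC"),
        ("ACC2", "acgtac"),
        ("ACC3", "ACGTTC"),
        ("ACC4", "ACGAAC"),
    ]
    toy_known = {"ACGTAC": "H1", "ACGTTC": "H2"}
    assigned, candidates = group_sequences(toy_entries, toy_known)
    print(assigned)
    print(candidates)
```

Groups left in `candidates` correspond to what the abstract calls new haplotype candidates awaiting further analysis.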
Database Changes (Post-Publication). ERIC Processing Manual, Section X.

Science.gov (United States)

Brandhorst, Ted, Ed.

The purpose of this section is to specify the procedure for making changes to the ERIC database after the data involved have been announced in the abstract journals RIE or CIJE. As a matter of general ERIC policy, a document or journal article is not re-announced or re-entered into the database as a new accession for the purpose of accomplishing a…
A Data Preparation Methodology in Data Mining Applied to Mortality Population Databases.

Science.gov (United States)

Pérez, Joaquín; Iturbide, Emmanuel; Olivares, Víctor; Hidalgo, Miguel; Martínez, Alicia; Almanza, Nelva

2015-11-01

It is known that the data preparation phase is the most time-consuming part of the data mining process, taking 50% to 70% of the total project time. Currently, data mining methodologies are general-purpose, and one of their limitations is that they do not provide guidance on which particular tasks to perform in a specific domain. This paper presents a new data preparation methodology oriented to the epidemiological domain, in which we have identified two sets of tasks: General Data Preparation and Specific Data Preparation. For both sets, the Cross-Industry Standard Process for Data Mining (CRISP-DM) is adopted as a guideline. The main contribution of our methodology is fourteen specialized tasks for this domain. To validate the proposed methodology, we developed a data mining system and applied the entire process to real mortality databases. The results were encouraging: the use of the methodology reduced some of the time-consuming tasks, and the data mining system revealed previously unknown and potentially useful patterns for the public health services in Mexico.
Public and private energy RTD expenditures in Belgium, Luxembourg and the Netherlands. A pilot study on behalf of SenterNovem based on an IEA format

International Nuclear Information System (INIS)

Lako, P.; Ros, M.E.

2007-07-01

This study aims to present a broad view of energy RTD expenditures of Belgium, Luxembourg, and the Netherlands, in the public domain and by private enterprises. Data is provided as much as possible by disaggregating into a format of the IEA (IEA code). IEA data serve as the starting point for data collection. The main task is to fill in the gaps in the database, viz.: Completing the IEA database for Belgium with regard to public energy RTD; Starting with a database of public energy RTD for Luxembourg; Collecting, retrieving, and analysing private energy RTD data for the Netherlands. The latter data, based on a 'bottom-up' approach, are compared to recent data of SenterNovem based on an R and D subsidy scheme in the Netherlands. The private energy RTD expenditures from both sources (the bottom-up approach in this study and the data of SenterNovem) are combined to one database of private energy RTD that may be used for, e.g., the IEA
USAID Public-Private Partnerships Database

Data.gov (United States)

US Agency for International Development — This dataset brings together information collected since 2001 on PPPs that have been supported by USAID. For the purposes of this dataset a Public-Private...
Consumer Product Category Database

Science.gov (United States)

The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use information is compiled from multiple sources while product information is gathered from publicly available Material Safety Data Sheets (MSDS). EPA researchers are evaluating the possibility of expanding the database with additional product and use information.
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events.

Science.gov (United States)

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J P; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain-domain interactions, protein-protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes are included. FARE-CAFE will substantially facilitate the cancer biologist's mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop 'novel' therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE. © The Author(s) 2015. Published by Oxford University Press.

The study of co-citation analysis and knowledge structure on healthcare domain

Science.gov (United States)

Chu, Kuo-Chung; Liu, Wen-I.; Tsai, Ming-Yu

2012-11-01

With the prevalence of the Internet and digital archives, online e-journal databases help scholars search the literature in a research domain or cross-search an interdisciplinary field, so that the key literature can be traced efficiently. This study builds a Web-based citation analysis system consisting of four modules: (1) a literature search module; (2) a statistics module; (3) an article analysis module; and (4) a co-citation analysis module. The system focuses on the PubMed Central dataset, which has 170,000 records. Within a research domain, a specific keyword can be searched in terms of authors, journals, and core issues. In addition, we use data mining techniques for co-citation analysis. The results give researchers an in-depth understanding of the domain knowledge. An automated system for co-citation analysis helps to understand changes, trends, and the knowledge structure of a research domain. To the best of our knowledge, the proposed system differs from the analysis functions of existing online electronic retrieval databases. The proposed system may become a value-added database for the healthcare domain, and we hope it will contribute to researchers' work.
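
As an illustration of the co-citation idea mentioned in this abstract, the following minimal Python sketch counts how often two references appear together in the same reference list. It is a generic textbook formulation, not the system described by the authors; the function name and the input format (one list of cited identifiers per citing paper) are assumptions.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for every pair of cited items, how many papers cite both.

    reference_lists: iterable of lists, one list of cited identifiers per citing paper.
    Returns a Counter keyed by alphabetically ordered (item_a, item_b) tuples.
    """
    counts = Counter()
    for refs in reference_lists:
        # set() removes repeated citations within one paper;
        # sorted() gives each pair a single canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts
```

Pairs with high counts are the usual starting point for clustering cited works into a knowledge structure.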
Comprehensive Thematic T-matrix Reference Database: a 2013-2014 Update

Science.gov (United States)

Mishchenko, Michael I.; Zakharova, Nadezhda T.; Khlebtsov, Nikolai G.; Wriedt, Thomas; Videen, Gorden

2014-01-01

This paper is the sixth update to the comprehensive thematic database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2013. It also lists several earlier publications not incorporated in the original database and previous updates.
Atomic and molecular databases in the context of virtual observatories

International Nuclear Information System (INIS)

Dubernet, Marie-Lise; Roueff, Evelyne

2006-01-01

Numerical and bibliographic Databases in Atomic and Molecular Physics are essential for both the modelling of various astrophysical media and the interpretation of astrophysical spectra provided by ground or space-based telescopes. We report here on our current project concerning the access to Atomic and Molecular Databases within the Virtual Observatories. This presentation aims at informing people about interoperability matters, in order to put together the efforts which have already started in this domain, to evaluate the needs and requirements of the targeted interrelation between atomic and molecular data bases and VO projects. Collaborations in this domain are welcome. (author)
E-MSD: the European Bioinformatics Institute Macromolecular Structure Database.

Science.gov (United States)

Boutselakis, H; Dimitropoulos, D; Fillon, J; Golovin, A; Henrick, K; Hussain, A; Ionides, J; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Oldfield, T; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, J; Tagari, M; Tate, J; Tromm, S; Velankar, S; Vranken, W

2003-01-01

The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.
Bibliographical database of radiation biological dosimetry and risk assessment: Part 2

International Nuclear Information System (INIS)

Straume, T.; Ricker, Y.; Thut, M.

1990-09-01

This is Part II of a database constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered, keyworded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on authors, key words, title, year, journal name, or publication number. Photocopies of the publications contained in the database are maintained in a file that is numerically arranged by our publication acquisition numbers. This volume contains 1048 additional entries, which are listed in alphabetical order by author. The computer software used for the database is a simple but sophisticated relational database program that permits quick information access, high flexibility, and the creation of customized reports. This program is inexpensive and is commercially available for the Macintosh and the IBM PC. Although the database entries were made using a Macintosh computer, we have the capability to convert the files into the IBM PC version. As of this date, the database cites 2260 publications. Citations in the database are from 200 different scientific journals. There are also references to 80 books and published symposia, and 158 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed within the scientific literature, although a few journals clearly predominate. The journals publishing the largest number of relevant papers are Health Physics, with a total of 242 citations in the database, and Mutation Research, with 185 citations. Other journals with over 100 citations in the database are Radiation Research, with 136, and International Journal of Radiation Biology, with 132
A Novel Approach: Chemical Relational Databases, and the Role of the ISSCAN Database on Assessing Chemical Carcinogenicity

Science.gov (United States)

Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did no...
Complementary Value of Databases for Discovery of Scholarly Literature: A User Survey of Online Searching for Publications in Art History

Science.gov (United States)

Nemeth, Erik

2010-01-01

Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…
E-SovTox: An online database of the main publicly-available sources of toxicity data concerning REACH-relevant chemicals published in the Russian language.

Science.gov (United States)

Sihtmäe, Mariliis; Blinova, Irina; Aruoja, Villem; Dubourguier, Henri-Charles; Legrand, Nicolas; Kahru, Anne

2010-08-01

A new open-access online database, E-SovTox, is presented. E-SovTox provides toxicological data for substances relevant to the EU Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system, from publicly-available Russian language data sources. The database contains information selected mainly from scientific journals published during the Soviet Union era. The main information source for this database - the journal, Gigiena Truda i Professional'nye Zabolevania [Industrial Hygiene and Occupational Diseases], published between 1957 and 1992 - features acute, but also chronic, toxicity data for numerous industrial chemicals, e.g. for rats, mice, guinea-pigs and rabbits. The main goal of the abovementioned toxicity studies was to derive the maximum allowable concentration limits for industrial chemicals in the occupational health settings of the former Soviet Union. Thus, articles featured in the database include mostly data on LD50 values, skin and eye irritation, skin sensitisation and cumulative properties. Currently, the E-SovTox database contains toxicity data selected from more than 500 papers covering more than 600 chemicals. The user is provided with the main toxicity information, as well as abstracts of these papers in Russian and in English (given as provided in the original publication). The search engine allows cross-searching of the database by the name or CAS number of the compound, and the author of the paper. The E-SovTox database can be used as a decision-support tool by researchers and regulators for the hazard assessment of chemical substances. 2010 FRAME.
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome.

Science.gov (United States)

Sarika; Arora, Vasu; Iquebal, M A; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role in improving desirable crop characteristics, such as high yield, resistance to disease, and others that will benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a legume recently sequenced by a global consortium led by ICRISAT (Hyderabad, India) and has been analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for the pigeonpea genome, based on chromosome-wise as well as location-wise search of primers. A total of 123 387 Short Tandem Repeats (STRs) were extracted from the pigeonpea genome, available in the public domain, using the MIcroSAtellite tool (MISA). The database is an online relational database based on a 'three-tier architecture' that catalogues microsatellite information in MySQL, with a user-friendly interface developed using PHP. Searches for STRs may be customized by limiting their location on the chromosome as well as the number of markers in that range. This is a novel approach and has not been implemented in any existing marker database. The database has been further appended with Primer3 for primer designing of selected markers with left and right flanking regions of up to 500 bp. This will enable researchers to select markers of choice at a desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region of a chromosome to narrow down the location of a gene of interest or of linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, marker search based on the characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/
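
To make the notion of extracting Short Tandem Repeats concrete, here is a minimal Python sketch that finds perfect tandem repeats with a regular expression. It is not MISA and does not reproduce its parameters or output; the function name, the default motif length, and the minimum repeat threshold are illustrative assumptions.

```python
import re


def find_strs(seq, motif_len=2, min_repeats=5):
    """Find perfect tandem repeats of a fixed motif length.

    Returns a list of (start, motif, repeat_count) tuples with 0-based starts.
    Only exact repeats are detected; interrupted or compound repeats are ignored.
    """
    # One captured motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA reported as (AA)n.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits
```

For example, `find_strs("TTGACACACACACAGG")` reports a single (AC)5 repeat starting at position 3. Restricting such hits to a coordinate window would correspond to the location-wise search the abstract describes.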
New e-learning method using databases

Directory of Open Access Journals (Sweden)

Andreea IONESCU

2012-10-01

Full Text Available The objective of this paper is to present a new e-learning method that uses databases. The solution could be implemented for any type of e-learning system in any domain. The article proposes a solution to improve the learning process for virtual classes.
40 CFR 1400.13 - Read-only database.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 32 2010-07-01 2010-07-01 false Read-only database. 1400.13 Section... INFORMATION Other Provisions § 1400.13 Read-only database. The Administrator is authorized to establish... public off-site consequence analysis information by means of a central database under the control of the...
The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database

Directory of Open Access Journals (Sweden)

Okba Selama

2013-01-01

Full Text Available Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly the taxonomy and the geographical area of origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
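
The abstract mentions using E-utilities with Python to query the NCBI Nucleotide database. A minimal sketch of what such a query can look like follows. It only builds an ESearch URL and parses the record count from a JSON reply; it is not the authors' script, and the helper names and the example search term are assumptions. The sample reply used in the usage note is hand-written for illustration, not real NCBI output.

```python
import json
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nuccore", retmax=0):
    """Build an ESearch URL that asks for a JSON reply.

    retmax=0 requests only the hit count, without a list of record IDs.
    """
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


def parse_count(response_text):
    """Extract the total number of matching records from an ESearch JSON reply."""
    return int(json.loads(response_text)["esearchresult"]["count"])
```

A call such as `build_esearch_url("Proteobacteria[Organism]")` yields a URL that can be fetched with `urllib.request.urlopen`; passing a reply like `{"esearchresult": {"count": "42"}}` to `parse_count` returns 42. When fetching many terms in a loop, requests should be paced to respect NCBI's usage limits.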
Distribution and classification of Serine β-lactamases in Brazilian Hospital Sewage and Other Environmental Metagenomes deposited in Public Databases

Directory of Open Access Journals (Sweden)

Adriana Fróes

2016-11-01

Full Text Available β-lactam is the most used antibiotic class in the clinical area and it acts by blocking bacterial cell wall synthesis, causing cell death. However, some bacteria have evolved resistance to these antibiotics, mainly due to the production of enzymes known as β-lactamases. Hospital sewage is an important source of dispersion of multidrug-resistant bacteria in rivers and oceans. In this work, we used next-generation DNA sequencing to explore the diversity and dissemination of serine β-lactamases in two hospital sewage samples from Rio de Janeiro, Brazil (South Zone, SZ and North Zone, NZ), presenting different profiles, and to compare them with available public environmental data. Also, we propose a Hidden-Markov-Model approach to screen potential serine β-lactamase genes (in public environmental samples and generated hospital sewage data), exploring their evolutionary relationships. Due to the high variability in β-lactamases, we used a position-specific scoring matrix search method (RPS-BLAST) against conserved domain database profiles (CDD, Pfam, and COG) followed by visual inspection to detect conserved motifs, to increase the reliability of the results and remove possible false positives. We were able to identify novel β-lactamases from Brazilian hospital sewage and to estimate the relative abundance of their types. The highest relative abundance found in SZ was Class A (50%), while Class D is predominant in NZ (55%). CfxA (65%) and ACC (47%) types were the most abundant genes detected in SZ, while in NZ the most frequent were OXA-10 (32%), CfxA (28%), ACC (21%), CEPA (20%) and FOX (19%). Phylogenetic analysis revealed β-lactamases from Brazilian hospital sewage grouped in the same clade and close to sequences belonging to Firmicutes and Bacteroidetes groups, but distant from potential β-lactamases screened from public environmental data, which grouped closer to β-lactamases of Proteobacteria. Our results demonstrated that HMM-based approach identified homologs of
Distribution and Classification of Serine β-Lactamases in Brazilian Hospital Sewage and Other Environmental Metagenomes Deposited in Public Databases.

Science.gov (United States)

Fróes, Adriana M; da Mota, Fábio F; Cuadrat, Rafael R C; Dávila, Alberto M R

2016-01-01

β-lactam is the most used antibiotic class in the clinical area and it acts by blocking bacterial cell wall synthesis, causing cell death. However, some bacteria have evolved resistance to these antibiotics, mainly due to the production of enzymes known as β-lactamases. Hospital sewage is an important source of dispersion of multidrug-resistant bacteria in rivers and oceans. In this work, we used next-generation DNA sequencing to explore the diversity and dissemination of serine β-lactamases in two hospital sewage samples from Rio de Janeiro, Brazil (South Zone, SZ and North Zone, NZ), presenting different profiles, and to compare them with available public environmental data. Also, we propose a Hidden-Markov-Model approach to screen potential serine β-lactamase genes (in public environmental samples and generated hospital sewage data), exploring their evolutionary relationships. Due to the high variability in β-lactamases, we used a position-specific scoring matrix search method (RPS-BLAST) against conserved domain database profiles (CDD, Pfam, and COG) followed by visual inspection to detect conserved motifs, to increase the reliability of the results and remove possible false positives. We were able to identify novel β-lactamases from Brazilian hospital sewage and to estimate the relative abundance of their types. The highest relative abundance found in SZ was Class A (50%), while Class D is predominant in NZ (55%). CfxA (65%) and ACC (47%) types were the most abundant genes detected in SZ, while in NZ the most frequent were OXA-10 (32%), CfxA (28%), ACC (21%), CEPA (20%), and FOX (19%). Phylogenetic analysis revealed β-lactamases from Brazilian hospital sewage grouped in the same clade and close to sequences belonging to Firmicutes and Bacteroidetes groups, but distant from potential β-lactamases screened from public environmental data, which grouped closer to β-lactamases of Proteobacteria. Our results demonstrated that HMM-based approach identified homologs of
A web-based system architecture for ontology-based data integration in the domain of IT benchmarking

Science.gov (United States)

Pfaff, Matthias; Krcmar, Helmut

2018-03-01

In the domain of IT benchmarking (ITBM), a variety of data and information are collected. Although these data serve as the basis for business analyses, no unified semantic representation of such data yet exists. Consequently, data analysis across different distributed data sets and different benchmarks is almost impossible. This paper presents a system architecture and prototypical implementation for an integrated data management of distributed databases based on a domain-specific ontology. To preserve the semantic meaning of the data, the ITBM ontology is linked to data sources and functions as the central concept for database access. Thus, additional databases can be integrated by linking them to this domain-specific ontology and are directly available for further business analyses. Moreover, the web-based system supports the process of mapping ontology concepts to external databases by introducing a semi-automatic mapping recommender and by visualizing possible mapping candidates. The system also provides a natural language interface to easily query linked databases. The expected result of this ontology-based approach of knowledge representation and data access is an increase in knowledge and data sharing in this domain, which will enhance existing business analysis methods.
PubChemQC Project: A Large-Scale First-Principles Electronic Structure Database for Data-Driven Chemistry.

Science.gov (United States)

Nakata, Maho; Shimazaki, Tomomi

2017-06-26

Large-scale molecular databases play an essential role in the investigation of various subjects such as the development of organic materials, in silico drug design, and data-driven studies with machine learning. We have developed a large-scale quantum chemistry database based on first-principles methods. Our database currently contains the ground-state electronic structures of 3 million molecules based on density functional theory (DFT) at the B3LYP/6-31G* level, and we successively calculated 10 low-lying excited states of over 2 million molecules via time-dependent DFT with the B3LYP functional and the 6-31+G* basis set. To select the molecules calculated in our project, we referred to the PubChem Project, which was used as the source of the molecular structures in short strings using the InChI and SMILES representations. Accordingly, we have named our quantum chemistry database project "PubChemQC" ( http://pubchemqc.riken.jp/ ) and placed it in the public domain. In this paper, we show the fundamental features of the PubChemQC database and discuss the techniques used to construct the data set for large-scale quantum chemistry calculations. We also present a machine learning approach to predict the electronic structure of molecules as an example to demonstrate the suitability of the large-scale quantum chemistry database.
SeqHound: biological sequence and structure database as a platform for bioinformatics research

Directory of Open Access Journals (Sweden)

Dumontier Michel

2002-10-01

Full Text Available Abstract Background: SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results: SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions: The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.
SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803

OpenAIRE

Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

2008-01-01

Background: Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. Description: We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactio...
Tibetan Magmatism Database

Science.gov (United States)

Chapman, James B.; Kapp, Paul

2017-11-01

A database containing previously published geochronologic, geochemical, and isotopic data on Mesozoic to Quaternary igneous rocks in the Himalayan-Tibetan orogenic system is presented. The database is intended to serve as a repository for new and existing igneous rock data and is publicly accessible through a web-based platform that includes an interactive map and data table interface with search, filtering, and download options. To illustrate the utility of the database, the age, location, and εHf(t) composition of magmatism from the central Gangdese batholith in the southern Lhasa terrane are compared. The data identify three high-flux events, which peak at 93, 50, and 15 Ma. They are characterized by inboard arc migration and a temporal and spatial shift to more evolved isotopic compositions.

DAE emergency response centre (ERC) at Kalpakkam for response to nuclear and radiological emergencies in public domain

International Nuclear Information System (INIS)

Meenakshisundaram, V.; Rajagopal, V.; Mathiyarasu, R.; Subramanian, V.; Rajaram, S.; Somayaji, K.M.; Kannan, V.; Rajagopalan, H.

2008-01-01

In India, Department of Atomic Energy (DAE) has been identified as the nodal agency/authority in respect of providing the necessary technical inputs in the event of any radiation emergency that may occur in public domain. The overall system takes into consideration statutory requirements, executive decisions as well as National and International obligations. This paper highlights the details about the strength of the Kalpakkam ERC and other essential requisites and their compliance since its formation
UnoViS: the MedIT public unobtrusive vital signs database.

Science.gov (United States)

Wartzek, Tobias; Czaplik, Michael; Antink, Christoph Hoog; Eilebrecht, Benjamin; Walocha, Rafael; Leonhardt, Steffen

2015-01-01

While PhysioNet is a large database for standard clinical vital signs measurements, no such database exists for unobtrusively measured signals. This inhibits progress in the vital area of signal processing for unobtrusive medical monitoring, as not everybody owns the specific measurement systems needed to acquire signals. Furthermore, if no common database exists, a comparison between different signal processing approaches is not possible. This gap will be closed by our UnoViS database. It contains different recordings in various scenarios ranging from a clinical study to measurements obtained while driving a car. Currently, 145 records with a total of 16.2 h of measurement data are available, provided as MATLAB files or in the PhysioNet WFDB file format. In its initial state, the database includes only (multichannel) capacitive ECG and unobtrusive PPG signals, together with a reference ECG. All ECG signals contain annotations by a peak detector and by a medical expert. A dataset from a clinical study contains further clinical annotations. Additionally, supplementary functions are provided, which simplify the usage of the database and thus the development and evaluation of new algorithms. The development of urgently needed methods for very robust parameter extraction or robust signal fusion in view of frequent severe motion artifacts in unobtrusive monitoring is now possible with the database.
The SH2 Domain Interaction Landscape

Directory of Open Access Journals (Sweden)

Michele Tinti

2013-04-01

Full Text Available Members of the SH2 domain family modulate signal transduction by binding to short peptides containing phosphorylated tyrosines. Each domain displays a distinct preference for the sequence context of the phosphorylated residue. We have developed a high-density peptide chip technology that allows for probing of the affinity of most SH2 domains for a large fraction of the entire complement of tyrosine phosphopeptides in the human proteome. Using this technique, we have experimentally identified thousands of putative SH2-peptide interactions for more than 70 different SH2 domains. By integrating this rich data set with orthogonal context-specific information, we have assembled an SH2-mediated probabilistic interaction network, which we make available as a community resource in the PepspotDB database. A predicted dynamic interaction between the SH2 domains of the tyrosine phosphatase SHP2 and the phosphorylated tyrosine in the extracellular signal-regulated kinase activation loop was validated by experiments in living cells.
Reflections on a decade of research by ASEAN dental faculties: analysis of publications from ISI-WOS databases from 2000 to 2009.

Science.gov (United States)

Sirisinha, Stitaya; Koontongkaew, Sittichai; Phantumvanit, Prathip; Wittayawuttikul, Ruchareka

2011-05-01

This communication analyzed research publications in dentistry in the Institute of Scientific Information Web of Science databases of 10 dental faculties in the Association of South-East Asian Nations (ASEAN) from 2000 to 2009. The term used for the "all-document types" search was "Faculty of Dentistry/College of Dentistry." Abstracts presented at regional meetings were also included in the analysis. The Times Higher Education System QS World University Rankings showed that universities in the region fare poorly in world university rankings. Only the National University of Singapore and Nanyang Technological University appeared in the top 100 in 2009; 19 universities in the region, including Indonesia, Malaysia, the Philippines, Singapore, and Thailand, appeared in the top 500. Data from the databases showed that research publications by dental institutes in the region fall short of their Asian counterparts. Singapore and Thailand are the most active in dental research of the ASEAN countries. © 2011 Blackwell Publishing Asia Pty Ltd.
Global Volcano Locations Database

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — NGDC maintains a database of over 1,500 volcano locations obtained from the Smithsonian Institution Global Volcanism Program, Volcanoes of the World publication. The...
INIST: databases reorientation

International Nuclear Information System (INIS)

Bidet, J.C.

1995-01-01

INIST is a CNRS (Centre National de la Recherche Scientifique) laboratory devoted to processing scientific and technical information and to managing that information in a database. A reorientation of the database content was proposed in 1994 to increase the transfer of research to enterprises and services, to develop more automated access to the information, and to create a quality assurance plan. The catalog of publications comprises 5800 periodical titles (1300 for fundamental research and 4500 for applied research). A multi-thematic science and technology database will be created in 1995 for the retrieval of applied and technical information. "Grey literature" (reports, theses, proceedings, etc.) and human and social sciences data will be added to the database using information selected from the existing GRISELI and Francis databases. Major changes are also planned in the thematic coverage of Earth sciences, which will considerably reduce the geological information content. (J.S.). 1 tab
Interactive bibliographical database on color

Science.gov (United States)

Caivano, Jose L.

2002-06-01

The paper describes the methodology and results of a project under development, aimed at building an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated on various occasions, and now available on the Internet, with more than 2,000 entries. The interactive database will expand that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but also allowing rearrangements or selections by author, subject and keywords.
VKCDB: Voltage-gated potassium channel database

Directory of Open Access Journals (Sweden)

Gallin Warren J

2004-01-01

Full Text Available Background: The family of voltage-gated potassium channels comprises a functionally diverse group of membrane proteins. They help maintain and regulate the potassium ion-based component of the membrane potential and are thus central to many critical physiological processes. VKCDB (Voltage-gated potassium [K] Channel DataBase) is a database of structural and functional data on these channels. It is designed as a resource for research on the molecular basis of voltage-gated potassium channel function. Description: Voltage-gated potassium channel sequences were identified by using BLASTP to search GENBANK and SWISSPROT. Annotations for all voltage-gated potassium channels were selectively parsed and integrated into VKCDB. Electrophysiological and pharmacological data for the channels were collected from published journal articles. Transmembrane domain predictions by TMHMM and PHD are included for each VKCDB entry. Multiple sequence alignments of conserved domains of channels of the four Kv families and the KCNQ family are also included. Currently VKCDB contains 346 channel entries. It can be browsed and searched using a set of functionally relevant categories. Protein sequences can also be searched using a local BLAST engine. Conclusions: VKCDB is a resource for comparative studies of voltage-gated potassium channels. The methods used to construct VKCDB are general; they can be used to create specialized databases for other protein families. VKCDB is accessible at http://vkcdb.biology.ualberta.ca.
Second-Tier Database for Ecosystem Focus, 2000-2001 Annual Report.

Energy Technology Data Exchange (ETDEWEB)

Van Holmes, Chris; Muongchanh, Christine; Anderson, James J. (University of Washington, School of Aquatic and Fishery Sciences, Seattle, WA)

2001-11-01

The Second-Tier Database for Ecosystem Focus (Contract 00004124) provides direct and timely public access to Columbia Basin environmental, operational, fishery and riverine data resources for federal, state, public and private entities. The Second-Tier Database known as Data Access in Realtime (DART) does not duplicate services provided by other government entities in the region. Rather, it integrates public data for effective access, consideration and application.
Memory aware query scheduling in a database cluster

NARCIS (Netherlands)

F. Waas; M.L. Kersten (Martin)

2000-01-01

Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications:
FY 1998 survey report. Examinational research on the construction of body function database; 1998 nendo chosa hokokusho. Shintai kino database no kochiku ni kansuru chosa kenkyu

Energy Technology Data Exchange (ETDEWEB)

NONE

1999-03-01

The body function database aims to support the supply of products and environments friendly to aged people by providing companies with data on the body functions of aged people for use in planning, design, and production. As the survey method, group measurement was carried out for visual characteristics. For action characteristics, movement including posture change was studied, an experimental plan was carried out, and the items and methods for group measurement were finally proposed. The database structure was made public at the end of this fiscal year, following pre-publication and evaluation after a trial evaluation conducted using a pilot database. In the study of the measurement of action characteristics, a verification test was conducted on a small group, on the basis of which the measurement of action characteristics was finally proposed. For the body function database system, operational issues were identified and improved through trial evaluation of the pilot database, rights issues were sorted out ahead of publication, and management methods were prepared. An evaluation version was made with publication in mind. (NEDO)
New Inversion and Interpretation of Public-Domain Electromagnetic Survey Data from Selected Areas in Alaska

Science.gov (United States)

Smith, B. D.; Kass, A.; Saltus, R. W.; Minsley, B. J.; Deszcz-Pan, M.; Bloss, B. R.; Burns, L. E.

2013-12-01

Public-domain airborne geophysical surveys (combined electromagnetics and magnetics), mostly collected for and released by the State of Alaska, Division of Geological and Geophysical Surveys (DGGS), are a unique and valuable resource for both geologic interpretation and geophysical methods development. A new joint effort by the US Geological Survey (USGS) and the DGGS aims to add value to these data through the application of novel advanced inversion methods and through innovative and intuitive display of data: maps, profiles, voxel-based models, and displays of estimated inversion quality and confidence. Our goal is to make these data even more valuable for interpretation of geologic frameworks, geotechnical studies, and cryosphere studies, by producing robust estimates of subsurface resistivity that can be used by non-geophysicists. The available datasets, all in the public domain, include 39 frequency-domain electromagnetic datasets collected since 1993, and continue to grow with 5 more data releases pending in 2013. The majority of these datasets were flown for mineral resource purposes, with one survey designed for infrastructure analysis. In addition, several USGS datasets are included in this study. The USGS has recently developed new inversion methodologies for airborne EM data and has begun to apply these and other new techniques to the available datasets. These include a trans-dimensional Markov Chain Monte Carlo technique, laterally-constrained regularized inversions, and deterministic inversions which include calibration factors as free parameters. Incorporation of the magnetic data as an additional constraining dataset has also improved the inversion results. Processing has been completed in several areas, including Fortymile and the Alaska Highway surveys, and continues in others such as the Styx River and Nome surveys. Utilizing these new techniques, we provide models beyond the apparent resistivity maps supplied by the original
Sources and Resources Into the Dark Domain: The UK Web Archive as a Source for the Contemporary History of Public Health.

Science.gov (United States)

Gorsky, Martin

2015-08-01

With the migration of the written record from paper to digital format, archivists and historians must urgently consider how web content should be conserved, retrieved and analysed. The British Library has recently acquired a large number of UK domain websites, captured 1996-2010, which is colloquially termed the Dark Domain Archive while technical issues surrounding user access are resolved. This article reports the results of an invited pilot project that explores methodological issues surrounding use of this archive. It asks how the relationship between UK public health and local government was represented on the web, drawing on the 'declinist' historiography to frame its questions. It points up some difficulties in developing an aggregate picture of web content due to duplication of sites. It also highlights their potential for thematic and discourse analysis, using both text and image, illustrated through an argument about the contradictory rationale for public health policy under New Labour.
International scientific seminar «Chronicle of Nature – a common database for scientific analysis and joint planning of scientific publications»

Directory of Open Access Journals (Sweden)

Juri P. Kurhinen

2016-05-01

Full Text Available Provides information about the results of the international scientific seminar «Chronicle of Nature – a common database for scientific analysis and joint planning of scientific publications», held at the Finland-Russian project «Linking environmental change to biodiversity change: large scale analysis of Eurasia ecosystem».
HEALTH GeoJunction: place-time-concept browsing of health publications.

Science.gov (United States)

MacEachren, Alan M; Stryker, Michael S; Turton, Ian J; Pezanowski, Scott

2010-05-18

The volume of health science publications is escalating rapidly. Thus, keeping up with developments is becoming harder as is the task of finding important cross-domain connections. When geographic location is a relevant component of research reported in publications, these tasks are more difficult because standard search and indexing facilities have limited or no ability to identify geographic foci in documents. This paper introduces HEALTH GeoJunction, a web application that supports researchers in the task of quickly finding scientific publications that are relevant geographically and temporally as well as thematically. HEALTH GeoJunction is a geovisual analytics-enabled web application providing: (a) web services using computational reasoning methods to extract place-time-concept information from bibliographic data for documents and (b) visually-enabled place-time-concept query, filtering, and contextualizing tools that apply to both the documents and their extracted content. This paper focuses specifically on strategies for visually-enabled, iterative, facet-like, place-time-concept filtering that allows analysts to quickly drill down to scientific findings of interest in PubMed abstracts and to explore relations among abstracts and extracted concepts in place and time. The approach enables analysts to: find publications without knowing all relevant query parameters, recognize unanticipated geographic relations within and among documents in multiple health domains, identify the thematic emphasis of research targeting particular places, notice changes in concepts over time, and notice changes in places where concepts are emphasized. PubMed is a database of over 19 million biomedical abstracts and citations maintained by the National Center for Biotechnology Information; achieving quick filtering is an important contribution due to the database size. Including geography in filters is important due to rapidly escalating attention to geographic factors in public
HEALTH GeoJunction: place-time-concept browsing of health publications

Directory of Open Access Journals (Sweden)

Turton Ian J

2010-05-01

Full Text Available Background: The volume of health science publications is escalating rapidly. Thus, keeping up with developments is becoming harder as is the task of finding important cross-domain connections. When geographic location is a relevant component of research reported in publications, these tasks are more difficult because standard search and indexing facilities have limited or no ability to identify geographic foci in documents. This paper introduces HEALTH GeoJunction, a web application that supports researchers in the task of quickly finding scientific publications that are relevant geographically and temporally as well as thematically. Results: HEALTH GeoJunction is a geovisual analytics-enabled web application providing: (a) web services using computational reasoning methods to extract place-time-concept information from bibliographic data for documents and (b) visually-enabled place-time-concept query, filtering, and contextualizing tools that apply to both the documents and their extracted content. This paper focuses specifically on strategies for visually-enabled, iterative, facet-like, place-time-concept filtering that allows analysts to quickly drill down to scientific findings of interest in PubMed abstracts and to explore relations among abstracts and extracted concepts in place and time. The approach enables analysts to: find publications without knowing all relevant query parameters, recognize unanticipated geographic relations within and among documents in multiple health domains, identify the thematic emphasis of research targeting particular places, notice changes in concepts over time, and notice changes in places where concepts are emphasized. Conclusions: PubMed is a database of over 19 million biomedical abstracts and citations maintained by the National Center for Biotechnology Information; achieving quick filtering is an important contribution due to the database size. Including geography in filters is important due to
Dissection of the IgNAR V domain: molecular scanning and orthologue database mining define novel IgNAR hallmarks and affinity maturation mechanisms.

Science.gov (United States)

Fennell, B J; Darmanin-Sheehan, A; Hufton, S E; Calabro, V; Wu, L; Müller, M R; Cao, W; Gill, D; Cunningham, O; Finlay, W J J

2010-07-09

The shark antigen-binding V(NAR) domain has the potential to provide an attractive alternative to traditional biotherapeutics based on its small size, advantageous physiochemical properties, and unusual ability to target clefts in enzymes or cell surface molecules. The V(NAR) shares many of the properties of the well-characterised single-domain camelid V(H)H but is much less understood at the molecular level. We chose the hen-egg-lysozyme-specific archetypal Type I V(NAR) 5A7 and used ribosome display in combination with error-prone mutagenesis to interrogate the entire sequence space. We found a high level of mutational plasticity across the V(NAR) domain, particularly within the framework 2 and hypervariable region 2 regions. A number of residues important for affinity were identified, and a triple mutant combining A1D, S61R, and G62R resulted in a K(D) of 460 pM for hen egg lysozyme, a 20-fold improvement over wild-type 5A7, and the highest K(D) yet reported for V(NAR)-antigen interactions. These findings were rationalised using structural modelling and indicate the importance of residues outside the classical complementarity determining regions in making novel antigen contacts that modulate affinity. We also located two solvent-exposed residues (G15 and G42), distant from the V(NAR) paratope, which retain function upon mutation to cysteine and have the potential to be exploited as sites for targeted covalent modification. Our findings with 5A7 were extended to all known NAR structures using an in-depth bioinformatic analysis of sequence data available in the literature and a newly generated V(NAR) database. This study allowed us to identify, for the first time, both V(NAR)-specific and V(NAR)/Ig V(L)/TCR V(alpha) overlapping hallmark residues, which are critical for the structural and functional integrity of the single domain. Intriguingly, each of our designated V(NAR)-specific hallmarks align precisely with previously defined mutational 'cold spots' in
Assessing Data Quality in Emergent Domains of Earth Sciences

Science.gov (United States)

Darch, P. T.; Borgman, C.

2016-12-01

As earth scientists seek to study known phenomena in new ways, and to study new phenomena, they often develop new technologies and new methods such as embedded network sensing, or reapply extant technologies, such as seafloor drilling. Emergent domains are often highly multidisciplinary as researchers from many backgrounds converge on new research questions. They may adapt existing methods, or develop methods de novo. As a result, emerging domains tend to be methodologically heterogeneous. As these domains mature, pressure to standardize methods increases. Standardization promotes trust, reliability, accuracy, and reproducibility, and simplifies data management. However, for standardization to occur, researchers must be able to assess which of the competing methods produces the highest quality data. The exploratory nature of emerging domains discourages standardization. Because competing methods originate in different disciplinary backgrounds, their scientific credibility is difficult to compare. Instead of direct comparison, researchers attempt to conduct meta-analyses. Scientists compare datasets produced by different methods to assess their consistency and efficiency. This paper presents findings from a long-term qualitative case study of research on the deep subseafloor biosphere, an emergent domain. A diverse community converged on the study of microbes in the seafloor and those microbes' interactions with the physical environments they inhabit. Data on this problem are scarce, leading to calls for standardization as a means to acquire and analyze greater volumes of data. Lacking consistent methods, scientists attempted to conduct meta-analyses to determine the most promising methods on which to standardize. Among the factors that inhibited meta-analyses were disparate approaches to metadata and to curating data. Datasets may be deposited in a variety of databases or kept on individual scientists' servers. Associated metadata may be inconsistent or hard to
Enhanced Publications Linking Publications and Research Data in Digital Repositories

CERN Document Server

Vernooy-Gerritsen, Marjan

2009-01-01

The traditional publication will be overhauled by the 'Enhanced Publication'. This is a publication that is enhanced with research data, extra materials, post publication data, and database records. It has an object-based structure with explicit l
The SH2 domain interaction landscape.

Science.gov (United States)

Tinti, Michele; Kiemer, Lars; Costa, Stefano; Miller, Martin L; Sacco, Francesca; Olsen, Jesper V; Carducci, Martina; Paoluzi, Serena; Langone, Francesca; Workman, Christopher T; Blom, Nikolaj; Machida, Kazuya; Thompson, Christopher M; Schutkowski, Mike; Brunak, Søren; Mann, Matthias; Mayer, Bruce J; Castagnoli, Luisa; Cesareni, Gianni

2013-04-25

Members of the SH2 domain family modulate signal transduction by binding to short peptides containing phosphorylated tyrosines. Each domain displays a distinct preference for the sequence context of the phosphorylated residue. We have developed a high-density peptide chip technology that allows for probing of the affinity of most SH2 domains for a large fraction of the entire complement of tyrosine phosphopeptides in the human proteome. Using this technique, we have experimentally identified thousands of putative SH2-peptide interactions for more than 70 different SH2 domains. By integrating this rich data set with orthogonal context-specific information, we have assembled an SH2-mediated probabilistic interaction network, which we make available as a community resource in the PepspotDB database. A predicted dynamic interaction between the SH2 domains of the tyrosine phosphatase SHP2 and the phosphorylated tyrosine in the extracellular signal-regulated kinase activation loop was validated by experiments in living cells. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

Comprehensive T-matrix Reference Database: A 2009-2011 Update

Science.gov (United States)

Zakharova, Nadezhda T.; Videen, G.; Khlebtsov, Nikolai G.

2012-01-01

The T-matrix method is one of the most versatile and efficient theoretical techniques widely used for the computation of electromagnetic scattering by single and composite particles, discrete random media, and particles in the vicinity of an interface separating two half-spaces with different refractive indices. This paper presents an update to the comprehensive database of peer-reviewed T-matrix publications compiled by us previously and includes the publications that appeared since 2009. It also lists several earlier publications not included in the original database.
Comprehensive T-Matrix Reference Database: A 2007-2009 Update

Science.gov (United States)

Mishchenko, Michael I.; Zakharova, Nadia T.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

2010-01-01

The T-matrix method is among the most versatile, efficient, and widely used theoretical techniques for the numerically exact computation of electromagnetic scattering by homogeneous and composite particles, clusters of particles, discrete random media, and particles in the vicinity of an interface separating two half-spaces with different refractive indices. This paper presents an update to the comprehensive database of T-matrix publications compiled by us previously and includes the publications that appeared since 2007. It also lists several earlier publications not included in the original database.
A community effort to construct a gravity database for the United States and an associated Web portal

Science.gov (United States)

Keller, Gordon R.; Hildenbrand, T.G.; Kucks, R.; Webring, M.; Briesacher, A.; Rujawitz, K.; Hittleman, A.M.; Roman, D.R.; Winester, D.; Aldouri, R.; Seeley, J.; Rasillo, J.; Torres, R.; Hinze, W. J.; Gates, A.; Kreinovich, V.; Salayandia, L.

2006-01-01

Potential field data (gravity and magnetic measurements) are both useful and cost-effective tools for many geologic investigations. Significant amounts of these data are traditionally in the public domain. A new magnetic database for North America was released in 2002, and as a result, a cooperative effort between government agencies, industry, and universities to compile an upgraded digital gravity anomaly database, grid, and map for the conterminous United States was initiated and is the subject of this paper. This database is being crafted into a data system that is accessible through a Web portal. This data system features the database, software tools, and convenient access. The Web portal will enhance the quality and quantity of data contributed to the gravity database that will be a shared community resource. The system's totally digital nature ensures that it will be flexible so that it can grow and evolve as new data, processing procedures, and modeling and visualization tools become available. Another goal of this Web-based data system is facilitation of the efforts of researchers and students who wish to collect data from regions currently not represented adequately in the database. The primary goal of upgrading the United States gravity database and this data system is to provide more reliable data that support societal and scientific investigations of national importance. An additional motivation is the international intent to compile an enhanced North American gravity database, which is critical to understanding regional geologic features, the tectonic evolution of the continent, and other issues that cross national boundaries. © 2006 Geological Society of America. All rights reserved.
The PMDB Protein Model Database

Science.gov (United States)

Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

2006-01-01

The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873
PRGPred: A platform for prediction of domains of resistance gene analogue (RGA) in Arecaceae developed using machine learning algorithms

Directory of Open Access Journals (Sweden)

MATHODIYIL S. MANJULA

2015-12-01

Full Text Available Plant disease resistance genes (R-genes) are responsible for initiation of defense mechanisms against various phytopathogens. The majority of plant R-genes are members of very large multi-gene families, which encode structurally related proteins containing nucleotide binding site domains (NBS) and C-terminal leucine rich repeats (LRR). Other classes possess an extracellular LRR domain, a transmembrane domain and, sometimes, an intracellular serine/threonine kinase domain. R-proteins work in pathogen perception and/or the activation of conserved defense signaling networks. In the present study, sequences representing resistance gene analogues (RGAs) of coconut, arecanut, oil palm and date palm were collected from NCBI, sorted based on domains and assembled into a database. The sequences were analyzed in the PRINTS database to find the conserved domains and their motifs present in the RGAs. Based on these domains, we have also developed a tool to predict the domains of palm R-genes using various machine learning algorithms. The model files were selected based on the performance of the best classifier in training and testing. All this information is stored and made available in the online 'PRGpred' database and prediction tool.
“NaKnowBase”: A Nanomaterials Relational Database

Science.gov (United States)

NaKnowBase is an internal relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations...
The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans

International Nuclear Information System (INIS)

2011-01-01

Purpose: The development of computer-aided diagnostic (CAD) methods for lung nodule detection, classification, and quantitative assessment can be facilitated through a well-characterized repository of computed tomography (CT) scans. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) completed such a database, establishing a publicly available reference for the medical imaging research community. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process. Methods: Seven academic centers and eight medical imaging companies collaborated to identify, address, and resolve challenging organizational, technical, and clinical issues to provide a solid foundation for a robust database. The LIDC/IDRI Database contains 1018 cases, each of which includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule ≥3 mm," "nodule <3 mm," and "non-nodule ≥3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Results: The Database contains 7371 lesions marked "nodule" by at least one radiologist. 2669 of these lesions were marked "nodule ≥3 mm" by at least one radiologist, of which 928 (34.7%) received such marks from all
Trend of R and D publications in pressurised heavy water reactors: A study using INIS and other databases

International Nuclear Information System (INIS)

Kumar, V.; Kalyane, V.L.; Prakasan, E.R.; Kumar, A.; Sagar, A.; Mohan, L.

2004-01-01

The digital databases INIS (1970-2002), INSPEC (1969-2002), Chemical Abstracts (1977-2002), ISMEC (1973-June 2002), Web of Science (1974-2002), and Science Citation Index (1982-2002) were used for comprehensive retrieval of bibliographic details of research publications on Pressurized Heavy Water Reactor (PHWR) research. Among the countries contributing to PHWR research, India (with 1737 papers) is the forerunner, followed by Canada (1492), Romania (508) and Argentina (334). Collaboration of Canadian researchers with researchers of other countries resulted in 75 publications. Among the most productive researchers in this field, the first 15 are from India. The top three contributors to PHWR publications, with their respective authorship credits, are: H.S. Kushwaha (106), Anil Kakodkar (100) and V. Venkat Raj (76). Prominent interdomainary interactions in PHWR subfields are: Specific nuclear reactors and associated plants with General studies of nuclear reactors (481), followed by Environmental sciences (185), and Materials science (154). The number of publications dealing with the Geosciences aspect of environmental sciences is 141. Romania, Argentina, India and the Republic of Korea have mostly (≥75%) used non-conventional media for publications. Out of the 4851 publications, 1228 have been published in 292 distinct journals. The top journals publishing PHWR papers are: Radiation Protection and Environment (continued from: Bulletin of Radiation Protection since 1997), India (115); Nuclear Engineering International, UK (84); and Transactions of the American Nuclear Society, USA (68). (author)
A database of archived drilling records of the drill cuttings piles at the North West Hutton oil platform

International Nuclear Information System (INIS)

Marsh, Roy

2003-01-01

Drill cuttings piles are found underneath several hundred oil platforms in the North Sea, and are contaminated with hydrocarbons and chemical products. This study characterised the environmental risk posed by the cuttings pile at the North West Hutton (NWH) oil platform. Data on the drilling fluids and chemical products used over the platform's drilling history were transferred from archived well reports into a custom database, to which were added toxicological and safety data. Although the database contained many gaps, it established that only seven chemical products used at NWH were not in the lowest category of the Offshore Chemicals Notification Scheme, and were used in only small quantities. The study therefore supports the view that the main environmental risk posed by cuttings piles comes from hydrocarbon contamination. The (dated) well records could help future core sampling to be targeted at specific locations in the cuttings piles. Data from many platforms could also be pooled to determine generic 'discharge profiles.' Future study would benefit from the existence, in the public domain, of a standardised, 'legacy' database of chemical products
Advancements in web-database applications for rabies surveillance

Directory of Open Access Journals (Sweden)

Bélanger Denise

2011-08-01

Full Text Available Background: Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. Results: RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements that RageDB brings to rabies surveillance databases include (1) automatic integration of multi-agency data and diagnostic results on a daily basis; (2) a web-based data editing interface that enables authorized users to add, edit and extract data; and (3) an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies-suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. Conclusions: RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from
On the use of databases about research performance

NARCIS (Netherlands)

Rodela, Romina

2016-01-01

The accuracy of interdisciplinarity measurements depends on how well the data is used for this purpose and whether it can meaningfully inform about work that crosses disciplinary domains. At present, there are no ad hoc databases compiling information only and exclusively about interdisciplinary
The Danish Urogynaecological Database

DEFF Research Database (Denmark)

Guldberg, Rikke; Brostrøm, Søren; Hansen, Jesper Kjær

2013-01-01

INTRODUCTION AND HYPOTHESIS: The Danish Urogynaecological Database (DugaBase) is a nationwide clinical database established in 2006 to monitor, ensure and improve the quality of urogynaecological surgery. We aimed to describe its establishment and completeness and to validate selected variables. This is the first study based on data from the DugaBase. METHODS: The database completeness was calculated as a comparison between urogynaecological procedures reported to the Danish National Patient Registry and to the DugaBase. Validity was assessed for selected variables from a random sample of 200 women in the DugaBase from 1 January 2009 to 31 October 2010, using medical records as a reference. RESULTS: A total of 16,509 urogynaecological procedures were registered in the DugaBase by 31 December 2010. The database completeness has increased by calendar time, from 38.2 % in 2007 to 93.2 % in 2010 for public...
The architectural design of networks of protein domain architectures.

Science.gov (United States)

Hsu, Chia-Hsin; Chen, Chien-Kuo; Hwang, Ming-Jing

2013-08-23

Protein domain architectures (PDAs), in which single domains are linked to form multiple-domain proteins, are a major molecular form used by evolution for the diversification of protein functions. However, the design principles of PDAs remain largely uninvestigated. In this study, we constructed networks to connect domain architectures that had grown out from the same single domain for every single domain in the Pfam-A database and found that there are three main distinctive types of these networks, which suggests that evolution can exploit PDAs in three different ways. Further analysis showed that these three different types of PDA networks are each adopted by different types of protein domains, although many networks exhibit the characteristics of more than one of the three types. Our results shed light on nature's blueprint for protein architecture and provide a framework for understanding architectural design from a network perspective.
“NaKnowBase”: A Nanomaterials Relational Database

Science.gov (United States)

NaKnowBase is a relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations and poten...
dBBQs: dataBase of Bacterial Quality scores

OpenAIRE

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-01-01

Background: It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth in sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of genome sequences available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from al...
Database management in the new GANIL control system

International Nuclear Information System (INIS)

Lecorche, E.; Lermine, P.

1993-01-01

At the start of the new control system design, the decision was made to manage the huge amount of data by means of a database management system. The first implementations built on the INGRES relational database are described. Real-time and data management domains are shown, and problems induced by Ada/SQL interfacing are briefly discussed. Database management concerns the whole hardware and software configuration for the GANIL pieces of equipment and the alarm system, both for the alarm configuration and for the alarm logs. Another field of application encompasses the archiving of beam parameters as a function of the various kinds of beams accelerated at GANIL (ion species, energies, charge states). (author) 3 refs., 4 figs
ARTI refrigerant database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M.

1998-03-15

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to thermophysical properties, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air conditioning and refrigeration equipment. It also references documents addressing compatibility of refrigerants and lubricants with other materials.
Methods and Techniques for the Design and Implementation of Domain-Specific Languages

NARCIS (Netherlands)

Hemel, Z.

2012-01-01

Domain-Specific Languages (DSLs) are programming languages aimed at a particular problem domain, e.g. banking, database querying or website page layouts. Through the use of high-level concepts, a DSL raises the level of abstraction and expressive power of the programmer, and reduces the size of
Correlates of Access to Business Research Databases

Science.gov (United States)

Gottfried, John C.

2010-01-01

This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.…
A Database Approach to Distributed State Space Generation

NARCIS (Netherlands)

Blom, Stefan; Lisser, Bert; van de Pol, Jan Cornelis; Weber, M.

2007-01-01

We study distributed state space generation on a cluster of workstations. It is explained why state space partitioning by a global hash function is problematic when states contain variables from unbounded domains, such as lists or other recursive datatypes. Our solution is to introduce a database

A Database Approach to Distributed State Space Generation

NARCIS (Netherlands)

Blom, Stefan; Lisser, Bert; van de Pol, Jan Cornelis; Weber, M.; Cerna, I.; Haverkort, Boudewijn R.H.M.

2008-01-01

We study distributed state space generation on a cluster of workstations. It is explained why state space partitioning by a global hash function is problematic when states contain variables from unbounded domains, such as lists or other recursive datatypes. Our solution is to introduce a database
Bibliographical database of radiation biological dosimetry and risk assessment: Part 1, through June 1988

Energy Technology Data Exchange (ETDEWEB)

Straume, T.; Ricker, Y.; Thut, M.

1988-08-29

This database was constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and keyworded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on publication number, authors, key words, title, year, and journal name. Photocopies of all publications contained in the database are maintained in a file that is numerically arranged by citation number. This report of the database is provided as a useful reference and overview. It should be emphasized that the database will grow as new citations are added to it. With that in mind, we arranged this report in order of ascending citation number so that follow-up reports will simply extend this document. The database cites 1212 publications. Publications are from 119 different scientific journals; 27 of these journals are cited at least 5 times. It also contains references to 42 books and published symposia, and 129 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed among the scientific literature, although a few journals clearly dominate. The four journals publishing the largest number of relevant papers are Health Physics, Mutation Research, Radiation Research, and International Journal of Radiation Biology. Publications in Health Physics make up almost 10% of the current database.
Bibliographical database of radiation biological dosimetry and risk assessment: Part 1, through June 1988

International Nuclear Information System (INIS)

Straume, T.; Ricker, Y.; Thut, M.

1988-01-01

This database was constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and keyworded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on publication number, authors, key words, title, year, and journal name. Photocopies of all publications contained in the database are maintained in a file that is numerically arranged by citation number. This report of the database is provided as a useful reference and overview. It should be emphasized that the database will grow as new citations are added to it. With that in mind, we arranged this report in order of ascending citation number so that follow-up reports will simply extend this document. The database cites 1212 publications. Publications are from 119 different scientific journals; 27 of these journals are cited at least 5 times. It also contains references to 42 books and published symposia, and 129 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed among the scientific literature, although a few journals clearly dominate. The four journals publishing the largest number of relevant papers are Health Physics, Mutation Research, Radiation Research, and International Journal of Radiation Biology. Publications in Health Physics make up almost 10% of the current database
Analysis of isotropic turbulence using a public database and the Web service model, and applications to study subgrid models

Science.gov (United States)

Meneveau, Charles; Yang, Yunke; Perlman, Eric; Wan, Minpin; Burns, Randal; Szalay, Alex; Chen, Shiyi; Eyink, Gregory

2008-11-01

A public database system archiving a direct numerical simulation (DNS) data set of isotropic, forced turbulence is used for studying basic turbulence dynamics. The data set consists of the DNS output on 1024-cubed spatial points and 1024 time-samples spanning about one large-scale turn-over timescale. This complete space-time history of turbulence is accessible to users remotely through an interface that is based on the Web-services model (see http://turbulence.pha.jhu.edu). Users may write and execute analysis programs on their host computers, while the programs make subroutine-like calls that request desired parts of the data over the network. The architecture of the database is briefly explained, as are some of the new functions such as Lagrangian particle tracking and spatial box-filtering. These tools are used to evaluate and compare subgrid stresses and models.
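As an illustration of the box-filtering and subgrid-stress evaluation mentioned in this abstract, the following is a minimal one-dimensional sketch in Python. It does not use the database's Web-service interface; the function names `box_filter` and `subgrid_stress`, the periodic boundary handling, and the sample values are assumptions made for the example. It applies the standard definition of the subgrid stress, the filtered product minus the product of the filtered fields.

```python
def box_filter(values, half_width):
    """Periodic top-hat (box) filter: mean over 2*half_width + 1 neighbours."""
    n = len(values)
    width = 2 * half_width + 1
    return [
        sum(values[(i + k) % n] for k in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


def subgrid_stress(u, half_width):
    """tau = filter(u*u) - filter(u)**2 for a 1-D velocity sample (hypothetical helper)."""
    uu_bar = box_filter([x * x for x in u], half_width)
    u_bar = box_filter(u, half_width)
    return [a - b * b for a, b in zip(uu_bar, u_bar)]


if __name__ == "__main__":
    u = [0.0, 3.0, 0.0, 0.0]          # made-up velocity samples
    print(box_filter(u, 1))
    print(subgrid_stress(u, 1))
```

For a box filter with equal weights the resulting stress is a local variance, so it is non-negative and vanishes for a uniform field.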
Data Model and Relational Database Design for Highway Runoff Water-Quality Metadata

Science.gov (United States)

Granato, Gregory E.; Tessler, Steven

2001-01-01

A National highway and urban runoff water-quality metadatabase was developed by the U.S. Geological Survey in cooperation with the Federal Highway Administration as part of the National Highway Runoff Water-Quality Data and Methodology Synthesis (NDAMS). The database was designed to catalog available literature and to document results of the synthesis in a format that would facilitate current and future research on highway and urban runoff. This report documents the design and implementation of the NDAMS relational database, which was designed to provide a catalog of available information and the results of an assessment of the available data. All the citations and the metadata collected during the review process are presented in a stratified metadatabase that contains citations for relevant publications, abstracts (or previa), and report-review metadata for a sample of selected reports that document results of runoff quality investigations. The database is referred to as a metadatabase because it contains information about available data sets rather than a record of the original data. The database contains the metadata needed to evaluate and characterize how valid, current, complete, comparable, and technically defensible published and available information may be when evaluated for application to the different data-quality objectives as defined by decision makers. This database is a relational database, in that all information is ultimately linked to a given citation in the catalog of available reports. The main database file contains 86 tables consisting of 29 data tables, 11 association tables, and 46 domain tables. The data tables all link to a particular citation, and each data table is focused on one aspect of the information collected in the literature search and the evaluation of available information. This database is implemented in the Microsoft (MS) Access database software because it is widely used within and outside of government and is familiar to many
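The three table roles named in this abstract (data, association, and domain tables, all ultimately linked to a citation) can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module rather than MS Access, and every table name, column name, and row is hypothetical; none of it is taken from the NDAMS schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE citation (
    citation_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL
);
-- Domain table: the fixed vocabulary allowed in a coded column.
CREATE TABLE dom_medium (
    medium_code TEXT PRIMARY KEY,
    description TEXT NOT NULL
);
-- Data table: one aspect of a reviewed report, tied to its citation.
CREATE TABLE report_review (
    review_id   INTEGER PRIMARY KEY,
    citation_id INTEGER NOT NULL REFERENCES citation(citation_id),
    n_sites     INTEGER
);
-- Association table: many-to-many link between reviews and media.
CREATE TABLE review_medium (
    review_id   INTEGER NOT NULL REFERENCES report_review(review_id),
    medium_code TEXT NOT NULL REFERENCES dom_medium(medium_code),
    PRIMARY KEY (review_id, medium_code)
);
"""


def build_catalog():
    """Create an in-memory catalog with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO citation VALUES (1, 'Hypothetical runoff report')")
    conn.executemany("INSERT INTO dom_medium VALUES (?, ?)",
                     [("WAT", "water"), ("SED", "sediment")])
    conn.execute("INSERT INTO report_review VALUES (10, 1, 4)")
    conn.executemany("INSERT INTO review_medium VALUES (?, ?)",
                     [(10, "WAT"), (10, "SED")])
    conn.commit()
    return conn


def media_for_citation(conn, citation_id):
    """Follow citation -> data table -> association table -> domain table."""
    rows = conn.execute(
        """SELECT d.description
           FROM report_review r
           JOIN review_medium rm ON rm.review_id = r.review_id
           JOIN dom_medium d ON d.medium_code = rm.medium_code
           WHERE r.citation_id = ?
           ORDER BY d.description""",
        (citation_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With foreign keys enforced, a coded value that is missing from the domain table is rejected on insert, which is the usual reason for keeping such vocabularies in separate tables.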
The Danish Inguinal Hernia database

DEFF Research Database (Denmark)

Friis-Andersen, Hans; Bisgaard, Thue

2016-01-01

AIM OF DATABASE: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. STUDY POPULATION: Patients ≥18 years operated for groin hernia. MAIN VARIABLES: Type and size... access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. RESULTS: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). CONCLUSION: The Danish Inguinal Hernia...
Causal Analysis of Databases Concerning Electromagnetism and Health

Directory of Open Access Journals (Sweden)

Kristian Alonso-Stenberg

2016-12-01

Full Text Available In this article, we conducted a causal analysis of a system extracted from a database of current data in the telecommunications domain, namely the Eurobarometer 73.3 database, which arose from a survey of 26,602 EU citizens on the potential health effects that electromagnetic fields can produce. To determine the cause-effect relationships between variables, we represented these data by a directed graph to which a qualitative version of the theory of discrete chaos can be applied to highlight causal circuits and attractors, as these are basic elements of system behavior.
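If a "causal circuit" is read as a directed cycle in the cause-effect graph, finding such circuits can be sketched with a small cycle enumeration in Python. This is only an illustration of that one ingredient, not the authors' qualitative discrete-chaos method; the function name `simple_cycles` and the variable names in the example graph are hypothetical and are not taken from the Eurobarometer data.

```python
def simple_cycles(graph):
    """Enumerate simple directed cycles of a small graph.

    `graph` maps each node to a list of successor nodes. Each cycle is
    reported once, as a list starting at its smallest node.
    """
    cycles = []
    for start in sorted(graph):
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt == start:
                    # Closed a loop back to the starting node.
                    cycles.append(path)
                elif nxt > start and nxt not in path:
                    # Only visit larger nodes so a cycle is not repeated
                    # from each of its members.
                    stack.append((nxt, path + [nxt]))
    return cycles


if __name__ == "__main__":
    # Made-up cause -> effect edges between survey variables.
    graph = {
        "concern": ["information_seeking"],
        "information_seeking": ["perceived_risk"],
        "perceived_risk": ["concern"],
        "age": ["concern"],
    }
    print(simple_cycles(graph))
```

The depth-first enumeration is exponential in the worst case, which is acceptable for graphs with a few dozen variables but not for large networks.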
Protocol for developing a Database of Zoonotic disease Research in India (DoZooRI).

Science.gov (United States)

Chatterjee, Pranab; Bhaumik, Soumyadeep; Chauhan, Abhimanyu Singh; Kakkar, Manish

2017-12-10

Zoonotic and emerging infectious diseases (EIDs) represent a public health threat that has been acknowledged only recently although they have been on the rise for the past several decades. On average, every year since the Second World War, one pathogen has emerged or re-emerged on a global scale. Low/middle-income countries such as India bear a significant burden of zoonotic and EIDs. We propose that the creation of a database of published, peer-reviewed research will open up avenues for evidence-based policymaking for targeted prevention and control of zoonoses. A large-scale systematic mapping of the published peer-reviewed research conducted in India will be undertaken. All published research will be included in the database, without any prejudice for quality screening, to broaden the scope of included studies. Structured search strategies will be developed for priority zoonotic diseases (leptospirosis, rabies, anthrax, brucellosis, cysticercosis, salmonellosis, bovine tuberculosis, Japanese encephalitis and rickettsial infections), and multiple databases will be searched for studies conducted in India. The database will be managed and hosted on a cloud-based platform called Rayyan. Individual studies will be tagged based on key preidentified parameters (disease, study design, study type, location, randomisation status and interventions, host involvement and others, as applicable). The database will incorporate already published studies, obviating the need for additional ethical clearances. The database will be made available online, and in collaboration with multisectoral teams, domains of enquiries will be identified and subsequent research questions will be raised. The database will be queried for these and resulting evidence will be analysed and published in peer-reviewed journals.
Bibliometric analysis of publications on wine tourism in the databases Scopus and WoS

Directory of Open Access Journals (Sweden)

Amador Durán Sánchez

2017-01-01

Full Text Available The aim of this study was to show the current state of scientific research regarding wine tourism, by comparing the scientific information platforms WoS and Scopus and applying quantitative methods. For this purpose, a bibliometric study of the publications indexed in WoS and Scopus was conducted, analyzing the correlation between increases, coverage, overlap, dispersion and concentration of documents. The search process yielded a set of 238 articles and 122 different journals. Based on the results of the comparative study, we conclude that the WoS and Scopus databases differ in scope, data volume and coverage policies, with a high degree of unique sources and articles, which makes them complementary rather than mutually exclusive. Scopus covers the area of wine tourism better, by including a greater number of journals, papers and signatures.
The PEP-II project-wide database

International Nuclear Information System (INIS)

Chan, A.; Calish, S.; Crane, G.; MacGregor, I.; Meyer, S.; Wong, J.

1995-05-01

The PEP-II Project Database is a tool for monitoring the technical and documentation aspects of this accelerator construction. It holds the PEP-II design specifications, fabrication and installation data in one integrated system. Key pieces of the database include the machine parameter list, magnet and vacuum fabrication data, CAD drawings, publications and documentation, survey and alignment data, and property control. The database can be extended to contain information required for the operations phase of the accelerator and detector. Features such as viewing CAD drawing graphics from the database will be implemented in the future. This central Oracle database on a UNIX server is built using ORACLE Case tools. Users at the three collaborating laboratories (SLAC, LBL, LLNL) can access the data remotely, using various desktop computer platforms and graphical interfaces
Bibliometric analysis of Spanish scientific publications in the subject Construction & Building Technology in Web of Science database (1997-2008)

OpenAIRE

Rojas-Sola, J. I.; de San-Antonio-Gómez, C.

2010-01-01

This paper analyzes the publications from Spanish institutions in the journals listed under the Construction & Building Technology subject of the Web of Science database for the period 1997-2008. The publications appeared in 35 journals, and the number of articles was 760 (Article or Review). A bibliometric assessment was also carried out, and we propose two new parameters: Weighted Impact Factor and Relative Impact Factor; the study also includes the number of citations and the number of documents ...
Concomitant prediction of function and fold at the domain level with GO-based profiles.

Science.gov (United States)

Lopez, Daniel; Pazos, Florencio

2013-01-01

Predicting the function of newly sequenced proteins is crucial due to the pace at which these raw sequences are being obtained. Almost all resources for predicting protein function assign functional terms to whole chains, and do not distinguish which particular domain is responsible for the allocated function. This is not a limitation of the methodologies themselves; rather, it arises because the databases of functional annotations that these methods use for transferring functional terms to new proteins are annotated on a whole-chain basis. Nevertheless, domains are the basic evolutionary and often functional units of proteins. In many cases, the domains of a protein chain have distinct molecular functions, independent from each other. For that reason, resources with functional annotations at the domain level, as well as methodologies for predicting function for individual domains adapted to these resources, are required. We present a methodology for predicting the molecular function of individual domains, based on a previously developed database of functional annotations at the domain level. The approach, which we show outperforms a standard method based on sequence searches in assigning function, concomitantly predicts the structural fold of the domains and can give hints on the functionally important residues associated with the predicted function.
Type Error Customization for Embedded Domain-Specific Languages

NARCIS (Netherlands)

Serrano Mena, Alejandro

2018-01-01

Domain-specific languages (DSLs) are a widely used technique in the programming world, since they make communication between experts and developers more fluid. Some well-known examples are SQL for databases and HTML for web page description. There are two different approaches to developing DSLs:
The Vocational Guidance Research Database: A Scientometric Approach

Science.gov (United States)

Flores-Buils, Raquel; Gil-Beltran, Jose Manuel; Caballer-Miedes, Antonio; Martinez-Martinez, Miguel Angel

2012-01-01

The scientometric study of scientific output through publications in specialized journals cannot be undertaken exclusively with the databases available today. For this reason, the objective of this article is to introduce the "Base de Datos de Investigacion en Orientacion Vocacional" [Vocational Guidance Research Database], based on the…
CD-ROM-aided Databases

Science.gov (United States)

Masuyama, Keiichi

CD-ROM has rapidly evolved as a new information medium with large capacity. In the U.S. it is predicted that it will become a two-hundred-billion-yen market in three years, and thus CD-ROM is a strategic target of the database industry. Here in Japan the movement toward its commercialization has been active since this year. Will the CD-ROM business ever conquer the information market as an on-disk database or electronic publication? Referring to some cases of applications in the U.S., the author reviews the marketability and future trend of this new optical disk medium.
The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans

Energy Technology Data Exchange (ETDEWEB)

NONE

2011-02-15

Purpose: The development of computer-aided diagnostic (CAD) methods for lung nodule detection, classification, and quantitative assessment can be facilitated through a well-characterized repository of computed tomography (CT) scans. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) completed such a database, establishing a publicly available reference for the medical imaging research community. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process. Methods: Seven academic centers and eight medical imaging companies collaborated to identify, address, and resolve challenging organizational, technical, and clinical issues to provide a solid foundation for a robust database. The LIDC/IDRI Database contains 1018 cases, each of which includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule≥3 mm," "nodule<3 mm," and "non-nodule≥3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Results: The Database contains 7371 lesions marked "nodule" by at least one radiologist. 2669 of these lesions were marked
An approach in building a chemical compound search engine in oracle database.

Science.gov (United States)

Wang, H; Volarath, P; Harrison, R

2005-01-01

Searching for or identifying chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and the database implementation. The database must process chemical structures, which demands approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework that works as a chemical compound search engine in an Oracle database is described. The framework is designed to eliminate data type constraints for potential search algorithms, which is a crucial step toward building a domain-specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.
WGDB: Wood Gene Database with search interface.

Science.gov (United States)

Goyal, Neha; Ginwal, H S

2014-01-01

Wood quality can be defined in terms of a particular end use and involves several traits. Over the last fifteen years, researchers have assessed wood quality traits in forest trees. Wood quality traits have been categorized as cell wall biochemical traits and fibre properties, including microfibril angle, density and stiffness in loblolly pine [1]. A user-friendly, open-access database named Wood Gene Database (WGDB) has been developed to describe wood genes along with protein information and published research articles. It contains 720 wood genes from species including Pinus and Deodar and fast-growing trees such as Poplar and Eucalyptus. WGDB is designed to encompass the majority of publicly accessible genes coding for cellulose, hemicellulose and lignin in tree species that are responsive to wood formation and quality. It is an interactive platform for collecting, managing and searching specific wood genes; it also enables data mining related to genomic information, specifically in Arabidopsis thaliana, Populus trichocarpa, Eucalyptus grandis, Pinus taeda, Pinus radiata, Cedrus deodara and Cedrus atlantica. For user convenience, the database is cross-linked with the public databases NCBI, EMBL and Dendrome and with the Google search engine to make it more informative, and it provides the bioinformatics tools BLAST and COBALT. The database is freely available at www.wgdb.in.
Large-scale Health Information Database and Privacy Protection

OpenAIRE

YAMAMOTO, Ryuichi

2016-01-01

Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law...
Development of a Consumer Product Ingredient Database for ...

Science.gov (United States)

Consumer products are a primary source of chemical exposures, yet little structured information is available on the chemical ingredients of these products and the concentrations at which ingredients are present. To address this data gap, we created a database of chemicals in consumer products using product Material Safety Data Sheets (MSDSs) publicly provided by a large retailer. The resulting database represents 1797 unique chemicals mapped to 8921 consumer products and a hierarchy of 353 consumer product “use categories” within a total of 15 top-level categories. We examine the utility of this database and discuss ways in which it will support (i) exposure screening and prioritization, (ii) generic or framework formulations for several indoor/consumer product exposure modeling initiatives, (iii) candidate chemical selection for monitoring near field exposure from proximal sources, and (iv) as activity tracers or ubiquitous exposure sources using “chemical space” map analyses. Chemicals present at high concentrations and across multiple consumer products and use categories that hold high exposure potential are identified. Our database is publicly available to serve regulators, retailers, manufacturers, and the public for predictive screening of chemicals in new and existing consumer products on the basis of exposure and risk. The National Exposure Research Laboratory’s (NERL’s) Human Exposure and Atmospheric Sciences Division (HEASD) conducts resear

The AMMA database

Science.gov (United States)

Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim

2010-05-01

The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaign datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed in the same way as the satellite products. Before accessing the data, every user has to sign the AMMA data and publication policy. This charter only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and usage for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication, are also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using a unique web portal. This website is composed of different modules: - Registration: forms to register, read and sign the data use charter when a user visits for the first time - Data access interface: user-friendly tool for building a data extraction request by selecting various criteria like location, time, parameters... The request can
Fund Finder: A case study of database-to-ontology mapping

OpenAIRE

Barrasa Rodríguez, Jesús; Corcho, Oscar; Gómez-Pérez, A.

2003-01-01

The mapping between databases and ontologies is a basic problem when trying to "upgrade" deep web content to the semantic web. Our approach suggests the declarative definition of mappings as a way to achieve domain independency and reusability. A specific language (expressive enough to cover some real world mapping situations like lightly structured databases or not 1st normal form ones) is defined for this purpose. Along with this mapping description language, the ODEMapster processor is in ...
Systematic review of public health branding.

Science.gov (United States)

Evans, W Douglas; Blitstein, Jonathan; Hersey, James C; Renaud, Jeanette; Yaroch, Amy L

2008-12-01

Brands build relationships between consumers and products, services, or lifestyles by providing beneficial exchanges and adding value to their objects. Brands can be measured through associations that consumers hold for products and services. Public health brands are the associations that individuals hold for health behaviors, or lifestyles that embody multiple health behaviors. We systematically reviewed the literature on public health brands; developed a methodology for describing branded health messages and campaigns; and examined specific branding strategies across a range of topic areas, campaigns, and global settings. We searched the literature for published studies on public health branding available through all relevant, major online publication databases. Public health branding was operationalized as any manuscripts in the health, social science, and business literature on branding or brands in health promotion marketing. We developed formalized decision rules and applied them in identifying articles for review. We initially identified 154 articles and reviewed a final set of 37, 10 from Africa, Australia, and Europe. Branded health campaigns spanned most of the major domains of public health and numerous communication strategies and evaluation methodologies. Most studies provided clear information on planning, development, and evaluation of the branding effort, while some provided minimal information. Branded health messages typically are theory based, and there is a body of evidence on their behavior change effectiveness, especially in nutrition, tobacco control, and HIV/AIDS. More rigorous research is needed, however, on how branded health messages impact specific populations and behaviors.
Go Figure: Computer Database Adds the Personal Touch.

Science.gov (United States)

Gaffney, Jean; Crawford, Pat

1992-01-01

A database for recordkeeping for a summer reading club was developed for a public library system using an IBM PC and Microsoft Works. Use of the database resulted in more efficient program management, giving librarians more time to spend with patrons and enabling timely awarding of incentives. (LAE)
Effects of multi-domain interventions in (pre)frail elderly on frailty, functional, and cognitive status: a systematic review

Directory of Open Access Journals (Sweden)

Dedeyne L

2017-05-01

Full Text Available Lenore Dedeyne,1 Mieke Deschodt,2–4 Sabine Verschueren,5 Jos Tournoy,1,3 Evelien Gielen1,3 1Department of Clinical and Experimental Medicine, 2Department of Public Health and Primary Care, KU Leuven – University of Leuven, Leuven, Belgium; 3Department of Geriatric Medicine, University Hospitals Leuven, Leuven, Belgium; 4Department of Public Health, Institute of Nursing Science, University of Basel, Basel, Switzerland; 5Department of Rehabilitation Sciences, KU Leuven – University of Leuven, Heverlee, Belgium Background: Frailty is an aging syndrome caused by exceeding a threshold of decline across multiple organ systems leading to a decreased resistance to stressors. Treatment for frailty focuses on multi-domain interventions to target multiple affected functions in order to decrease the adverse outcomes of frailty. No systematic reviews on the effectiveness of multi-domain interventions exist in a well-defined frail population. Objectives: This systematic review aimed to determine the effect of multi-domain compared to mono-domain interventions on frailty status and score, cognition, muscle mass, strength and power, functional and social outcomes in (pre)frail elderly (≥65 years). It included interventions targeting two or more domains (physical exercise, nutritional, pharmacological, psychological, or social interventions) in participants defined as (pre)frail by an operationalized frailty definition. Methods: The databases PubMed, EMBASE, CINAHL, PEDro, CENTRAL, and the Cochrane Central register of Controlled Trials were searched from inception until September 14, 2016. Additional articles were searched by citation search, author search, and reference lists of relevant articles. The protocol for this review was registered on PROSPERO (CRD42016032905). Results: Twelve studies were included, reporting a large diversity of interventions in terms of content, duration, and follow-up period. Overall, multi-domain interventions tended to be more
A database of new zeolite-like materials.

Science.gov (United States)

Pophale, Ramdas; Cheeseman, Phillip A; Deem, Michael W

2011-07-21

We here describe a database of computationally predicted zeolite-like materials. These crystals were discovered by a Monte Carlo search for zeolite-like materials. Positions of Si atoms as well as unit cell, space group, density, and number of crystallographically unique atoms were explored in the construction of this database. The database contains over 2.6 M unique structures. Roughly 15% of these are within +30 kJ mol⁻¹ Si of α-quartz, the band in which most of the known zeolites lie. These structures have topological, geometrical, and diffraction characteristics that are similar to those of known zeolites. The database is the result of refinement by two interatomic potentials that both satisfy the Pauli exclusion principle. The database has been deposited in the publicly available PCOD database and in www.hypotheticalzeolites.net/database/deem/. This journal is © the Owner Societies 2011
Legume and Lotus japonicus Databases

DEFF Research Database (Denmark)

Hirakawa, Hideki; Mun, Terry; Sato, Shusei

2014-01-01

Since the genome sequence of Lotus japonicus, a model plant of family Fabaceae, was determined in 2008 (Sato et al. 2008), the genomes of other members of the Fabaceae family, soybean (Glycine max) (Schmutz et al. 2010) and Medicago truncatula (Young et al. 2011), have been sequenced. In this section, we introduce representative, publicly accessible online resources related to plant materials, integrated databases containing legume genome information, and databases for genome sequence and derived marker information of legume species including L. japonicus...
The Danish Depression Database

DEFF Research Database (Denmark)

Videbech, Poul Bror Hemming; Deleuran, Anette

2016-01-01

AIM OF DATABASE: The purpose of the Danish Depression Database (DDD) is to monitor and facilitate the improvement of the quality of the treatment of depression in Denmark. Furthermore, the DDD has been designed to facilitate research. STUDY POPULATION: Inpatients as well as outpatients with depression, aged above 18 years, and treated in the public psychiatric hospital system were enrolled. MAIN VARIABLES: Variables include whether the patient has been thoroughly somatically examined and has been interviewed about the psychopathology by a specialist in psychiatry. The Hamilton score as well as an evaluation of the risk of suicide are measured before and after treatment. Whether psychiatric aftercare has been scheduled for inpatients and the rate of rehospitalization are also registered. DESCRIPTIVE DATA: The database was launched in 2011. Every year since then ~5,500 inpatients and 7,500 outpatients...
The TMI-2 clean-up project collection and databases

International Nuclear Information System (INIS)

Osif, B.A.; Conkling, T.W.

1996-01-01

A publicly accessible collection containing several thousand of the videotapes, photographs, slides and technical reports generated during the clean-up of the TMI-2 reactor has been established by the Pennsylvania State University Libraries. The collection is intended to serve as a technical resource for the nuclear industry as well as the interested public. Two Internet-searchable databases describing the videotapes and technical reports have been created. The development and use of these materials and databases are described in this paper. (orig.)
Outputs and Growth of Primary Care Databases in the United Kingdom: Bibliometric Analysis

Directory of Open Access Journals (Sweden)

Zain Chaudhry

2017-10-01

Full Text Available Background: Electronic health database (EHD) data is increasingly used by researchers. The major United Kingdom EHDs are the ‘Clinical Practice Research Datalink’ (CPRD), ‘The Health Improvement Network’ (THIN) and ‘QResearch’. Over time, outputs from these databases have increased, but have not been evaluated. Objective: This study compares research outputs from CPRD, THIN and QResearch assessing growth and publication outputs over a 10-year period (2004-2013). CPRD was also reviewed separately over 20 years as a case study. Methods: Publications from CPRD and QResearch were extracted using the Science Citation Index (SCI) of the Thomson Scientific Institute for Scientific Information (Web of Science). THIN data was obtained from University College London and validated in Web of Science. All databases were analysed for growth in publications, the speciality areas and the journals in which their data have been published. Results: These databases collectively produced 1,296 publications over a ten-year period, with CPRD representing 63.6% (n=825) of papers, THIN 30.4% (n=394) and QResearch 5.9% (n=77). Pharmacoepidemiology and General Medicine were the most common specialities featured. Over the 9-year period (2004-2013), publications for THIN and QResearch have slowly increased over time, whereas CPRD publications have increased substantially in the last 4 years, with almost 75% of CPRD publications published in the past 9 years. Conclusion: These databases are enhancing scientific research and are growing yearly, but display variability in their growth. They could become more powerful research tools if the National Health Service and general practitioners can provide accurate and comprehensive data for inclusion in these databases.
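The publication shares quoted in the Results of the record above can be re-derived from the counts it reports. The following is only a minimal arithmetic sketch in Python; the variable names are illustrative and nothing beyond the quoted figures is taken from the study.

```python
# Publication counts and percentages as quoted in the abstract above.
counts = {"CPRD": 825, "THIN": 394, "QResearch": 77}
reported_pct = {"CPRD": 63.6, "THIN": 30.4, "QResearch": 5.9}

# The three counts should add up to the stated total of 1,296 publications.
total = sum(counts.values())


def share(name):
    """Return the percentage of all publications attributed to one database."""
    return 100.0 * counts[name] / total


for name in counts:
    print(f"{name}: {counts[name]} of {total} = {share(name):.2f}% "
          f"(reported: {reported_pct[name]}%)")
```

The recomputed shares agree with the reported ones to within 0.1 percentage point; any small difference comes from how the last digit was rounded or truncated.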
Development of radionuclide parameter database on internal contamination in nuclear emergencies

International Nuclear Information System (INIS)

Zhao Li; Xu Cuihua; Li Wenhong; Su Xu

2010-01-01

Objective: To develop a radionuclide parameter database on internal contamination in nuclear emergencies. Methods: By researching the radionuclide composition discharged in different nuclear emergencies, radionuclide parameters on physical decay, absorption and metabolism in the body were obtained from ICRP publications and some other publications. The database on internal contamination for nuclear incidents was developed by using MS Visual Studio 2005 C and MS Access programming language. Results: The radionuclide parameter database on internal contamination in nuclear emergencies was established. Conclusions: The database may be very convenient for searching radionuclides and radionuclide parameter data discharged from different nuclear emergencies, which would be helpful to the monitoring and assessment of internal contamination in nuclear emergencies. (authors)
Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

Science.gov (United States)

Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

2018-04-06

Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.
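The merging step described in the record above (reference proteins from public databases combined with proteins from a metagenome assembly) can be pictured with a short Python sketch. It rests on several assumptions that are not in the abstract: both inputs are FASTA text, the taxonomy-guided selection of reference proteins has already happened, and exact-duplicate sequences are kept only once. The function names, the `ref`/`mg` tags and the toy sequences are hypothetical.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header to its sequence."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def merge_databases(reference, metagenome):
    """Merge two header->sequence dicts, keeping one entry per distinct sequence.

    Headers are prefixed with a source tag so that identical headers from the
    two sources cannot overwrite each other (an assumption of this sketch).
    """
    merged, seen = {}, set()
    for tag, source in (("ref", reference), ("mg", metagenome)):
        for header, seq in source.items():
            if seq not in seen:
                seen.add(seq)
                merged[f"{tag}|{header}"] = seq
    return merged


# Toy inputs (made-up sequences, for illustration only).
reference_fasta = ">P1\nMKTAYIAK\n>P2\nMSLLTEVE\n"
metagenome_fasta = ">contig1_orf1\nMSLLTEVE\n>contig2_orf1\nMNNQRKKT\n"

merged = merge_databases(parse_fasta(reference_fasta),
                         parse_fasta(metagenome_fasta))
print(sorted(merged))
```

In this toy case the metagenome entry that duplicates reference protein P2 is dropped, so the merged database holds three sequences.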
PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

Science.gov (United States)

Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

2016-01-01

Simple Sequence Repeats (SSRs), or microsatellites, are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from the genome sequence, 45 from the chloroplast sequence, 249 from the mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purposes at http://app.bioelm.com/ with a mapping tool which can develop circular maps of a selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
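The abstract above does not say which tool or thresholds were used to identify SSRs. As an illustration of the general idea only, the Python sketch below finds perfect tandem repeats with a regular expression; the minimum repeat counts per motif length are assumed values, and the function name and test sequence are hypothetical.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, repeat_count) tuples for perfect tandem repeats.

    `min_repeats` maps motif length to the minimum number of repeats;
    the defaults below are assumed thresholds, not those of the study.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    sequence = sequence.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n-1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(sequence):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" of "A").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


# Made-up test sequence: an (AT)7 repeat and an (A)12 run.
toy = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "C"
ssrs = find_ssrs(toy)
print(ssrs)
```

Start positions are 0-based. A production pipeline would also handle compound and imperfect repeats, which this sketch ignores.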
CERCLIS (Superfund) ASCII Text Format - CPAD Database

Data.gov (United States)

U.S. Environmental Protection Agency — The Comprehensive Environmental Response, Compensation and Liability Information System (CERCLIS) (Superfund) Public Access Database (CPAD) contains a selected set...
The starch-binding domain family CBM41 - an in silico analysis of evolutionary relationships

DEFF Research Database (Denmark)

Janeček, Štefan; Majzlová, Katarína; Svensson, Birte

2017-01-01

Within the CAZy database, there are 81 carbohydrate-binding module (CBM) families. A CBM represents a non-catalytic domain in a modular arrangement of glycoside hydrolases (GHs). The present in silico study has been focused on starch-binding domains from the family CBM41 that are usually part...
GenderMedDB: an interactive database of sex and gender-specific medical literature.

Science.gov (United States)

Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera

2014-01-01

Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registrants are enabled to upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.
ARTI Refrigerant Database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M. (Calm (James M.), Great Falls, VA (United States))

1993-04-30

The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.
ARTI refrigerant database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M.

1997-02-01

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.
Protected Areas Database for New Mexico

Data.gov (United States)

Earth Data Analysis Center, University of New Mexico — The Protected Areas Database of the United States (PAD-US) is a geodatabase, managed by USGS GAP, that illustrates and describes public land ownership, management...
Pacific Northwest Salmon Habitat Project Database

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — Across the Pacific Northwest, both public and private agents are working to improve riverine habitat for a...

TcruziDB, an Integrated Database, and the WWW Information Server for the Trypanosoma cruzi Genome Project

Directory of Open Access Journals (Sweden)

Degrave Wim

1997-01-01

Full Text Available Data analysis, presentation and distribution are of utmost importance to a genome project. A public domain software, ACeDB, has been chosen as the common basis for parasite genome databases, and a first release of TcruziDB, the Trypanosoma cruzi genome database, is available by ftp from ftp://iris.dbbm.fiocruz.br/pub/genomedb/TcruziDB as well as versions of the software for different operating systems (ftp://iris.dbbm.fiocruz.br/pub/unixsoft/). Moreover, data originating from the project are available from the WWW server at http://www.dbbm.fiocruz.br. It contains biological and parasitological data on CL Brener, its karyotype, all available T. cruzi sequences from Genbank, data on the EST-sequencing project and on available libraries, a T. cruzi codon table and a listing of activities and participating groups in the genome project, as well as meeting reports. T. cruzi discussion lists (tcruzi-l@iris.dbbm.fiocruz.br and tcgenics@iris.dbbm.fiocruz.br) are being maintained for communication and to promote collaboration in the genome project.
Assessing water availability over peninsular Malaysia using public domain satellite data products

International Nuclear Information System (INIS)

Ali, M I; Hashim, M; Zin, H S M

2014-01-01

Water availability monitoring is an essential task for water resource sustainability and security. In this paper, the assessment of a satellite remote sensing technique for determining water availability is reported. Water-balance analysis is used to compute the spatio-temporal water availability from two main inputs, precipitation and actual evapotranspiration rate (AET), both fully derived from public-domain satellite products of the Tropical Rainfall Measurement Mission (TRMM) and MODIS, respectively. Both satellite products were first calibrated against corresponding selected local precipitation and AET samples. Multi-temporal data sets acquired in 2000-2010 were used in this study. The results of the study indicated strong agreement of monthly water availability with the basin flow rate (r² = 0.5, p < 0.001). Similar agreement was also noted between the estimated annual average water availability and the in-situ measurements. It is therefore concluded that the method devised in this study provides a new alternative for water availability mapping over large areas, and hence offers the only timely and cost-effective method, apart from providing comprehensive spatio-temporal patterns, which are crucial in water resource planning to ensure water security.
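The abstract above names the two inputs of the water-balance analysis but does not give its formula. A minimal Python sketch, assuming the simplest form in which per-cell availability is precipitation minus AET (storage change and runoff are ignored, and the calibration step is not shown), could look as follows; the function name and the grid values are made up for illustration.

```python
def water_availability(precip_mm, aet_mm):
    """Return per-cell water availability as P - AET (assumed simple form).

    Both arguments are equally shaped 2-D lists of monthly totals in mm.
    """
    if len(precip_mm) != len(aet_mm) or any(
            len(p_row) != len(e_row)
            for p_row, e_row in zip(precip_mm, aet_mm)):
        raise ValueError("precipitation and AET grids must have the same shape")
    return [[p - e for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(precip_mm, aet_mm)]


# Hypothetical 2 x 2 grids of monthly precipitation and AET, in mm.
precip = [[250.0, 180.0], [90.0, 300.0]]
aet = [[110.0, 120.0], [100.0, 130.0]]

availability = water_availability(precip, aet)
print(availability)
```

A negative cell, as in the lower-left corner of this toy grid, indicates a month in which estimated evapotranspiration exceeded rainfall.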
The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility.

Science.gov (United States)

Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid

2014-03-01

National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The database is designed and implemented on a three-tier model using the MySQL relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow an autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful for improving research in human molecular genetics, disease diagnosis and the design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma.
SH3 domain tyrosine phosphorylation--sites, role and evolution.

Directory of Open Access Journals (Sweden)

Zuzana Tatárová

Full Text Available BACKGROUND: SH3 domains are eukaryotic protein domains that participate in a plethora of cellular processes including signal transduction, proliferation, and cellular movement. Several studies indicate that tyrosine phosphorylation could play a significant role in the regulation of SH3 domains. RESULTS: To explore the incidence of the tyrosine phosphorylation within SH3 domains we queried the PhosphoSite Plus database of phosphorylation sites. Over 100 tyrosine phosphorylations occurring on 20 different SH3 domain positions were identified. The tyrosine corresponding to c-Src Tyr-90 was by far the most frequently identified SH3 domain phosphorylation site. A comparison of sequences around this tyrosine led to delineation of a preferred sequence motif ALYD(Y/F). This motif is present in about 15% of human SH3 domains and is structurally well conserved. We further observed that tyrosine phosphorylation is more abundant than serine or threonine phosphorylation within SH3 domains and other adaptor domains, such as SH2 or WW domains. Tyrosine phosphorylation could represent an important regulatory mechanism of adaptor domains. CONCLUSIONS: While tyrosine phosphorylation typically promotes signaling protein interactions via SH2 or PTB domains, its role in SH3 domains is the opposite - it blocks or prevents interactions. The regulatory function of tyrosine phosphorylation is most likely achieved by the phosphate moiety and its charge interfering with binding of polyproline helices of SH3 domain interacting partners.
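The motif ALYD(Y/F) reported in the record above translates directly into a regular expression, which gives a quick way to scan sequences for it. The Python sketch below is illustrative only: the function name is hypothetical and the sequences are toy strings, not real SH3 domains.

```python
import re

# ALYD followed by either Y or F, as written in the abstract: ALYD(Y/F).
MOTIF = re.compile(r"ALYD[YF]")


def motif_positions(protein_sequence):
    """Return 1-based start positions of non-overlapping motif occurrences."""
    return [m.start() + 1 for m in MOTIF.finditer(protein_sequence.upper())]


# Toy sequences (made up for illustration).
examples = {
    "with_Y": "MGSTFVALYDYESRT",
    "with_F": "KKALYDFQQ",
    "no_motif": "KKALYDAQQ",
}

positions = {name: motif_positions(seq) for name, seq in examples.items()}
print(positions)
```

Because the motif cannot overlap with itself, the non-overlapping scan of `finditer` finds every occurrence.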
Monet: a next-generation database kernel for query-intensive applications

NARCIS (Netherlands)

P.A. Boncz (Peter)

2002-01-01

Monet is a database kernel targeted at query-intensive, heavy analysis applications (the opposite of transaction processing), which include OLAP and data mining, but also go beyond the business domain in GIS processing, multi-media retrieval and XML. The clean sheet approach of Monet
Public Use Airports, Geographic WGS84, BTS (2006) [public_use_airports_BTS_2006]

Data.gov (United States)

Louisiana Geographic Information Center — The Public Use Airports database is a geographic point database of aircraft landing facilities in the United States and U.S. Territories. Attribute data is provided...
CORE-Hom: a powerful and exhaustive database of clinical trials in homeopathy.

Science.gov (United States)

Clausen, Jürgen; Moss, Sian; Tournier, Alexander; Lüdtke, Rainer; Albrecht, Henning

2014-10-01

The CORE-Hom database was created to answer the need for a reliable and publicly available source of information in the field of clinical research in homeopathy. As of May 2014 it held 1048 entries of clinical trials, observational studies and surveys in the field of homeopathy, including second publications and re-analyses. 352 of the trials referenced in the database were published in peer reviewed journals, 198 of which were randomised controlled trials. The most often used remedies were Arnica montana (n = 103) and Traumeel® (n = 40). The most studied medical conditions were respiratory tract infections (n = 126) and traumatic injuries (n = 110). The aim of this article is to introduce the database to the public, describing and explaining the interface, features and content of the CORE-Hom database. Copyright © 2014 The Faculty of Homeopathy. Published by Elsevier Ltd. All rights reserved.
ARTI Refrigerant Database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M.

1992-11-09

The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.
Integrating heterogeneous databases in clustered medical care environments using object-oriented technology

Science.gov (United States)

Thakore, Arun K.; Sauer, Frank

1994-05-01

The organization of modern medical care environments into disease-related clusters, such as a cancer center, a diabetes clinic, etc., has the side-effect of introducing multiple heterogeneous databases, often containing similar information, within the same organization. This heterogeneity fosters incompatibility and prevents the effective sharing of data amongst applications at different sites. Although integration of heterogeneous databases is now feasible, in the medical arena this is often an ad hoc process, not founded on proven database technology or formal methods. In this paper we illustrate the use of a high-level object-oriented semantic association method to model information found in different databases into an integrated conceptual global model that integrates the databases. We provide examples from the medical domain to illustrate an integration approach resulting in a consistent global view, without attacking the autonomy of the underlying databases.
Databases applicable to quantitative hazard/risk assessment-Towards a predictive systems toxicology

International Nuclear Information System (INIS)

Waters, Michael; Jackson, Marcus

2008-01-01

The Workshop on The Power of Aggregated Toxicity Data addressed the requirement for distributed databases to support quantitative hazard and risk assessment. The authors have conceived and constructed with federal support several databases that have been used in hazard identification and risk assessment. The first of these databases, the EPA Gene-Tox Database was developed for the EPA Office of Toxic Substances by the Oak Ridge National Laboratory, and is currently hosted by the National Library of Medicine. This public resource is based on the collaborative evaluation, by government, academia, and industry, of short-term tests for the detection of mutagens and presumptive carcinogens. The two-phased evaluation process resulted in more than 50 peer-reviewed publications on test system performance and a qualitative database on thousands of chemicals. Subsequently, the graphic and quantitative EPA/IARC Genetic Activity Profile (GAP) Database was developed in collaboration with the International Agency for Research on Cancer (IARC). A chemical database driven by consideration of the lowest effective dose, GAP has served IARC for many years in support of hazard classification of potential human carcinogens. The Toxicological Activity Profile (TAP) prototype database was patterned after GAP and utilized acute, subchronic, and chronic data from the Office of Air Quality Planning and Standards. TAP demonstrated the flexibility of the GAP format for air toxics, water pollutants and other environmental agents. The GAP format was also applied to developmental toxicants and was modified to represent quantitative results from the rodent carcinogen bioassay. More recently, the authors have constructed: 1) the NIEHS Genetic Alterations in Cancer (GAC) Database which quantifies specific mutations found in cancers induced by environmental agents, and 2) the NIEHS Chemical Effects in Biological Systems (CEBS) Knowledgebase that integrates genomic and other biological data including
The Neotoma Paleoecology Database

Science.gov (United States)

Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

2015-12-01

The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and for developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software-interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provide real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out-of-date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community
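The distinction drawn above between a service that delivers an entire table and one that delivers the results of a view can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and an in-memory database; the table `sites`, the view `pollen_sites`, the sample rows and the helper `serve` are all hypothetical and are not taken from Neotoma's schema or web services.

```python
import sqlite3


def serve(conn, name):
    """Return every row of a table or view as a list of dicts (hypothetical service)."""
    # Accept only names that exist in the schema, so the name is safe to interpolate.
    allowed = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')")}
    if name not in allowed:
        raise KeyError(name)
    cur = conn.execute(f'SELECT * FROM "{name}"')
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (site_id INTEGER PRIMARY KEY, name TEXT, proxy TEXT);
    INSERT INTO sites VALUES (1, 'Lake A', 'pollen'),
                             (2, 'Cave B', 'vertebrate'),
                             (3, 'Bog C', 'pollen');
    -- A view exposes a filtered, reshaped subset without copying data.
    CREATE VIEW pollen_sites AS
        SELECT site_id, name FROM sites WHERE proxy = 'pollen';
""")

whole_table = serve(conn, "sites")        # every column of every row
view_rows = serve(conn, "pollen_sites")   # only what the view selects
```

Because the view is evaluated at request time, a client calling such a service always sees the current rows, which is the advantage over downloading a snapshot that the abstract describes.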
The HITRAN 2008 molecular spectroscopic database

International Nuclear Information System (INIS)

Rothman, L.S.; Gordon, I.E.; Barbe, A.; Benner, D.Chris; Bernath, P.F.; Birk, M.; Boudon, V.; Brown, L.R.; Campargue, A.; Champion, J.-P.; Chance, K.; Coudert, L.H.; Dana, V.; Devi, V.M.; Fally, S.; Flaud, J.-M.

2009-01-01

This paper describes the status of the 2008 edition of the HITRAN molecular spectroscopic database. The new edition is the first official public release since the 2004 edition, although a number of crucial updates had been made available online since 2004. The HITRAN compilation consists of several components that serve as input for radiative-transfer calculation codes: individual line parameters for the microwave through visible spectra of molecules in the gas phase; absorption cross-sections for molecules having dense spectral features, i.e. spectra in which the individual lines are not resolved; individual line parameters and absorption cross-sections for bands in the ultraviolet; refractive indices of aerosols, tables and files of general properties associated with the database; and database management software. The line-by-line portion of the database contains spectroscopic parameters for 42 molecules including many of their isotopologues.
The National Landslide Database of Great Britain: Acquisition, communication and the role of social media

Science.gov (United States)

Pennington, Catherine; Freeborough, Katy; Dashwood, Claire; Dijkstra, Tom; Lawrie, Kenneth

2015-11-01

The British Geological Survey (BGS) is the national geological agency for Great Britain that provides geoscientific information to government, other institutions and the public. The National Landslide Database has been developed by the BGS and is the focus for national geohazard research for landslides in Great Britain. The history and structure of the geospatial database and associated Geographical Information System (GIS) are explained, along with the future developments of the database and its applications. The database is the most extensive source of information on landslides in Great Britain with over 17,000 records of landslide events to date, each documented as fully as possible for inland, coastal and artificial slopes. Data are gathered through a range of procedures, including: incorporation of other databases; automated trawling of current and historical scientific literature and media reports; new field- and desk-based mapping technologies with digital data capture, and using citizen science through social media and other online resources. This information is invaluable for directing the investigation, prevention and mitigation of areas of unstable ground in accordance with Government planning policy guidelines. The national landslide susceptibility map (GeoSure) and a national landslide domains map currently under development, as well as regional mapping campaigns, rely heavily on the information contained within the landslide database. Assessing susceptibility to landsliding requires knowledge of the distribution of failures, an understanding of causative factors, their spatial distribution and likely impacts, whilst understanding the frequency and types of landsliding present is integral to modelling how rainfall will influence the stability of a region. Communication of landslide data through the Natural Hazard Partnership (NHP) and Hazard Impact Model contributes to national hazard mitigation and disaster risk reduction with respect to weather and
SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

Science.gov (United States)

Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

2014-08-15

Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.
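The step of converting a visual graph into a query can be sketched as follows. This is a minimal illustration, not SPARQLGraph's implementation: the function `graph_to_sparql`, the edge representation and the `http://example.org/` predicates are assumptions made for the example, and the resulting string is only built, not sent to any endpoint.

```python
def graph_to_sparql(edges, select_vars, limit=None):
    """Turn (subject, predicate, object) edges into a SPARQL SELECT query string."""
    # Each edge of the drawn graph becomes one triple pattern in the WHERE clause.
    patterns = [f"  {s} {p} {o} ." for s, p, o in edges]
    query = (
        "SELECT " + " ".join(select_vars) + "\n"
        "WHERE {\n" + "\n".join(patterns) + "\n}"
    )
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


# Hypothetical two-edge graph: protein -> gene -> chromosome.
edges = [
    ("?protein", "<http://example.org/encodedBy>", "?gene"),
    ("?gene", "<http://example.org/locatedOn>", "?chromosome"),
]
query = graph_to_sparql(edges, ["?protein", "?chromosome"], limit=10)
print(query)
```

Nodes shared between edges (here `?gene`) become shared variables, which is what turns a drawn graph into a join across triple patterns.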
The relevance of the IFPE Database to the modelling of WWER-type fuel behaviour

International Nuclear Information System (INIS)

Killeen, J.; Sartori, E.

2006-01-01

The aim of the International Fuel Performance Experimental Database (IFPE Database) is to provide, in the public domain, a comprehensive and well-qualified database on zircaloy-clad UO2 fuel for model development and code validation. The data encompass both normal and off-normal operation and include prototypic commercial irradiations as well as experiments performed in Material Testing Reactors. To date, the Database contains over 800 individual cases, providing data on fuel centreline temperatures, dimensional changes and FGR either from in-pile pressure measurements or PIE techniques, including puncturing, Electron Probe Micro Analysis (EPMA) and X-ray Fluorescence (XRF) measurements. This work in assembling and disseminating the Database is carried out in close co-operation and co-ordination between OECD/NEA and the IAEA. The majority of data sets are dedicated to fuel behaviour under LWR irradiation, and every effort has been made to obtain data representative of BWR, PWR and WWER conditions. In each case, the data set contains information on the pre-characterisation of the fuel, cladding and fuel rod geometry, the irradiation history presented in as much detail as the source documents allow, and finally any in-pile or PIE measurements that were made. The purpose of this paper is to highlight data that are relevant specifically to WWER application. To this end, the NEA and IAEA have been successful in obtaining appropriate data for both WWER-440 and WWER-1000-type reactors. These are: 1) Twelve (12) rods from the Finnish-Russian co-operative SOFIT programme; 2) Kola-3 WWER-440 irradiation; 3) MIR ramp tests on Kola-3 rods; 4) Zaporozskaya WWER-1000 irradiation; 5) Novovoronezh WWER-1000 irradiation. Before reviewing these data sets and their usefulness, the paper touches briefly on recent, more novel additions to the Database and on progress made in the use of the Database for the current IAEA FUMEX II Project. Finally, the paper describes the Computer
Radiological emergencies due to postulated events of melted radioactive material mixed in steel reaching public domain

International Nuclear Information System (INIS)

Meena, T.R.; Anoj Kumar; Patra, R.P.; Vikas; Patil, S.S.; Chatterjee, M.K.; Sharma, Ranjit; Murali, S.

2014-01-01

A national-level response mechanism has been developed at the emergency response centres of DAE (DAE-ERCs) at 22 different locations spread all over the country and at the National Disaster Response Forces with the National Disaster Management Authority (NDMA). ERCs are equipped with radiation monitors, radionuclide identifinders and Personnel Radiation Dosimeters (PRD) with monitoring capabilities of the order of tens of nGy/h (μR/hr) above the radiation background at any suspected location. Even if small amounts of radioactive material are smuggled and brought in some other form into the public domain, ERCs are capable of detecting, identifying and segregating the radioactive material from any inactive scrap. DAE-ERCs have demonstrated their capability in source search, detection, identification and recovery during the radiological emergency at Mayapuri, New Delhi.
Radiological emergencies due to postulated events of melted radioactive material mixed in steel reaching public domain

Energy Technology Data Exchange (ETDEWEB)

Meena, T. R.; Kumar, Anoj; Patra, R. P.; Vikas; Patil, S. S.; Chatterjee, M. K.; Sharma, Ranjit; Murali, S., E-mail: tejram@barc.gov.in [Radiation Safety Systems Division, Bhabha Atomic Research Centre, Mumbai (India)]

2014-07-01

A national-level response mechanism has been developed at the emergency response centres of DAE (DAE-ERCs) at 22 different locations spread all over the country and at the National Disaster Response Forces with the National Disaster Management Authority (NDMA). ERCs are equipped with radiation monitors, radionuclide identifinders and Personnel Radiation Dosimeters (PRD) with monitoring capabilities of the order of tens of nGy/h (μR/hr) above the radiation background at any suspected location. Even if small amounts of radioactive material are smuggled and brought in some other form into the public domain, ERCs are capable of detecting, identifying and segregating the radioactive material from any inactive scrap. DAE-ERCs have demonstrated their capability in source search, detection, identification and recovery during the radiological emergency at Mayapuri, New Delhi.
Application of wavelet transform for PDZ domain classification.

Directory of Open Access Journals (Sweden)

Khaled Daqrouq

Full Text Available PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes including common avian influenza, as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of the bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II and others, class III), a classification that depends directly on their binding partner. In this study, we report on three unique feature extraction approaches based on the bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequences to assist PDZ domain classification. Wavelet packet transform (WPT) and Shannon entropy, denoted wavelet entropy (WE), feature extraction methods were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence rearrangement based method. The recognition rate was 81.41% with the validation technique. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification.
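As a rough illustration of bigram-based features and Shannon entropy, the sketch below counts overlapping bigrams in a toy sequence and summarises the counts with an entropy value. Several things here are assumptions: "occurrence" is read as a count vector and "existence" as a binary presence vector, the sequence and bigram vocabulary are invented, and the wavelet packet transform step of the study is omitted entirely, so this is not the authors' pipeline.

```python
from collections import Counter
from math import log2


def bigram_counts(seq):
    """Count overlapping two-residue windows in an amino acid sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def occurrence_and_existence(seq, vocabulary):
    """Return (counts, 0/1 presence flags) for each bigram in the vocabulary."""
    counts = bigram_counts(seq)
    occurrence = [counts[b] for b in vocabulary]
    existence = [1 if counts[b] else 0 for b in vocabulary]
    return occurrence, existence


def shannon_entropy(values):
    """Shannon entropy (bits) of non-negative values normalised to probabilities."""
    total = sum(values)
    if total == 0:
        return 0.0
    return -sum((v / total) * log2(v / total) for v in values if v > 0)


toy_seq = "GLGLGF"                     # invented sequence, not a real PDZ domain
vocab = ["GL", "LG", "GF", "AA"]       # hypothetical bigram vocabulary
occurrence, existence = occurrence_and_existence(toy_seq, vocab)
entropy_bits = shannon_entropy(occurrence)
```

The two vectors differ only where a bigram repeats, which is why occurrence-based and existence-based features can lead to different recognition rates on the same sequences.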
Publication rates of public health theses in international and national peer-review journals in Turkey.

Science.gov (United States)

Sipahi, H; Durusoy, R; Ergin, I; Hassoy, H; Davas, A; Karababa, Ao

2012-01-01

A thesis is an important part of specialisation and doctorate education and requires intense work. The aim of this study was to investigate the publication rates of Turkish Public Health Doctorate Theses (PHDT) and Public Health Specialization Theses (PHST) in international and Turkish national peer-review journals and to analyze the distribution of research areas. A list of all theses up to 30 September 2009 was retrieved from the theses database of the Council of Higher Education of the Republic of Turkey. The publication rates of these theses were found by searching the PubMed, Science Citation Index-Expanded, Turkish Academic Network and Information Center (ULAKBIM) Turkish Medical Database, and Turkish Medline databases for the names of the thesis author and mentor. Theses published in journals indexed in either PubMed or SCI-E were considered international publications. Our search yielded a total of 538 theses (243 PHDT, 295 PHST). The overall publication rate was 18% in Turkish national journals and 11.9% in international journals. Overall, the most common research area was occupational health. Publication rates of Turkish PHDT and PHST are low. A better understanding of the factors affecting this publication rate is important for public health issues where national data are vital for designing better intervention programs and developing better public health policies.
Data Cleaning and Semantic Improvement in Biological Databases

Directory of Open Access Journals (Sweden)

Apiletti Daniele

2006-12-01

Full Text Available Public genomic and proteomic databases can be affected by a variety of errors. These errors may involve either the description or the meaning of data (namely, syntactic or semantic errors). We focus our analysis on the detection of semantic errors, in order to verify the accuracy of the stored information. In particular, we address the issue of data constraints and functional dependencies among attributes in a given relational database. Constraints and dependencies show semantics among attributes in a database schema and their knowledge may be exploited to improve data quality and integration in database design, and to perform query optimization and dimensional reduction.
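A functional dependency X -> Y holds when rows that agree on X also agree on Y, so rows that break it are candidates for semantic errors. The following is a minimal sketch of such a check, assuming records are plain dictionaries; the function name `fd_violations`, the attribute names and the sample rows are hypothetical and do not come from the paper.

```python
def fd_violations(rows, lhs, rhs):
    """List (lhs_value, first_seen_rhs, conflicting_rhs) where lhs -> rhs fails."""
    first_seen = {}
    violations = []
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key not in first_seen:
            # Remember the first right-hand side observed for this key.
            first_seen[key] = val
        elif first_seen[key] != val:
            # Same left-hand side, different right-hand side: dependency broken.
            violations.append((key, first_seen[key], val))
    return violations


# Invented records: gene_id should determine organism.
records = [
    {"gene_id": "G1", "organism": "E. coli"},
    {"gene_id": "G2", "organism": "H. sapiens"},
    {"gene_id": "G1", "organism": "E. colli"},   # likely a typo in the source data
]
violations = fd_violations(records, lhs=["gene_id"], rhs=["organism"])
```

A check like this only flags inconsistency; deciding which of the conflicting values is correct still needs domain knowledge or a trusted reference.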

ProOpDB: Prokaryotic Operon DataBase.

Science.gov (United States)

Taboada, Blanca; Ciria, Ricardo; Martinez-Guerrero, Cristian E; Merino, Enrique

2012-01-01

The Prokaryotic Operon DataBase (ProOpDB, http://operons.ibt.unam.mx/OperonPredictor) constitutes one of the most precise and complete repositories of operon predictions now available. Using our novel and highly accurate operon identification algorithm, we have predicted the operon structures of more than 1200 prokaryotic genomes. ProOpDB offers diverse alternatives by which a set of operon predictions can be retrieved including: (i) organism name, (ii) metabolic pathways, as defined by the KEGG database, (iii) gene orthology, as defined by the COG database, (iv) conserved protein domains, as defined by the Pfam database, (v) reference gene and (vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient method to select the most representative organisms based on a precompiled phylogenetic distances matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes.
DEDB: a database of Drosophila melanogaster exons in splicing graph form

Directory of Open Access Journals (Sweden)

Tan Tin

2004-12-01

Full Text Available Abstract Background: A wealth of quality genomic and mRNA/EST sequences in recent years has provided the data required for large-scale genome-wide analysis of alternative splicing. We have capitalized on this by constructing a database that contains alternative splicing information organized as splicing graphs, where all transcripts arising from a single gene are collected, organized and classified. The splicing graph then serves as the basis for the classification of the various types of alternative splicing events. Description: DEDB (http://proline.bic.nus.edu.sg/dedb/index.html) is a database of Drosophila melanogaster exons obtained from FlyBase, arranged in a splicing graph form that permits the creation of simple rules allowing for the classification of alternative splicing events. Pfam domains were also mapped onto the protein sequences, allowing users to assess the impact of alternative splicing events on domain organization. Conclusions: DEDB's catalogue of splicing graphs facilitates genome-wide classification of alternative splicing events for genome analysis. The splicing graph viewer brings together genome, transcript, protein and domain information to help biologists understand the implications of alternative splicing.
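The idea of a splicing graph with simple classification rules can be sketched in a few lines. In this illustration exons are nodes, an edge joins two exons that are adjacent in some transcript, and one rule detects single-exon skipping. The transcript names, exon labels and both functions are hypothetical; DEDB's actual data model and rule set are not described in enough detail here to reproduce.

```python
def build_splicing_graph(transcripts):
    """Map each exon to the set of exons that directly follow it in any transcript."""
    graph = {}
    for exons in transcripts.values():
        for exon in exons:
            graph.setdefault(exon, set())
        for upstream, downstream in zip(exons, exons[1:]):
            graph[upstream].add(downstream)
    return graph


def skipped_exons(graph):
    """Find (a, b, c) where a->b->c exists and a->c also exists, so b can be skipped."""
    events = set()
    for a, successors in graph.items():
        for b in successors:
            for c in graph[b]:
                if c in successors:
                    events.add((a, b, c))
    return sorted(events)


# Two invented transcripts of one gene; tx2 omits exon E2.
transcripts = {
    "tx1": ["E1", "E2", "E3", "E4"],
    "tx2": ["E1", "E3", "E4"],
}
graph = build_splicing_graph(transcripts)
events = skipped_exons(graph)
```

Other event types, such as alternative donor or acceptor sites, would need exon coordinates in addition to adjacency, which this sketch leaves out.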
JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles

DEFF Research Database (Denmark)

Portales-Casamar, Elodie; Thongjuea, Supat; Kwon, Andrew T

2009-01-01

JASPAR (http://jaspar.genereg.net) is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors (TFs) and other proteins interacting with DNA in a sequence-specific manner. Its fourth major release is the largest expansion of the core database...... to an active research community. As binding models are refined by newer data, the JASPAR database now uses versioning of matrices: in this release, 12% of the older models were updated to improved versions. Classification of TF families has been improved by adopting a new DNA-binding domain nomenclature...
The COMPADRE Plant Matrix Database

DEFF Research Database (Denmark)

2014-01-01

COMPADRE contains demographic information on hundreds of plant species. The data in COMPADRE are in the form of matrix population models and our goal is to make these publicly available to facilitate their use for research and teaching purposes. COMPADRE is an open-access database. We only request...
Conceptual design of nuclear power plants database system

International Nuclear Information System (INIS)

Ishikawa, Masaaki; Izumi, Fumio; Sudoh, Takashi.

1984-03-01

This report is the result of the joint study on the developments of the nuclear power plants database system. The present conceptual design of the database system, which includes Japanese character processing and image processing, has been made on the data of safety design parameters mainly found in the application documents for reactor construction permit made available to the public. (author)
Comprehensive T-Matrix Reference Database: A 2012 - 2013 Update

Science.gov (United States)

Mishchenko, Michael I.; Videen, Gorden; Khlebtsov, Nikolai G.; Wriedt, Thomas

2013-01-01

The T-matrix method is one of the most versatile, efficient, and accurate theoretical techniques widely used for numerically exact computer calculations of electromagnetic scattering by single and composite particles, discrete random media, and particles imbedded in complex environments. This paper presents the fifth update to the comprehensive database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2012. It also lists several earlier publications not incorporated in the original database, including Peter Waterman's reports from the 1960s illustrating the history of the T-matrix approach and demonstrating that John Fikioris and Peter Waterman were the true pioneers of the multi-sphere method otherwise known as the generalized Lorenz - Mie theory.
[Family of ribosomal proteins S1 contains unique conservative domain].

Science.gov (United States)

Deriusheva, E I; Machulin, A V; Selivanova, O M; Serdiuk, I N

2010-01-01

Different bacteria have different numbers of amino acid residues in their ribosomal S1 proteins. This number varies from 111 (Spiroplasma kunkelii) to 863 a.a. (Treponema pallidum). Traditionally, and because no three-dimensional structure of this protein is available, its architecture is represented as repeating S1 domains. The number of these domains depends on the protein's length. Data on the number of domains and their boundaries are contained in specialized databases such as SMART, Pfam and PROSITE. However, for the same object these data may differ considerably. To determine the number of domains and their boundaries, a new approach based on the analysis of predicted secondary structure (PsiPred) was used. This approach allowed us to reveal structural domains in the amino acid sequences of S1 proteins, and their number varied from one to six. Alignment of S1 proteins containing different numbers of domains with the S1 RNA-binding domain of Escherichia coli PNPase revealed that, in the family of ribosomal S1 proteins, one domain has maximal homology with the S1 domain from PNPase. This conserved domain migrates along the polypeptide chain and is located, in proteins containing different numbers of domains, according to a specific pattern. In this domain, as in the S1 domain from PNPase, residues Phe-19, Phe-22, His-34, Asp-64 and Arg-68 are clustered on the surface and form the RNA-binding site.
USING THE INTERNATIONAL SCIENTOMETRIC DATABASES OF OPEN ACCESS IN SCIENTIFIC RESEARCH

Directory of Open Access Journals (Sweden)

O. Galchevska

2015-05-01

Full Text Available This article considers the use of international scientometric databases in research activities as web-oriented resources and services for publishing and disseminating research results. It highlights criteria for selecting open-access scientometric platforms for scientific research (coverage of Ukrainian scientific periodicals and publications, data accuracy, general characteristics of the international scientometric database, technical and functional characteristics, and their indexes). The most popular open-access scientometric databases are reviewed: Google Scholar, Russian Scientific Citation Index (RSCI), Scholarometer, Index Copernicus (IC) and Microsoft Academic Search. The advantages of using the international scientometric database Google Scholar in scientific research are determined, along with prospects for further research, which lie in the separation of the system's cloud information and analytical services.
A perspective for biomedical data integration: Design of databases for flow cytometry

Directory of Open Access Journals (Sweden)

Lakoumentas John

2008-02-01

Full Text Available Abstract Background: The integration of biomedical information is essential for tackling medical problems. We describe a data model in the domain of flow cytometry (FC) allowing for massive management, analysis and integration with other laboratory and clinical information. The paper is concerned with the proper translation of the Flow Cytometry Standard (FCS) into a relational database schema, in a way that helps end users either do research on FC or study specific cases of patients who have undergone FC analysis. Results: The proposed database schema provides integration of data originating from diverse acquisition settings, organized in a way that allows syntactically simple queries that provide results significantly faster than the conventional implementations of the FCS standard. The proposed schema can potentially achieve up to 8 orders of magnitude reduction in query complexity and up to 2 orders of magnitude reduction in response time for data originating from flow cytometers that record 256 colours. This is mainly achieved by maintaining an almost constant number of data-mining procedures regardless of the size and complexity of the stored information. Conclusion: It is evident that using single-file data storage standards for the design of databases without any structural transformations significantly limits the flexibility of databases. Analysis of the requirements of a specific domain for integration and massive data processing can provide the necessary schema modifications that will unlock the additional functionality of a relational database.
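To make the idea of "syntactically simple queries" over cytometry data concrete, here is a minimal sketch of one possible relational layout, using Python's built-in sqlite3 module. It is an assumption made for illustration and not the schema proposed in the paper: the tables `sample`, `parameter` and `event_value`, the patient identifier and all measurement values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per acquired sample, one per measured parameter (channel),
    -- and one per (event, parameter) measurement in long format.
    CREATE TABLE sample (sample_id INTEGER PRIMARY KEY, patient TEXT);
    CREATE TABLE parameter (param_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE event_value (
        sample_id INTEGER REFERENCES sample(sample_id),
        event_no  INTEGER,
        param_id  INTEGER REFERENCES parameter(param_id),
        value     REAL,
        PRIMARY KEY (sample_id, event_no, param_id)
    );
""")
conn.executemany("INSERT INTO sample VALUES (?, ?)", [(1, "P001")])
conn.executemany("INSERT INTO parameter VALUES (?, ?)", [(1, "FSC"), (2, "CD4")])
conn.executemany(
    "INSERT INTO event_value VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 100.0), (1, 1, 2, 5.0), (1, 2, 1, 200.0), (1, 2, 2, 15.0)],
)

# The same query text works however many parameters the cytometer records.
mean_cd4 = conn.execute(
    """
    SELECT AVG(v.value)
    FROM event_value AS v
    JOIN parameter AS p ON p.param_id = v.param_id
    JOIN sample AS s ON s.sample_id = v.sample_id
    WHERE s.patient = ? AND p.name = ?
    """,
    ("P001", "CD4"),
).fetchone()[0]
```

Storing parameters as rows rather than as per-file columns is what keeps the query unchanged when instruments with more channels are added, at the cost of a larger measurement table.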
Conservation patterns of HIV-1 RT connection and RNase H domains: identification of new mutations in NRTI-treated patients.

Directory of Open Access Journals (Sweden)

André F A Santos

Full Text Available BACKGROUND: Although extensive HIV drug resistance information is available for the first 400 amino acids of its reverse transcriptase, the impact of antiretroviral treatment in C-terminal domains of Pol (thumb, connection and RNase H) is poorly understood. METHODS AND FINDINGS: We wanted to characterize conserved regions in RT C-terminal domains among HIV-1 group M subtypes and CRF. Additionally, we wished to identify NRTI-related mutations in HIV-1 RT C-terminal domains. We sequenced 118 RNase H domains from clinical viral isolates in Brazil, and analyzed 510 thumb and connection domain and 450 RNase H domain sequences collected from public HIV sequence databases, together with their treatment status and histories. Drug-naïve and NRTI-treated datasets were compared for intra- and inter-group conservation, and differences were determined using Fisher's exact tests. One third of RT C-terminal residues were found to be conserved among group M variants. Three mutations were found exclusively in NRTI-treated isolates. Nine mutations in the connection domain and 6 mutations in the RNase H domain were associated with NRTI treatment in subtype B. Some of them lie in or close to amino acid residues that contact nucleic acid, or near the RNase H active site. Several of the residues pointed out herein have recently been associated with NRTI exposure or with increased resistance to NRTIs. CONCLUSIONS: This is the first comprehensive genotypic analysis of a large sequence dataset that describes NRTI-related mutations in HIV-1 RT C-terminal domains in vivo. The findings on the conservation of RT C-terminal domains may pave the way to more rational drug design initiatives targeting those regions.
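Fisher's exact test, as used above to compare drug-naïve and NRTI-treated datasets, can be written out for a single 2x2 table with the standard library alone. The sketch below computes the two-sided p-value by summing the hypergeometric probabilities of all tables no more likely than the observed one. The counts are invented for illustration and are not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = NRTI-treated / drug-naive,
# columns = mutation present / mutation absent.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

When many residues are tested this way, as in a scan across a whole domain, the resulting p-values would normally also need a multiple-testing correction, which the sketch does not attempt.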
Databases of the marine metagenomics

KAUST Repository

Mineta, Katsuhiko

2015-10-28

Metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. Compared with the conventional amplicon-based approach to metagenomics, the more recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data and increases data complexity. Moreover, when the metagenomic approach is used to monitor changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at enormous speed and volume. Because this situation is becoming a reality at many marine research institutions and stations all over the world, data management and analysis will clearly be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge can be extracted from a vast amount of data. In this review, we outline all the major marine metagenome databases that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that only six metagenome databases include marine metagenome data, an unexpectedly small number. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database and for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database.
Mining Bug Databases for Unidentified Software Vulnerabilities

Energy Technology Data Exchange (ETDEWEB)

Dumidu Wijayasekara; Milos Manic; Jason Wright; Miles McQueen

2012-06-01

Identifying software vulnerabilities is becoming more important as critical and sensitive systems increasingly rely on complex software systems. It has been suggested in previous work that some bugs are only identified as vulnerabilities long after the bug has been made public. These vulnerabilities are known as hidden impact vulnerabilities. This paper discusses the feasibility and necessity to mine common publicly available bug databases for vulnerabilities that are yet to be identified. We present bug database analysis of two well known and frequently used software packages, namely Linux kernel and MySQL. It is shown that for both Linux and MySQL, a significant portion of vulnerabilities that were discovered for the time period from January 2006 to April 2011 were hidden impact vulnerabilities. It is also shown that the percentage of hidden impact vulnerabilities has increased in the last two years, for both software packages. We then propose an improved hidden impact vulnerability identification methodology based on text mining bug databases, and conclude by discussing a few potential problems faced by such a classifier.
Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us Trypanosomes Database Database Description General information of database Database name Trypanosomes Database...stitute of Genetics Research Organization of Information and Systems Yata 1111, Mishima, Shizuoka 411-8540, JAPAN E mail: Database...y Name: Trypanosoma Taxonomy ID: 5690 Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description The... Article title: Author name(s): Journal: External Links: Original website information Database maintenance s...DB (Protein Data Bank) KEGG PATHWAY Database DrugPort Entry list Available Query search Available Web servic
A high-energy nuclear database proposal

International Nuclear Information System (INIS)

Brown, D.A.; Vogt, R.; UC Davis, CA

2006-01-01

We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from the Bevalac, AGS and SPS to RHIC and LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews. (author)
An Entropy Approach to Disclosure Risk Assessment: Lessons from Real Applications and Simulated Domains

Science.gov (United States)

Airoldi, Edoardo M.; Bai, Xue; Malin, Bradley A.

2011-01-01

We live in an increasingly mobile world, which leads to the duplication of information across domains. Though organizations attempt to obscure the identities of their constituents when sharing information for worthwhile purposes, such as basic research, the uncoordinated nature of such environments can lead to privacy vulnerabilities. For instance, disparate healthcare providers can collect information on the same patient. Federal policy requires that such providers share “de-identified” sensitive data, such as biomedical (e.g., clinical and genomic) records. But at the same time, such providers can share identified information, devoid of sensitive biomedical data, for administrative functions. On a provider-by-provider basis, the biomedical and identified records appear unrelated; however, links can be established when multiple providers’ databases are studied jointly. The problem, known as trail disclosure, is a generalized phenomenon and occurs because an individual’s location access pattern can be matched across the shared databases. Due to technical and legal constraints, it is often difficult to coordinate between providers, and thus it is critical to assess the disclosure risk in distributed environments so that we can develop techniques to mitigate such risks. Research on privacy protection has so far focused on developing technologies to suppress or encrypt identifiers associated with sensitive information. There is a growing body of work on the formal assessment of the disclosure risk of database entries in publicly shared databases, but less attention has been paid to the distributed setting. In this research, we review the trail disclosure problem in several domains with known vulnerabilities and show that disclosure risk is influenced by the distribution of how people visit service providers. Based on empirical evidence, we propose an entropy metric for assessing such risk in shared databases prior to their release. This metric assesses risk by ...
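To make the idea of an entropy measure over a visit distribution concrete, the sketch below computes the Shannon entropy of how visits are spread across providers. This is only an assumed, generic formulation: the abstract is truncated before it defines the metric, so the function name, the input format and the use of plain Shannon entropy are illustrative choices, not the authors' definition.

```python
from collections import Counter
from math import log2

def visit_entropy(visits):
    """Shannon entropy (in bits) of a list of provider labels, one per visit.

    0.0 means every visit went to a single provider; the value grows as
    visits are spread more evenly over more providers.
    """
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

For example, four visits spread over four distinct providers give 2 bits, while four visits to one provider give 0 bits. How such a value maps onto disclosure risk is the subject of the paper and is not asserted here.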
Implementing GraphQL as a Query Language for Deductive Databases in SWI-Prolog Using DCGs, Quasi Quotations, and Dicts

Directory of Open Access Journals (Sweden)

Falco Nogatz

2017-01-01

Full Text Available The methods to access large relational databases in a distributed system are well established: the relational query language SQL often serves as a language for data access and manipulation, and in addition public interfaces are exposed using communication protocols like REST. Similar to REST, GraphQL is an application-layer query protocol developed by Facebook. It provides a unified interface between the client and the server for data fetching and manipulation. Using GraphQL's type system, it is possible to specify data handling of various sources and to combine, e.g., relational with NoSQL databases. In contrast to REST, GraphQL provides a single API endpoint and supports flexible queries over linked data. GraphQL can also be used as an interface for deductive databases. In this paper, we give an introduction to GraphQL and a comparison to REST. Using language features recently added to SWI-Prolog 7, we have developed the Prolog library GraphQL.pl, which implements the GraphQL type system and query syntax as a domain-specific language with the help of definite clause grammars (DCGs), quasi quotations, and dicts. Using our library, the type system created for a deductive database can be validated, while the query system provides a unified interface for data access and introspection.
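The contrast between a single GraphQL endpoint and several REST resources can be sketched from the client side. The example below only builds request data and performs no network calls; the schema fields (`person`, `friends`, `name`), the REST paths and the base URL are hypothetical, and the code is written in Python rather than with the Prolog library described in the paper.

```python
import json

# One GraphQL document asks for a person and their linked friends at once.
QUERY = """
query PersonWithFriends($id: ID!) {
  person(id: $id) {
    name
    friends { name }
  }
}
"""

def build_graphql_request(query, variables):
    """Return the JSON body commonly POSTed to a single GraphQL endpoint."""
    return json.dumps({"query": query, "variables": variables})

def rest_urls_for_same_data(base, person_id):
    """Return the separate REST resources an assumed API would need instead."""
    return [f"{base}/persons/{person_id}",
            f"{base}/persons/{person_id}/friends"]
```

With these assumptions, `build_graphql_request(QUERY, {"id": "42"})` yields one request body, whereas the REST variant needs one request per URL returned by `rest_urls_for_same_data`.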
Application of new type of distributed multimedia databases to networked electronic museum

Science.gov (United States)

Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki

1999-01-01

Recently, various kinds of multimedia application systems have actively been developed based on the achievement of advanced high-speed communication networks, computer processing technologies, and digital contents-handling technologies. Against this background, this paper proposes a new distributed multimedia database system which can effectively perform a new function of cooperative retrieval among distributed databases. The proposed system introduces a new concept of 'Retrieval manager' which functions as an intelligent controller so that the user can recognize a set of distributed databases as one logical database. The logical database dynamically generates and performs a preferred combination of retrieving parameters on the basis of both directory data and the system environment. Moreover, a concept of 'domain' is defined in the system as a managing unit of retrieval. The retrieval can effectively be performed by cooperation of processing among multiple domains. Communication language and protocols are also defined in the system. These are used in every action for communications in the system. A language interpreter in each machine translates the communication language into an internal language used in that machine. With the language interpreter, internal processing modules such as the DBMS and user interface modules can be freely selected. A concept of 'content-set' is also introduced. A content-set is defined as a package of contents. Contents in the content-set are related to each other. The system handles a content-set as one object. The user terminal can effectively control the displaying of retrieved contents, referring to data indicating the relation of the contents in the content-set. In order to verify the function of the proposed system, a networked electronic museum was experimentally built. The results of this experiment indicate that the proposed system can effectively retrieve the objective contents under the control to a number of distributed ...
Rationale and uses of a public HIV drug-resistance database.

Science.gov (United States)

Shafer, Robert W

2006-09-15

Knowledge regarding the drug resistance of human immunodeficiency virus (HIV) is critical for surveillance of drug resistance, development of antiretroviral drugs, and management of infections with drug-resistant viruses. Such knowledge is derived from studies that correlate genetic variation in the targets of therapy with the antiretroviral treatments received by persons from whom the variant was obtained (genotype-treatment), with drug-susceptibility data on genetic variants (genotype-phenotype), and with virological and clinical response to a new treatment regimen (genotype-outcome). An HIV drug-resistance database is required to represent, store, and analyze the diverse forms of data underlying our knowledge of drug resistance and to make these data available to the broad community of researchers studying drug resistance in HIV and clinicians using HIV drug-resistance tests. Such genotype-treatment, genotype-phenotype, and genotype-outcome correlations are contained in the Stanford HIV RT and Protease Sequence Database and have specific usefulness.
Web Syndication Approaches for Sharing Primary Data in "Small Science" Domains

Directory of Open Access Journals (Sweden)

Eric C Kansa

2010-06-01

Full Text Available In some areas of science, sophisticated web services and semantics underlie "cyberinfrastructure". However, in "small science" domains, especially in field sciences such as archaeology, conservation, and public health, datasets often resist standardization. Publishing data in the small sciences should embrace this diversity rather than attempt to corral research into "universal" (domain) standards. A growing ecosystem of increasingly powerful Web syndication-based approaches for sharing data on the public Web can offer a viable approach. Atom Feed-based services can be used with scientific collections to identify and create linkages across different datasets, even across disciplinary boundaries without shared domain standards.
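How an Atom feed can carry linkages between datasets may be easier to see with a small parser. The sketch below reads a made-up feed and collects, for each entry, the links marked `rel="related"`; the feed content, the identifiers and the choice of the `related` relation are assumptions for illustration and are not taken from the article.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed: two archaeological finds, one linked to another dataset.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical excavation finds</title>
  <entry>
    <title>Bone fragment 17</title>
    <id>urn:example:find:17</id>
    <link rel="alternate" href="https://example.org/finds/17"/>
    <link rel="related" href="https://example.org/zoology/specimen/9"/>
  </entry>
  <entry>
    <title>Pottery sherd 3</title>
    <id>urn:example:find:3</id>
    <link rel="alternate" href="https://example.org/finds/3"/>
  </entry>
</feed>"""

def related_links(feed_xml):
    """Map each entry id to the hrefs of its rel="related" links."""
    root = ET.fromstring(feed_xml)
    result = {}
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", namespaces=ATOM_NS)
        result[entry_id] = [link.get("href")
                            for link in entry.findall("atom:link", ATOM_NS)
                            if link.get("rel") == "related"]
    return result
```

Because the feed only needs entry identifiers and links, two collections can reference each other this way without agreeing on a shared domain schema.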
Large-scale Health Information Database and Privacy Protection.

Science.gov (United States)

Yamamoto, Ryuichi

2016-09-01

Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients' medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy

A database application for the Naval Command Physical Readiness Testing Program

OpenAIRE

Quinones, Frances M.

1998-01-01

Approved for public release; distribution is unlimited. IT21 envisions a Navy with standardized, state-of-the-art computer systems. Based on this vision, Naval database management systems will also need to become standardized among Naval commands. Today most commercial off the shelf (COTS) database management systems provide a graphical user interface. Among the many Naval database systems currently in use, the Navy's Physical Readiness Program database has continued to exist at the command leve...
NCEI Standard Product: World Ocean Database (WOD)

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — The World Ocean Database (WOD) is the world's largest publicly available uniform format quality controlled ocean profile dataset. Ocean profile data are sets of...
Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us SKIP Stemcell Database Database Description General information of database Database name SKIP Stemcell Database...rsity Journal Search: Contact address http://www.skip.med.keio.ac.jp/en/contact/ Database classification Human Genes and Diseases Dat...abase classification Stemcell Article Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database...ks: Original website information Database maintenance site Center for Medical Genetics, School of medicine, ...lable Web services Not available URL of Web services - Need for user registration Not available About This Database Database
Database Description - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us Arabidopsis Phenome Database Database Description General information of database Database n... BioResource Center Hiroshi Masuya Database classification Plant databases - Arabidopsis thaliana Organism T...axonomy Name: Arabidopsis thaliana Taxonomy ID: 3702 Database description The Arabidopsis thaliana phenome i...heir effective application. We developed the new Arabidopsis Phenome Database integrating two novel database...seful materials for their experimental research. The other, the “Database of Curated Plant Phenome” focusing
Advances in the design, development, and deployment of the U.S. Army Research Laboratory (ARL) multimodal signatures database

Science.gov (United States)

Bennett, Kelly; Robertson, James

2011-06-01

Recent advances in the design, development, and deployment of U.S. Army Research Laboratory's (ARL) Multimodal Signature Database (MMSDB) create a state-of-the-art database system with Web-based access through a Web interface designed specifically for research and development. Tens of thousands of signatures are currently available for researchers to support their algorithm development and refinement for sensors and other security systems. Each dataset is stored in Hierarchical Data Format 5 (HDF5) format for easy modeling and storing of signatures and archived sensor data, ground truth, calibration information, algorithms, and other documentation. Archived HDF5-formatted data provide the basis for computational interoperability across a variety of tools including MATLAB, Octave, and Python. The database has a Web-based front-end with public and restricted access interfaces, along with 24/7 availability and support. This paper describes the overall design of the system, and the recent enhancements and future vision, including the ability for researchers to share algorithms, data, and documentation in the cloud, and providing an ability to run algorithms and software for testing and evaluation purposes remotely across multiple domains and computational tools. The paper will also describe in detail the HDF5 format for several multimodal sensor types.
Metagenome Analysis of Protein Domain Collocation within Cellulase Genes of Goat Rumen Microbes

Directory of Open Access Journals (Sweden)

SooYeon Lim

2013-08-01

Full Text Available In this study, protein domains with cellulase activity in goat rumen microbes were investigated using metagenomic and bioinformatic analyses. After the complete genome of goat rumen microbes was obtained using a shotgun sequencing method, 217,892,109 pair reads were filtered, including only those with 70% identity, 100-bp matches, and thresholds below E−10 using METAIDBA. These filtered contigs were assembled and annotated using blastN against the NCBI nucleotide database. As a result, a microbial community structure with 1431 species was analyzed, among which Prevotella ruminicola 23 bacteria and Butyrivibrio proteoclasticus B316 were the dominant groups. In parallel, 201 sequences related to cellulase activity (EC 3.2.1.4) were obtained through blast searches using the enzyme.dat file provided by the NCBI database. After translating the nucleotide sequence into a protein sequence using Interproscan, 28 protein domains with cellulase activity were identified using the HMMER package with threshold E values below 10^-5. Cellulase activity protein domain profiling showed that the major protein domains such as lipase GDSL, cellulase, and Glyco hydro 10 were present in bacterial species with strong cellulase activities. Furthermore, correlation plots clearly displayed the strong positive correlation between some protein domain groups, which was indicative of microbial adaptation in the goat rumen based on feeding habits. This is the first metagenomic analysis of cellulase activity protein domains using bioinformatics from the goat rumen.
Proposal for a High Energy Nuclear Database

International Nuclear Information System (INIS)

Brown, David A.; Vogt, Ramona

2005-01-01

We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac and AGS to RHIC to CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews
A performance study on the synchronisation of heterogeneous Grid databases using CONStanza

CERN Document Server

Pucciani, G; Domenici, Andrea; Stockinger, Heinz

2010-01-01

In Grid environments, several heterogeneous database management systems are used in various administrative domains. However, data exchange and synchronisation need to be available across different sites and different database systems. In this article we present our data consistency service CONStanza and give details on how we achieve relaxed update synchronisation between different database implementations. The integration in existing Grid environments is one of the major goals of the system. Performance tests have been executed following a factorial approach. Detailed experimental results and a statistical analysis are presented to evaluate the system components and drive future developments. (C) 2010 Elsevier B.V. All rights reserved.
Risoe Publication Activities in 1997

International Nuclear Information System (INIS)

Alvi, Hanne; Bennov, Solvejg

1998-08-01

Risoe's publication and lecture activities in the last decades are presented through data of total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, and institutions with which Risoe has published the largest number of articles. The data are derived from Risoe's in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. The largest part of the report contains a list of references to the scientific and technical journal articles, books, reports, lectures, and to publications for a broader readership authored by researchers at Risoe National Laboratory during the year 1997. The references are organised according to the programme areas of Risoe. (au)
Low dose CT image restoration using a database of image patches

Science.gov (United States)

Ha, Sungsoo; Mueller, Klaus

2015-01-01

Reducing the radiation dose in CT imaging has become an active research topic and many solutions have been proposed to remove the significant noise and streak artifacts in the reconstructed images. Most of these methods operate within the domain of the image that is subject to restoration. This, however, poses limitations on the extent of filtering possible. We advocate taking into consideration the vast body of external knowledge that exists in the domain of already acquired medical CT images, since after all, this is what radiologists do when they examine these low-quality images. We can incorporate this knowledge by creating a database of prior scans, either of the same patient or a diverse corpus of different patients, to assist in the restoration process. Our paper follows up on our previous work that used a database of images. Using images, however, is challenging since it requires tedious and error-prone registration and alignment. Our new method eliminates these problems by storing a diverse set of small image patches in conjunction with a localized similarity matching scheme. We also empirically show that it is sufficient to store these patches without anatomical tags since their statistics are sufficiently strong to yield good similarity matches from the database and, as a direct effect, produce image restorations of high quality. A final experiment demonstrates that our global database approach can recover image features that are difficult to preserve with conventional denoising approaches.
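The basic idea of looking up a clean patch that resembles a noisy one can be sketched with a brute-force search. This is a deliberately simplified stand-in that assumes flattened patches and a sum-of-squared-differences score; it is not the localized similarity matching scheme of the paper, and the function names and values are hypothetical.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized, flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(patch_a, patch_b))

def best_match(noisy_patch, patch_db):
    """Return the database patch closest to noisy_patch under SSD."""
    return min(patch_db, key=lambda clean: ssd(noisy_patch, clean))
```

For instance, with a toy database of three 2x2 patches stored as 4-tuples, `best_match((9, 11, 10, 8), [(0, 0, 0, 0), (10, 10, 10, 10), (0, 10, 0, 10)])` selects the uniform patch of tens. A practical system would need an index rather than a linear scan over every stored patch.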
Effective elements of school health promotion across behavioral domains: a systematic review of reviews

Directory of Open Access Journals (Sweden)

Peters Louk WH

2009-06-01

Full Text Available Abstract Background: Most school health education programs focus on a single behavioral domain. Integrative programs that address multiple behaviors may be more efficient, but only if the elements of change are similar for these behaviors. The objective of this study was to examine which effective elements of school health education are similar across three particular behavioral domains. Methods: A systematic review of reviews of the effectiveness of school-based health promotion programs was conducted for the domains of substance abuse, sexual behavior, and nutrition. The literature search spanned the time period between 1995 and October 2006 and included three databases, websites of review centers and backward search. Fifty-five reviews and meta-analyses met predetermined relevance and publication criteria and were included. Data were extracted by one reviewer and checked by a second reviewer. A standardized data extraction form was used, with detailed attention to effective elements pertaining to program goals, development, content, methods, facilitator, components and intensity. Two assessors rated the quality of reviews as strong, moderate or weak. We included only strong and moderate reviews in two types of analysis: one based on interpretation of conflicting results, the other on a specific vote-counting rule. Results: Thirty-six reviews were rated strong, 6 moderate, and 13 weak. A multitude of effective elements was identified in the included reviews and many elements were similar for two or more domains. In both types of analysis, five elements with evidence from strong reviews were found to be similar for all three domains: use of theory; addressing social influences, especially social norms; addressing cognitive-behavioral skills; training of facilitators; and multiple components. Two additional elements had positive results in all domains with the rule-based method of analysis, but had inconclusive results in at least one domain with ...
The Effects of Normalisation of the Satisfaction of Novice End-User Querying Databases

Directory of Open Access Journals (Sweden)

Conrad Benedict

1997-05-01

Full Text Available This paper reports the results of an experiment that investigated the effects that different structural characteristics of relational databases have on the information satisfaction of end-users querying databases. The results show that unnormalised tables adversely affect end-user satisfaction. The adverse effect on end-user satisfaction is attributable primarily to the use of non-atomic data. In this study, the effect of repeating fields on end-user satisfaction was not significant. The study contributes to the further development of theories of individual adjustment to information technology in the workplace by alerting organisations and, in particular, database designers to the ways in which the structural characteristics of relational databases may affect end-user satisfaction. More importantly, the results suggest that database designers need to clearly identify the domains for each item appearing in their databases. These issues are of increasing importance because of the growth in the amount of data available to end-users in relational databases.
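Why non-atomic data can make querying harder for a novice may be seen in a small example. The sketch below stores the same facts twice, once with a comma-separated `skills` column and once with one skill per row; the tables, names and values are hypothetical and are not the materials used in the experiment.

```python
import sqlite3

def build_demo_db():
    """Create the same made-up data in unnormalised and normalised form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Unnormalised: several values packed into one column (non-atomic).
        CREATE TABLE staff_flat (name TEXT, skills TEXT);
        INSERT INTO staff_flat VALUES ('Ann', 'SQL,Java'), ('Bob', 'JavaScript');
        -- Normalised: one atomic value per row.
        CREATE TABLE staff_skill (name TEXT, skill TEXT);
        INSERT INTO staff_skill VALUES
            ('Ann', 'SQL'), ('Ann', 'Java'), ('Bob', 'JavaScript');
    """)
    return conn

def java_staff_flat(conn):
    """Pattern match on the packed column; also matches 'JavaScript'."""
    rows = conn.execute(
        "SELECT name FROM staff_flat WHERE skills LIKE '%Java%' ORDER BY name")
    return [r[0] for r in rows]

def java_staff_normalised(conn):
    """Plain equality on an atomic column."""
    rows = conn.execute(
        "SELECT name FROM staff_skill WHERE skill = 'Java' ORDER BY name")
    return [r[0] for r in rows]
```

On this data the pattern match against the packed column returns both Ann and Bob, while the equality test on the normalised table returns only Ann, which illustrates the kind of pitfall a non-atomic column introduces.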
CHID: a unique health information and education database.

OpenAIRE

Lunin, L F; Stein, R S

1987-01-01

The public's growing interest in health information and the health professions' increasing need to locate health education materials can be answered in part by the new Combined Health Information Database (CHID). This unique database focuses on materials and programs in professional and patient education, general health education, and community risk reduction. Accessible through BRS, CHID suggests sources for procuring brochures, pamphlets, articles, and films on community services, programs ...
The Brainomics/Localizer database.

Science.gov (United States)

Papadopoulos Orfanos, Dimitri; Michel, Vincent; Schwartz, Yannick; Pinel, Philippe; Moreno, Antonio; Le Bihan, Denis; Frouin, Vincent

2017-01-01

The Brainomics/Localizer database exposes part of the data collected by the in-house Localizer project, which planned to acquire four types of data from volunteer research subjects: anatomical MRI scans, functional MRI data, behavioral and demographic data, and DNA sampling. Over the years, this local project has been collecting such data from hundreds of subjects. We had selected 94 of these subjects for their complete datasets, including all four types of data, as the basis for a prior publication; the Brainomics/Localizer database publishes the data associated with these 94 subjects. Since regulatory rules prevent us from making genetic data available for download, the database serves only anatomical MRI scans, functional MRI data, behavioral and demographic data. To publish this set of heterogeneous data, we use dedicated software based on the open-source CubicWeb semantic web framework. Through genericity in the data model and flexibility in the display of data (web pages, CSV, JSON, XML), CubicWeb helps us expose these complex datasets in original and efficient ways. Copyright © 2015 Elsevier Inc. All rights reserved.
A user-friendly phytoremediation database: creating the searchable database, the users, and the broader implications.

Science.gov (United States)

Famulari, Stevie; Witz, Kyla

2015-01-01

Designers, students, teachers, gardeners, farmers, landscape architects, architects, engineers, homeowners, and others have uses for the practice of phytoremediation. This research looks at the creation of a phytoremediation database which is designed for ease of use for a non-scientific user, as well as for students in an educational setting ( http://www.steviefamulari.net/phytoremediation ). During 2012, Environmental Artist & Professor of Landscape Architecture Stevie Famulari, with assistance from Kyla Witz, a landscape architecture student, created an online searchable database designed for high public accessibility. The database is a record of research of plant species that aid in the uptake of contaminants, including metals, organic materials, biodiesels & oils, and radionuclides. The database consists of multiple interconnected indexes categorized into common and scientific plant name, contaminant name, and contaminant type. It includes photographs, hardiness zones, specific plant qualities, full citations to the original research, and other relevant information intended to aid those designing with phytoremediation search for potential plants which may be used to address their site's need. The objective of the terminology section is to remove uncertainty for more inexperienced users, and to clarify terms for a more user-friendly experience. Implications of the work, including education and ease of browsing, as well as use of the database in teaching, are discussed.
Toward designing for trust in database automation

Energy Technology Data Exchange (ETDEWEB)

Duez, P. P.; Jamieson, G. A. [Cognitive Engineering Laboratory, Univ. of Toronto, 5 King's College Rd., Toronto, Ont. M5S 3G8 (Canada)]

2006-07-01

Appropriate reliance on system automation is imperative for safe and productive work, especially in safety-critical systems. It is unsafe to rely on automation beyond its designed use; conversely, it can be both unproductive and unsafe to manually perform tasks that are better relegated to automated tools. Operator trust in automated tools mediates reliance, and trust appears to affect how operators use technology. As automated agents become more complex, the question of trust in automation is increasingly important. In order to achieve proper use of automation, we must engender an appropriate degree of trust that is sensitive to changes in operating functions and context. In this paper, we present research concerning trust in automation in the domain of automated tools for relational databases. Lee and See have provided models of trust in automation. One model developed by Lee and See identifies three key categories of information about the automation that lie along a continuum of attributional abstraction. Purpose-, process- and performance-related information serve, both individually and through inferences between them, to describe automation in such a way as to engender properly calibrated trust. Thus, one can look at information from different levels of attributional abstraction as a general requirements analysis for information key to appropriate trust in automation. The model of information necessary to engender appropriate trust in automation [1] is a general one. Although it describes categories of information, it does not provide insight on how to determine the specific information elements required for a given automated tool. We have applied the Abstraction Hierarchy (AH) to this problem in the domain of relational databases. The AH serves as a formal description of the automation at several levels of abstraction, ranging from a very abstract purpose-oriented description to a more concrete description of the resources involved in the automated process
Toward designing for trust in database automation

International Nuclear Information System (INIS)

Duez, P. P.; Jamieson, G. A.

2006-01-01

Appropriate reliance on system automation is imperative for safe and productive work, especially in safety-critical systems. It is unsafe to rely on automation beyond its designed use; conversely, it can be both unproductive and unsafe to manually perform tasks that are better relegated to automated tools. Operator trust in automated tools mediates reliance, and trust appears to affect how operators use technology. As automated agents become more complex, the question of trust in automation is increasingly important. In order to achieve proper use of automation, we must engender an appropriate degree of trust that is sensitive to changes in operating functions and context. In this paper, we present research concerning trust in automation in the domain of automated tools for relational databases. Lee and See have provided models of trust in automation. One model developed by Lee and See identifies three key categories of information about the automation that lie along a continuum of attributional abstraction. Purpose-, process- and performance-related information serve, both individually and through inferences between them, to describe automation in such a way as to engender properly calibrated trust. Thus, one can look at information from different levels of attributional abstraction as a general requirements analysis for information key to appropriate trust in automation. The model of information necessary to engender appropriate trust in automation [1] is a general one. Although it describes categories of information, it does not provide insight on how to determine the specific information elements required for a given automated tool. We have applied the Abstraction Hierarchy (AH) to this problem in the domain of relational databases. The AH serves as a formal description of the automation at several levels of abstraction, ranging from a very abstract purpose-oriented description to a more concrete description of the resources involved in the automated process
Building a genome database using an object-oriented approach.

Science.gov (United States)

Barbasiewicz, Anna; Liu, Lin; Lang, B Franz; Burger, Gertraud

2002-01-01

GOBASE is a relational database that integrates data associated with mitochondria and chloroplasts. The most important data in GOBASE, i.e., molecular sequences and taxonomic information, are obtained from the public sequence data repository at the National Center for Biotechnology Information (NCBI), and are validated by our experts. Maintaining a curated genomic database comes with a towering labor cost, due to the sheer volume of available genomic sequences and the plethora of annotation errors and omissions in records retrieved from public repositories. Here we describe our approach to increase automation of the database population process, thereby reducing manual intervention. As a first step, we used Unified Modeling Language (UML) to construct a list of potential errors. Each case was evaluated independently, and an expert solution was devised, and represented as a diagram. Subsequently, the UML diagrams were used as templates for writing object-oriented automation programs in the Java programming language.
Research Outputs of England's Hospital Episode Statistics (HES) Database: Bibliometric Analysis.

Science.gov (United States)

Chaudhry, Zain; Mannan, Fahmida; Gibson-White, Angela; Syed, Usama; Ahmed, Shirin; Majeed, Azeem

2017-12-06

Hospital administrative data, such as those provided by the Hospital Episode Statistics (HES) database in England, are increasingly being used for research and quality improvement. To date, no study has tried to quantify and examine trends in the use of HES for research purposes. To examine trends in the use of HES data for research. Publications generated from the use of HES data were extracted from PubMed and analysed. Publications from 1996 to 2014 were then examined further in the Science Citation Index (SCI) of the Thompson Scientific Institute for Science Information (Web of Science) for details of research specialty area. 520 studies, categorised into 44 specialty areas, were extracted from PubMed. The review showed an increase in publications over the 18-year period with an average of 27 publications per year, however with the majority of output observed in the latter part of the study period. The highest number of publications was in the Health Statistics specialty area. The use of HES data for research is becoming more common. Increase in publications over time shows that researchers are beginning to take advantage of the potential of HES data. Although HES is a valuable database, concerns exist over the accuracy and completeness of the data entered. Clinicians need to be more engaged with HES for the full potential of this database to be harnessed.
Audio stream classification for multimedia database search

Science.gov (United States)

Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

2013-03-01

Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for non-expert human users to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled by domain experts into the three classes defined above.

Research reactor records in the INIS database

International Nuclear Information System (INIS)

Marinkovic, N.

2001-01-01

This report presents a statistical analysis of more than 13,000 records of publications concerned with research and technology in the field of research and experimental reactors which are included in the INIS Bibliographic Database for the period from 1970 to 2001. The main objectives of this bibliometric study were: to make an inventory of research reactor related records in the INIS Database; to provide statistics and scientific indicators for the INIS users, namely science managers, researchers, engineers, operators, scientific editors and publishers, decision-makers in the field of research reactors related subjects; to extract other useful information from the INIS Bibliographic Database about articles published in research reactors research and technology. (author)
An online database of nuclear electromagnetic moments

International Nuclear Information System (INIS)

Mertzimekis, T.J.; Stamou, K.; Psaltis, A.

2016-01-01

Measurements of nuclear magnetic dipole and electric quadrupole moments are considered quite important for the understanding of nuclear structure both near and far from the valley of stability. The recent advent of radioactive beams has resulted in a plethora of new, continuously flowing, experimental data on nuclear structure – including nuclear moments – which hinders the information management. A new, dedicated, public and user friendly online database (http://magneticmoments.info) has been created comprising experimental data of nuclear electromagnetic moments. The present database supersedes existing printed compilations, including also non-evaluated series of data and relevant meta-data, while putting strong emphasis on bimonthly updates. The scope, features and extensions of the database are reported.
Databases for neurogenetics: introduction, overview, and challenges.

Science.gov (United States)

Sobrido, María-Jesús; Cacheiro, Pilar; Carracedo, Angel; Bertram, Lars

2012-09-01

The importance for research and clinical utility of mutation databases, as well as the issues and difficulties entailed in their construction, is discussed within the Human Variome Project. While general principles and standards can apply to most human diseases, some specific questions arise when dealing with the nature of genetic neurological disorders. So far, publicly accessible mutation databases exist for only about half of the genes causing neurogenetic disorders, and considerable work is clearly still needed to optimize their content. The current landscape, main challenges, some potential solutions, and future perspectives on genetic databases for disorders of the nervous system are reviewed in this special issue of Human Mutation on neurogenetics. © 2012 Wiley Periodicals, Inc.
JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO

Science.gov (United States)

Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.

2017-12-01

Introduction: The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As a common property of human society, the JAMSTEC archive is open to public users with scientific/educational purposes [1]. To publicize its data and samples online, JAMSTEC operates the NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples and cruise/dive observations were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. The maintenance costs of multiple hardware and software systems were also an obstacle to sustainable services and continuous improvements. Database Integration: In 2017, GANSEKI, COEDO and DARWIN were integrated into DARWIN+ [3]. The update also included implementation of a map-search function as a substitute for the closed portal site. Major functions of the previous systems were incorporated into the new system; users can perform complex searches by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of a variety of additional data types. Data Management: Since the DARWIN major update, the JAMSTEC data & sample team has been dealing with minor issues of individual sample data/metadata, which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are becoming available online. Geochemical data of sediment core samples will supposedly be added in the near future. Reference: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/
Understanding the productive author who published papers in medicine using National Health Insurance Database: A systematic review and meta-analysis.

Science.gov (United States)

Chien, Tsair-Wei; Chang, Yu; Wang, Hsien-Yi

2018-02-01

Many researchers used National Health Insurance database to publish medical papers which are often retrospective, population-based, and cohort studies. However, the author's research domain and academic characteristics are still unclear. By searching the PubMed database (Pubmed.com), we used the keyword of [Taiwan] and [National Health Insurance Research Database], then downloaded 2913 articles published from 1995 to 2017. Social network analysis (SNA), Gini coefficient, and Google Maps were applied to gather these data for visualizing: the most productive author; the pattern of coauthor collaboration teams; and the author's research domain denoted by abstract keywords and Pubmed MESH (medical subject heading) terms. Utilizing the 2913 papers from Taiwan's National Health Insurance database, we chose the top 10 research teams shown on Google Maps and analyzed one author (Dr. Kao) who published 149 papers in the database in 2015. In the past 15 years, we found Dr. Kao had 2987 connections with other coauthors from 13 research teams. The cooccurrence abstract keywords with the highest frequency are cohort study and National Health Insurance Research Database. The most coexistent MESH terms are tomography, X-ray computed, and positron-emission tomography. The strength of the author research distinct domain is very low (Gini < 0.40). SNA incorporated with Google Maps and Gini coefficient provides insight into the relationships between entities. The results obtained in this study can be applied for a comprehensive understanding of other productive authors in the field of academics.
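The abstract above reports a Gini coefficient below 0.40 but does not say how it was computed. As a rough illustration only, the sketch below uses the standard definition of the Gini coefficient for a list of non-negative counts (0 means perfectly even, values near 1 mean concentrated in a few items). The function name `gini` and the sample counts are assumptions made for this example and are not taken from the study.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (standard definition).

    Uses the sorted-rank form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x is sorted ascending and i runs from 1 to n.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive count")
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical keyword frequencies, invented for illustration.
even_spread = [1, 1, 1, 1]      # every keyword equally frequent
concentrated = [0, 0, 0, 4]     # one keyword carries everything

print(gini(even_spread))    # 0.0
print(gini(concentrated))   # 0.75
```

With n items the maximum attainable value is (n - 1) / n, which is why the fully concentrated four-item case gives 0.75 rather than 1.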
Plant databases and data analysis tools

Science.gov (United States)

It is anticipated that the coming years will see the generation of large datasets including diagnostic markers in several plant species with emphasis on crop plants. To use these datasets effectively in any plant breeding program, it is essential to have the information available via public database...
The Danish Inguinal Hernia Database

Directory of Open Access Journals (Sweden)

Friis-Andersen H

2016-10-01

Full Text Available Hans Friis-Andersen1,2, Thue Bisgaard2,3 1Surgical Department, Horsens Regional Hospital, Horsens, Denmark; 2Steering Committee, Danish Hernia Database, 3Surgical Gastroenterological Department 235, Copenhagen University Hospital, Hvidovre, Denmark Aim of database: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Study population: Patients ≥18 years operated for groin hernia. Main variables: Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. Descriptive data: According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3 minute registration time). All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. Results: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). Conclusion: The Danish Inguinal Hernia Database is fully active monitoring surgical quality and contributes to the national and international surgical society to improve outcome after groin hernia repair. Keywords: nation-wide, recurrence, chronic pain, femoral hernia, surgery, quality improvement
The human genome as public: Justifications and implications.

Science.gov (United States)

Bayefsky, Michelle J

2017-03-01

Since the human genome was decoded, great emphasis has been placed on the unique, personal nature of the genome, along with the benefits that personalized medicine can bring to individuals and the importance of safeguarding genetic privacy. As a result, an equally important aspect of the human genome - its common nature - has been underappreciated and underrepresented in the ethics literature and policy dialogue surrounding genetics and genomics. This article will argue that, just as the personal nature of the genome has been used to reinforce individual rights and justify important privacy protections, so too the common nature of the genome can be employed to support protections of the genome at a population level and policies designed to promote the public's wellbeing. In order for public health officials to have the authority to develop genetics policies for the sake of the public good, the genome must have not only a common, but also a public, dimension. This article contends that DNA carries a public dimension through the use of two conceptual frameworks: the common heritage (CH) framework and the common resource (CR) framework. Both frameworks establish a public interest in the human genome, but the CH framework can be used to justify policies aimed at preserving and protecting the genome, while the CR framework can be employed to justify policies for utilizing the genome for the public benefit. A variety of possible policy implications are discussed, with special attention paid to the use of large-scale genomics databases for public health research. © Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
Open Geoscience Database

Science.gov (United States)

Bashev, A.

2012-04-01

Currently there is an enormous number of geoscience databases. Unfortunately, the only users of the majority of these databases are their developers. There are several reasons for that: incompatibility, specificity of tasks and objects, and so on. However, the main obstacles to wide usage of geoscience databases are complexity for developers and complication for users. The complexity of the architecture leads to high costs that block public access. The complication prevents users from understanding when and how to use the database. Only databases associated with GoogleMaps do not have these drawbacks, but they could hardly be called "geoscience". Nevertheless, an open and simple geoscience database is necessary, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface to work with it, and it is now accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) at a station with a certain position, associated with metadata: the date when the result was obtained; the type of station (lake, soil etc); the contributor that sent the result. Each contributor has its own profile, which allows the reliability of the data to be estimated. The results can be represented on a GoogleMaps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales and each registered user can create their own scale. The results can also be extracted as a *.csv file. For both types of representation one can select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Latitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data
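The upload format listed in the abstract above (station name, latitude, longitude, station type, parameter type, parameter value, date) can be sketched as a small parser. This is only an illustration under stated assumptions: the abstract does not say which delimiter the *.csv files use, so a semicolon is assumed here because the field list is written with semicolons, and the names `Observation` and `parse_upload`, as well as the sample row, are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class Observation:
    """One uploaded result, in the field order given in the abstract."""
    station: str
    latitude: float
    longitude: float
    station_type: str
    parameter: str
    value: float
    observed_on: date


def parse_upload(text, delimiter=";"):
    """Parse upload rows; the semicolon delimiter is an assumption."""
    observations = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        # Exactly seven fields are expected; anything else raises ValueError.
        name, lat, lon, station_type, parameter, value, day = (
            cell.strip() for cell in row
        )
        latitude, longitude = float(lat), float(lon)
        if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
            raise ValueError(f"coordinates out of range for station {name!r}")
        observations.append(
            Observation(
                station=name,
                latitude=latitude,
                longitude=longitude,
                station_type=station_type,
                parameter=parameter,
                value=float(value),
                observed_on=date.fromisoformat(day),  # yyyy-mm-dd
            )
        )
    return observations


# Invented sample row, for illustration only.
sample = "Lake A; 55.751244; 037.618423; lake; pH; 7.2; 2012-03-15\n"
for obs in parse_upload(sample):
    print(obs)
```

Validating coordinates and dates at upload time is one simple way to keep each value tied to a usable position, which is the minimal requirement the abstract describes.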
Simple re-instantiation of small databases using cloud computing.

Science.gov (United States)

Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

2013-01-01

Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.
45 CFR 1356.80 - Scope of the National Youth in Transition Database.

Science.gov (United States)

2010-10-01

... 45 Public Welfare 4 2010-10-01 2010-10-01 false Scope of the National Youth in Transition Database... REQUIREMENTS APPLICABLE TO TITLE IV-E § 1356.80 Scope of the National Youth in Transition Database. The requirements of the National Youth in Transition Database (NYTD) §§ 1356.81 through 1356.86 of this part apply...
ARTI refrigerant database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M. [Calm (James M.), Great Falls, VA (United States)

1998-08-01

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.
BioWarehouse: a bioinformatics database warehouse toolkit.

Science.gov (United States)

Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

2006-03-23

This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
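The kind of cross-database question described in the abstract above (what fraction of enzyme activities have no sequence) becomes a single SQL query once the sources share one schema. The sketch below is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for the MySQL and Oracle managers named in the abstract, and the table names `enzyme_activity` and `protein_sequence`, their columns, and the toy rows are hypothetical and do not reflect the actual BioWarehouse schema or its data.

```python
import sqlite3


def fraction_without_sequence(conn):
    """Fraction of EC numbers with no associated sequence (toy schema)."""
    row = conn.execute(
        """
        SELECT AVG(CASE WHEN s.ec_number IS NULL THEN 1.0 ELSE 0.0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return row[0]


# In-memory toy warehouse; every row here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
    CREATE TABLE protein_sequence (id INTEGER PRIMARY KEY, ec_number TEXT);
    INSERT INTO enzyme_activity VALUES
        ('1.1.1.1'), ('2.7.1.1'), ('3.4.21.4'), ('4.2.1.1');
    INSERT INTO protein_sequence (ec_number) VALUES
        ('1.1.1.1'), ('1.1.1.1'), ('4.2.1.1'), ('2.7.1.1');
    """
)

print(fraction_without_sequence(conn))  # 0.25 for this toy data
```

The `LEFT JOIN` against distinct EC numbers keeps each enzyme activity counted once even when several sequences map to it, so the average is a true per-activity fraction.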
The Danish Inguinal Hernia database.

Science.gov (United States)

Friis-Andersen, Hans; Bisgaard, Thue

2016-01-01

To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Patients ≥18 years operated for groin hernia. Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3 minute registration time). All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). The Danish Inguinal Hernia Database is fully active monitoring surgical quality and contributes to the national and international surgical society to improve outcome after groin hernia repair.
Prototyping visual interface for maintenance and supply databases

OpenAIRE

Fore, Henry Ray

1989-01-01

Approved for public release; distribution is unlimited. This research examined the feasibility of providing a visual interface to standard Army Management Information Systems at the unit level. The potential of improving the Human-Machine Interface of unit level maintenance and supply software, such as ULLS (Unit Level Logistics System), is very attractive. A prototype was implemented in GLAD (Graphics Language for Database). GLAD is a graphics object-oriented environment for databases t...
The Danish ventral hernia database

DEFF Research Database (Denmark)

Helgstrand, Frederik; Jorgensen, Lars Nannestad

2016-01-01

Aim: The Danish Ventral Hernia Database (DVHD) provides national surveillance of current surgical practice and clinical postoperative outcomes. The intention is to reduce postoperative morbidity and hernia recurrence, evaluate new treatment strategies, and facilitate nationwide implementation of ... to the surgical repair are recorded. Data registration is mandatory. Data may be merged with other Danish health registries and information from patient questionnaires or clinical examinations. Descriptive data: More than 37,000 operations have been registered. Data have demonstrated high agreement with patient... of operations and is an excellent tool for observing changes over time, including adjustment of several confounders. This national database registry has impacted on clinical practice in Denmark and led to a high number of scientific publications in recent years.
Opening of energy markets: consequences on the missions of public utility and of security of supplies in the domain of electric power and gas

International Nuclear Information System (INIS)

2001-01-01

This conference was jointly organized by the International Energy Agency (IEA) and the French ministry of economy, finances, and industry (general direction of energy and raw materials, DGEMP). It was organized in 6 sessions dealing with: 1 - the public utility in the domain of energy: definition of the public utility missions, experience feedback about liberalized markets, public utility obligation and pricing regulation; 2 - the new US energy policy and the lessons learnt from the California crisis; 3 - the security of electric power supplies: concepts of security of supplies, opinion of operators, security of power supplies versus liberalization and investments; 4 - security of gas supplies: markets liberalization and investments, long-term contracts and security of supplies; 5 - debate: how to integrate the objectives of public utility and of security of supplies in a competing market; 6 - conclusions. This document brings together the available talks and transparencies presented at the conference. (J.S.)
Theoretical domains framework to assess barriers to change for planning health care quality interventions: a systematic literature review.

Science.gov (United States)

Mosavianpour, Mirkaber; Sarmast, Hamideh Helen; Kissoon, Niranjan; Collet, Jean-Paul

2016-01-01

Theoretical domains framework (TDF) provides an integrative model for assessing barriers to behavioral changes in order to suggest interventions for improvement in behavior and ultimately outcomes. However, there are other tools that are used to assess barriers. The objective of this study is to determine the degree of concordance between domains and constructs identified in two versions of the TDF including original (2005) and refined version (2012) and independent studies of other tools. We searched six databases for articles that studied barriers to health-related behavior changes of health care professionals or the general public. We reviewed quantitative papers published in English which included their questionnaires in the article. A table including the TDF domains of both original and refined versions and related constructs was developed to serve as a reference to describe the barriers assessed in the independent studies; descriptive statistics were used to express the results. Out of 552 papers retrieved, 50 were eligible to review. The barrier domains explored in these articles belonged to two to eleven domains of the refined TDF. Eighteen articles (36%) used constructs outside of the refined version. The spectrum of barrier constructs of the original TDF was broader and could meet the domains studied in 48 studies (96%). Barriers in domains of "environmental context and resources", "beliefs about consequences", and "social influences" were the most frequently explored in 42 (84%), 37 (74%), and 33 (66%) of the 50 articles, respectively. Both refined and original TDFs cataloged barriers measured by the other studies that did not use TDF as their framework. However, the original version of TDF explored a broader spectrum of barriers than the refined version. From this perspective, the original version of the TDF seems to be a more comprehensive tool for assessing barriers in practice.
Atlantic Canada's energy research and development website and database

International Nuclear Information System (INIS)

2005-01-01

Petroleum Research Atlantic Canada maintains a website devoted to energy research and development in Atlantic Canada. The site can be viewed on the world wide web at www.energyresearch.ca. It includes a searchable database with information about researchers in Nova Scotia, their projects and published materials on issues related to hydrocarbons, alternative energy technologies, energy efficiency, climate change, environmental impacts and policy. The website also includes links to research funding agencies, external related databases and related energy organizations around the world. Nova Scotia-based users are invited to submit their academic, private or public research to the site. Before being uploaded into the database, a site administrator reviews and processes all new information. Users are asked to identify their areas of interest according to the following research categories: alternative or renewable energy technologies; climate change; coal; computer applications; economics; energy efficiency; environmental impacts; geology; geomatics; geophysics; health and safety; human factors; hydrocarbons; meteorology and oceanology (metocean) activities; petroleum operations in deep and shallow waters; policy; and power generation and supply. The database can be searched 5 ways according to topic, researchers, publication, projects or funding agency. refs., tabs., figs
The Mouse SAGE Site: database of public mouse SAGE libraries

Czech Academy of Sciences Publication Activity Database

Divina, Petr; Forejt, Jiří

2004-01-01

Vol. 32 (2004), pp. D482-D483. ISSN 0305-1048. R&D Projects: GA MŠk LN00A079; GA ČR GV204/98/K015. Grant - others: HHMI(US) 555000306. Institutional research plan: CEZ:AV0Z5052915. Keywords: mouse SAGE libraries; web-based database. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 7.260, year: 2004

Database Constraints Applied to Metabolic Pathway Reconstruction Tools

Directory of Open Access Journals (Sweden)

Jordi Vilaplana

2014-01-01

Full Text Available Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.
Database constraints applied to metabolic pathway reconstruction tools.

Science.gov (United States)

Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

2014-01-01

Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.
Carbon dynamics of mature and regrowth tropical forests derived from a pantropical database (TropForC-db).

Science.gov (United States)

Anderson-Teixeira, Kristina J; Wang, Maria M H; McGarvey, Jennifer C; LeBauer, David S

2016-05-01

Tropical forests play a critical role in the global carbon (C) cycle, storing ~45% of terrestrial C and constituting the largest component of the terrestrial C sink. Despite their central importance to the global C cycle, their ecosystem-level C cycles are not as well-characterized as those of extra-tropical forests, and knowledge gaps hamper efforts to quantify C budgets across the tropics and to model tropical forest-climate interactions. To advance understanding of C dynamics of pantropical forests, we compiled a new database, the Tropical Forest C database (TropForC-db), which contains data on ground-based measurements of ecosystem-level C stocks and annual fluxes along with disturbance history. This database currently contains 3568 records from 845 plots in 178 geographically distinct areas, making it the largest and most comprehensive database of its type. Using TropForC-db, we characterized C stocks and fluxes for young, intermediate-aged, and mature forests. Relative to existing C budgets of extra-tropical forests, mature tropical broadleaf evergreen forests had substantially higher gross primary productivity (GPP) and ecosystem respiration (Reco), their autotrophic respiration (Ra) consumed a larger proportion (~67%) of GPP, and their woody stem growth (ANPPstem) represented a smaller proportion of net primary productivity (NPP, ~32%) or GPP (~9%). In regrowth stands, aboveground biomass increased rapidly during the first 20 years following stand-clearing disturbance, with slower accumulation following agriculture and in deciduous forests, and continued to accumulate at a slower pace in forests aged 20-100 years. Most other C stocks likewise increased with stand age, while potential to describe age trends in C fluxes was generally data-limited. We expect that TropForC-db will prove useful for model evaluation and for quantifying the contribution of forests to the global C cycle. The database version associated with this publication is archived in Dryad (DOI
Using Online Databases in Corporate Issues Management.

Science.gov (United States)

Thomsen, Steven R.

1995-01-01

Finds that corporate public relations practitioners felt they were able, using online database and information services, to intercept issues earlier in the "issue cycle" and thus enable their organizations to develop more "proactionary" or "catalytic" issues management response strategies. (SR)
Large-scale Health Information Database and Privacy Protection

Science.gov (United States)

YAMAMOTO, Ryuichi

2016-01-01

Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients' medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy.
Proposal for a high-energy nuclear database

International Nuclear Information System (INIS)

Brown, D.A.; Vogt, R.

2006-01-01

We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac, AGS and SPS to RHIC and LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews. (author)
Proposal for a High Energy Nuclear Database

International Nuclear Information System (INIS)

Brown, D A; Vogt, R

2005-01-01

The authors propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac, AGS and SPS to RHIC and CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, they propose periodically performing evaluations of the data and summarizing the results in topical reviews
Integrated olfactory receptor and microarray gene expression databases

Directory of Open Access Journals (Sweden)

Crasto Chiquito J

2007-06-01

Full Text Available Abstract. Background: Gene expression patterns of olfactory receptors (ORs) are an important component of the signal encoding mechanism in the olfactory system since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD) to house OR gene expression data. ORMD is integrated with the Olfactory Receptor Database (ORDB), which is a key repository of OR gene information. Both databases aim to aid experimental research related to olfaction. Description: ORMD is a Web-accessible database that provides a secure data repository for OR microarray experiments. It contains both publicly available and private data; accessing the latter requires authenticated login. The ORMD is designed to allow users to not only deposit gene expression data but also manage their projects/experiments. For example, contributors can choose whether to make their datasets public. For each experiment, users can download the raw data files and view and export the gene expression data. For each OR gene being probed in a microarray experiment, a hyperlink to that gene in ORDB provides access to genomic and proteomic information related to the corresponding olfactory receptor. Individual ORs archived in ORDB are also linked to ORMD, allowing users access to the related microarray gene expression data. Conclusion: ORMD serves as a data repository and project management system. It facilitates the study of microarray experiments of gene expression in the olfactory system. In conjunction with ORDB, ORMD integrates gene expression data with the genomic and functional data of ORs, and is thus a useful resource for both olfactory researchers and the public.
The Danish Fetal Medicine Database

Directory of Open Access Journals (Sweden)

Ekelund CK

2016-10-01

Full Text Available Charlotte Kvist Ekelund,1 Tine Iskov Kopp,2 Ann Tabor,1 Olav Bjørn Petersen3 1Department of Obstetrics, Center of Fetal Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; 2Registry Support Centre (East) – Epidemiology and Biostatistics, Research Centre for Prevention and Health, Glostrup, Denmark; 3Fetal Medicine Unit, Aarhus University Hospital, Aarhus Nord, Denmark. Aim: The aim of this study is to set up a database in order to monitor the detection rates and false-positive rates of first-trimester screening for chromosomal abnormalities and prenatal detection rates of fetal malformations in Denmark. Study population: Pregnant women with a first or second trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units' Astraia databases to the central database via web service. Information about outcome of pregnancy (miscarriage, termination, live birth, or stillbirth) is received from the National Patient Register and National Birth Register and linked via the Danish unique personal registration number. Furthermore, results of all pre- and postnatal chromosome analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database is valuable to assess the performance at a regional level and to compare Danish performance with international results at a national level. Keywords: prenatal screening, nuchal translucency, fetal malformations, chromosomal abnormalities
Development of a Tsunami Scenario Database for Marmara Sea

Science.gov (United States)

Ozer Sozdinler, Ceren; Necmioglu, Ocal; Meral Ozel, Nurcan

2016-04-01

Due to the very short travel times in Marmara Sea, a Tsunami Early Warning System (TEWS) has to be strongly coupled with the earthquake early warning system and should be supported with a pre-computed tsunami scenario database to be queried in near real-time based on the initial earthquake parameters. To address this problem, 30 different composite earthquake scenarios with maximum credible Mw values based on 32 fault segments have been identified to produce a detailed scenario database for all possible earthquakes in the Marmara Sea with a tsunamigenic potential. The bathy/topo data of Marmara Sea was prepared using GEBCO and ASTER data, bathymetric measurements along Bosphorus, Istanbul and Dardanelle, Canakkale and the coastline digitized from satellite images. The coarser domain in 90m-grid size was divided into 11 sub-regions having 30m-grid size in order to increase the data resolution and precision of the calculation results. The analyses were performed in nested domains with numerical model NAMIDANCE using non-linear shallow water equations. In order to cover all the residential areas, industrial facilities and touristic locations, more than 1000 numerical gauge points were selected along the coasts of Marmara Sea, which are located at water depth of 5 to 10m in finer domain. The distributions of tsunami hydrodynamic parameters were investigated together with the change of water surface elevations, current velocities, momentum fluxes and other important parameters at the gauge points. This work is funded by the project MARsite - New Directions in Seismic Hazard assessment through Focused Earth Observation in the Marmara Supersite (FP7-ENV.2012 6.4-2, Grant 308417 - see NH2.3/GMPV7.4/SM7.7) and supported by SATREPS-MarDim Project (Earthquake and Tsunami Disaster Mitigation in the Marmara Region and Disaster Education in Turkey) and JICA (Japan International Cooperation Agency). The authors would like to acknowledge Ms. Basak Firat for her assistance in
Climate Resilience Screening Index and Domain Scores

Data.gov (United States)

U.S. Environmental Protection Agency — CRSI and related-domain scores for all 50 states and 3135 counties in the U.S. This dataset is not publicly accessible because: They are already available within the...
Databases on safety issues for WWER and RBMK reactors. Users' manual. A publication of the extrabudgetary programme on the safety of WWER and RBMK nuclear power plants

International Nuclear Information System (INIS)

1996-04-01

At the beginning of the IAEA Extrabudgetary Programme on the safety of WWER reactors a great number of findings and recommendations (safety items) were collected as a result of design review and safety review missions of the WWER-440/230 type reactors. On the basis of these findings a technical database containing more than 1300 records was established to support the consolidation of the information obtained and to help in identification of safety issues. After the scope of the WWER extrabudgetary programme was extended similar data sets were prepared for the WWER-440/213, WWER-1000 and RBMK nuclear power plants. This publication describes the structure of the databases on safety issues of WWER and RBMK NPPs, the information sources used in the databases and interrogation capabilities for users to obtain the necessary information. 14 refs, 9 figs, 5 tabs
Using Bibliographic Knowledge for Ranking in Scientific Publication Databases

CERN Document Server

Vesely, Martin; Le Meur, Jean-Yves

2008-01-01

Document ranking for scientific publications involves a variety of specialized resources (e.g. author or citation indexes) that are usually difficult to use within standard general purpose search engines that usually operate on large-scale heterogeneous document collections for which the required specialized resources are not always available for all the documents present in the collections. Integrating such resources into specialized information retrieval engines is therefore important to cope with community-specific user expectations that strongly influence the perception of relevance within the considered community. In this perspective, this paper extends the notion of ranking with various methods exploiting different types of bibliographic knowledge that represent a crucial resource for measuring the relevance of scientific publications. In our work, we experimentally evaluated the adequacy of two such ranking methods (one based on freshness, i.e. the publication date, and the other on a novel index, the ...
Research progress in muscle-derived stem cells: Literature retrieval results based on international database.

Science.gov (United States)

Zhang, Li; Wang, Wei

2012-04-05

To identify global research trends of muscle-derived stem cells (MDSCs) using a bibliometric analysis of the Web of Science, Research Portfolio Online Reporting Tools of the National Institutes of Health (NIH), and the Clinical Trials registry database (ClinicalTrials.gov). We performed a bibliometric analysis of data retrievals for MDSCs from 2002 to 2011 using the Web of Science, NIH, and ClinicalTrials.gov. Inclusion criteria: (1) Web of Science: (a) peer-reviewed articles on MDSCs that were published and indexed in the Web of Science. (b) Type of articles: original research articles, reviews, meeting abstracts, proceedings papers, book chapters, editorial material and news items. (c) Year of publication: 2002-2011. (d) Citation databases: Science Citation Index-Expanded (SCI-E), 1899-present; Conference Proceedings Citation Index-Science (CPCI-S), 1991-present; Book Citation Index-Science (BKCI-S), 2005-present. (2) NIH: (a) Projects on MDSCs supported by the NIH. (b) Fiscal year: 1988-present. (3) ClinicalTrials.gov: All clinical trials relating to MDSCs were searched in this database. Exclusion criteria: (1) Web of Science: (a) Articles that required manual searching or telephone access. (b) We excluded documents that were not published in the public domain. (c) We excluded a number of corrected papers from the total number of articles. (d) We excluded articles from the following databases: Social Sciences Citation Index (SSCI), 1898-present; Arts & Humanities Citation Index (A&HCI), 1975-present; Conference Proceedings Citation Index - Social Science & Humanities (CPCI-SSH), 1991-present; Book Citation Index - Social Sciences & Humanities (BKCI-SSH), 2005-present; Current Chemical Reactions (CCR-EXPANDED), 1985-present; Index Chemicus (IC), 1993-present. (2) NIH: (a) We excluded publications related to MDSCs that were supported by the NIH. (b) We limited the keyword search to studies that included MDSCs within the title or abstract. (3) ClinicalTrials.gov: (a) We excluded clinical trials that were
The UMIST database for astrochemistry 2006

Science.gov (United States)

Woodall, J.; Agúndez, M.; Markwick-Kemper, A. J.; Millar, T. J.

2007-05-01

Aims: We present a new version of the UMIST Database for Astrochemistry, the fourth such version to be released to the public. The current version contains some 4573 binary gas-phase reactions, an increase of 10% from the previous (1999) version, among 420 species, of which 23 are new to the database. Methods: Major updates have been made to ion-neutral reactions, neutral-neutral reactions, particularly at low temperature, and dissociative recombination reactions. We have included for the first time the interstellar chemistry of fluorine. In addition to the usual database, we have also released a reaction set in which the effects of dipole-enhanced ion-neutral rate coefficients are included. Results: These two reaction sets have been used in a dark cloud model and the results of these models are presented and discussed briefly. The database and associated software are available on the World Wide Web at www.udfa.net. Tables 1, 2, 4 and 9 are only available in electronic form at http://www.aanda.org
Benchmarking database performance for genomic data.

Science.gov (United States)

Khushi, Matloob

2015-06-01

Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations is a common research need. The data can be saved in a database system for easy management; however, there is at present no comprehensive built-in database algorithm to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although the general searching capability of both databases was almost equivalent. In addition, using the algorithm, pair-wise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported, and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
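The overlap operation described in this abstract can be illustrated with a plain SQL join. The sketch below is not the RegMap algorithm, whose details are not given in the abstract; it is a generic interval-overlap predicate run on an in-memory SQLite database rather than PostgreSQL or MySQL. The table names `a` and `b`, the column names, and the function `find_overlaps` are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
import sqlite3


def find_overlaps(regions_a, regions_b):
    """Return overlapping region pairs from two lists of (chrom, start, end).

    Illustrative only: a generic overlap join, not the RegMap algorithm.
    Coordinates are assumed to be 1-based and inclusive.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one table per region set.
    conn.execute("CREATE TABLE a (chrom TEXT, start_pos INTEGER, end_pos INTEGER)")
    conn.execute("CREATE TABLE b (chrom TEXT, start_pos INTEGER, end_pos INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?)", regions_a)
    conn.executemany("INSERT INTO b VALUES (?, ?, ?)", regions_b)
    # Two intervals on the same chromosome overlap when each one starts
    # no later than the other one ends.
    rows = conn.execute(
        """
        SELECT a.chrom, a.start_pos, a.end_pos, b.start_pos, b.end_pos
        FROM a
        JOIN b ON a.chrom = b.chrom
              AND a.start_pos <= b.end_pos
              AND b.start_pos <= a.end_pos
        ORDER BY a.chrom, a.start_pos, b.start_pos
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    set_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    set_b = [("chr1", 150, 250), ("chr1", 201, 300), ("chr2", 300, 400)]
    for row in find_overlaps(set_a, set_b):
        print(row)
```

On large tables a join like this generally depends on suitable indexes to perform well, which is one reason benchmark results can differ between database systems.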
IEEE Conference Publications in Libraries.

Science.gov (United States)

Johnson, Karl E.

1984-01-01

Conclusions of surveys (63 libraries, OCLC database, University of Rhode Island users) assessing handling of Institute of Electrical and Electronics Engineers (IEEE) conference publications indicate that most libraries fully catalog these publications using LC cataloging, and library patrons frequently require series access to publications. Eight…
Transcript structure and domain display: a customizable transcript visualization tool.

Science.gov (United States)

Watanabe, Kenneth A; Ma, Kaiwang; Homayouni, Arielle; Rushton, Paul J; Shen, Qingxi J

2016-07-01

Transcript Structure and Domain Display (TSDD) is a publicly available, web-based program that provides publication quality images of transcript structures and domains. TSDD is capable of producing transcript structures from GFF/GFF3 and BED files. Alternatively, the GFF files of several model organisms have been pre-loaded so that users only need to enter the locus IDs of the transcripts to be displayed. Visualization of transcripts provides many benefits to researchers, ranging from evolutionary analysis of DNA-binding domains to predictive function modeling. TSDD is freely available for non-commercial users at http://shenlab.sols.unlv.edu/shenlab/software/TSD/transcript_display.html. Contact: jeffery.shen@unlv.nevada.edu. © The Author 2016. Published by Oxford University Press.
Risoe publication activities in 1998

International Nuclear Information System (INIS)

Bennov, Solvejg

1999-04-01

The report contains a list of references to the scientific and technical journal articles, books, reports, lectures published in full text, and to publications for a broader readership authored by researchers at Risoe National Laboratory and published in 1998. If the publication mentioned in the reference is electronically available the link to the web-address is added. The references are organised according to the programme areas of Risoe. The text is introduced by total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, institutions with which Risoe has published the largest number of articles, and journals in which Risoe has published most articles. The data are derived from Risoe's in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. (au)
Risoe publication activities in 1998

Energy Technology Data Exchange (ETDEWEB)

Bennov, Solvejg [ed.]

1999-04-01

The report contains a list of references to the scientific and technical journal articles, books, reports, lectures published in full text, and to publications for a broader readership authored by researchers at Risoe National Laboratory and published in 1998. If the publication mentioned in the reference is electronically available the link to the web-address is added. The references are organised according to the programme areas of Risoe. The text is introduced by total number of publications, distribution of types of publications, number of citations to the international scientific journal articles, institutions with which Risoe has published the largest number of articles, and journals in which Risoe has published most articles. The data are derived from Risoe's in-house Publications Database and from the Risoe Institutional Citation Report database produced by the Institute for Scientific Information. (au)

BioWarehouse: a bioinformatics database warehouse toolkit

Directory of Open Access Journals (Sweden)

Stringer-Calvert David WJ

2006-03-01

Full Text Available Abstract. Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and Java languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the
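The application example in this abstract (the fraction of enzyme activities with no known sequence) is the kind of question a single-schema warehouse can answer with one query. The sketch below shows the general shape of such a query on an in-memory SQLite database. It does not use the BioWarehouse schema, which the abstract does not describe: the tables `enzyme_activity` and `protein_sequence`, their columns, the function `fraction_without_sequence`, and the toy identifiers are all hypothetical.

```python
import sqlite3


def fraction_without_sequence(ec_numbers, sequence_links):
    """Fraction of enzyme activities that have no linked sequence.

    ec_numbers: list of activity identifiers (toy values, not real EC numbers).
    sequence_links: list of (accession, ec_number) pairs.
    The schema is a hypothetical stand-in for a warehouse, not BioWarehouse's.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE protein_sequence (accession TEXT, ec_number TEXT)")
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)", [(ec,) for ec in ec_numbers]
    )
    conn.executemany("INSERT INTO protein_sequence VALUES (?, ?)", sequence_links)

    (total,) = conn.execute("SELECT COUNT(*) FROM enzyme_activity").fetchone()
    # Activities for which no sequence row references the same identifier.
    (missing,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM enzyme_activity e
        WHERE NOT EXISTS (
            SELECT 1 FROM protein_sequence p WHERE p.ec_number = e.ec_number
        )
        """
    ).fetchone()
    conn.close()
    return missing / total if total else 0.0


if __name__ == "__main__":
    activities = ["EC-A", "EC-B", "EC-C", "EC-D"]
    links = [("ACC1", "EC-A"), ("ACC2", "EC-A"), ("ACC3", "EC-C")]
    print(fraction_without_sequence(activities, links))
```

Because both tables live in one database, the anti-join needs no cross-system export or identifier reconciliation at query time, which is the benefit the warehousing approach is meant to provide.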
YMDB: the Yeast Metabolome Database

Science.gov (United States)

Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.

2012-01-01

The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855
Legacy2Drupal - Conversion of an existing oceanographic relational database to a semantically enabled Drupal content management system

Science.gov (United States)

Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.

2009-12-01

Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing, relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal6-powered front-end; 2) a standard SQL port (used to provide a Mapserver interface to the metadata and data); and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces will help prepare the BCO-DMO database for interoperability with other ecosystem databases.
Microbial F-type lectin domains with affinity for blood group antigens.

Science.gov (United States)

Mahajan, Sonal; Khairnar, Aasawari; Bishnoi, Ritika; Ramya, T N C

2017-09-23

F-type lectins are fucose binding lectins with characteristic fucose binding and calcium binding motifs. Although they occur with a selective distribution in viruses, prokaryotes and eukaryotes, most biochemical studies have focused on vertebrate F-type lectins. Recently, using sensitive bioinformatics search techniques on the non-redundant database, we had identified many microbial F-type lectin domains with diverse domain organizations. We report here the biochemical characterization of F-type lectin domains from Cyanobium sp. PCC 7001, Myxococcus hansupus and Leucothrix mucor. We demonstrate that while all these three microbial F-type lectin domains bind to the blood group H antigen epitope on fucosylated glycans, there are fine differences in their glycan binding specificity. Cyanobium sp. PCC 7001 F-type lectin domain binds exclusively to extended H type-2 motif, Myxococcus hansupus F-type lectin domain binds to B, H type-1 and Lewis b motifs, and Leucothrix mucor F-type lectin domain binds to a wide range of fucosylated glycans, including A, B, H and Lewis antigens. We believe that these microbial lectins will be useful additions to the glycobiologist's toolbox for labeling, isolating and visualizing glycans. Copyright © 2017 Elsevier Inc. All rights reserved.
Winnowing sequences from a database search.

Science.gov (United States)

Berman, P; Zhang, Z; Wolf, Y I; Koonin, E V; Miller, W

2000-01-01

In database searches for sequence similarity, matches to a distinct sequence region (e.g., protein domain) are frequently obscured by numerous matches to another region of the same sequence. In order to cope with this problem, algorithms are developed to discard redundant matches. One model for this problem begins with a list of intervals, each with an associated score; each interval gives the range of positions in the query sequence that align to a database sequence, and the score is that of the alignment. If interval I is contained in interval J, and I's score is less than J's, then I is said to be dominated by J. The problem is then to identify each interval that is dominated by at least K other intervals, where K is a given level of "tolerable redundancy." An algorithm is developed to solve the problem in O(N log N) time and O(N*) space, where N is the number of intervals and N* is a precisely defined value that never exceeds N and is frequently much smaller. This criterion for discarding database hits has been implemented in the Blast program, as illustrated herein with examples. Several variations and extensions of this approach are also described.
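The dominance criterion in this abstract can be illustrated with a minimal sketch. It is a naive quadratic check, not the O(N log N) algorithm the authors describe, and the function names and the (start, end, score) tuple layout are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical layout: (start, end, score), positions inclusive in the query.
Interval = Tuple[int, int, float]


def dominated_counts(intervals: List[Interval]) -> List[int]:
    """For each interval, count how many other intervals dominate it.

    I is dominated by J when I is contained in J and I's score is lower.
    This is a plain O(N^2) scan, used here only to make the definition concrete.
    """
    counts = []
    for i, (s_i, e_i, score_i) in enumerate(intervals):
        n = 0
        for j, (s_j, e_j, score_j) in enumerate(intervals):
            if i != j and s_j <= s_i and e_i <= e_j and score_i < score_j:
                n += 1
        counts.append(n)
    return counts


def winnow(intervals: List[Interval], k: int) -> List[Interval]:
    """Discard every interval dominated by at least k other intervals."""
    counts = dominated_counts(intervals)
    return [iv for iv, c in zip(intervals, counts) if c < k]


hits = [(1, 100, 50.0), (10, 20, 30.0), (5, 50, 40.0), (12, 18, 10.0), (200, 300, 5.0)]
kept = winnow(hits, k=2)
```

With K = 2, the low-scoring hit at positions 200 to 300 survives because nothing contains it, which is the behaviour the abstract motivates: matches to a distinct region are not hidden by many stronger matches elsewhere.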
Factors impacting time to acceptance and publication for peer-reviewed publications.

Science.gov (United States)

Toroser, Dikran; Carlson, Janice; Robinson, Micah; Gegner, Julie; Girard, Victoria; Smette, Lori; Nilsen, Jon; O'Kelly, James

2017-07-01

Timely publication of data is important for the medical community and provides a valuable contribution to data disclosure. The objective of this study was to identify and evaluate times to acceptance and publication for peer-reviewed manuscripts, reviews, and letters to the editor. Key publication metrics for published manuscripts, reviews, and letters to the editor were identified by eight Amgen publications professionals. Data for publications submitted between 1 January 2013 and 1 November 2015 were extracted from a proprietary internal publication-tracking database. Variables included department initiating the study, publication type, number of submissions per publication, and the total number of weeks from first submission to acceptance, online publication, and final publication. A total of 337 publications were identified, of which 300 (89%) were manuscripts. Time from submission to acceptance and publication was generally similar between clinical and real-world evidence (e.g. observational and health economics studies) publications. Median (range) time from first submission to acceptance was 23.4 (0.2-226.2) weeks. Median (range) time from first submission to online (early-release) publication was 29.7 (2.4-162.6) weeks. Median (range) time from first submission to final (print) publication was 36.2 (2.8-230.8) weeks. Time from first submission to acceptance, online publication, and final publication increased accordingly with number of submissions required for acceptance, with similar times noted between each subsequent submission. Analysis of a single-company publication database showed that the median time for manuscripts to be fully published after initial submission was 36.2 weeks, and time to publication increased accordingly with the number of submissions. Causes for multiple submissions and time from clinical trial completion to first submission were not assessed; these were limitations of the study. Nonetheless, publication planners should consider
Database Description - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Database Description General information of database Database... name Yeast Interacting Proteins Database Alternative name - DOI 10.18908/lsdba.nbdc00742-000 Creator C...-ken 277-8561 Tel: +81-4-7136-3989 FAX: +81-4-7136-3979 E-mail : Database classif...s cerevisiae Taxonomy ID: 4932 Database description Information on interactions and related information obta...l Acad Sci U S A. 2001 Apr 10;98(8):4569-74. Epub 2001 Mar 13. External Links: Original website information Database
Development of a national, dynamic reservoir-sedimentation database

Science.gov (United States)

Gray, J.R.; Bernard, J.M.; Stewart, D.W.; McFaul, E.J.; Laurent, K.W.; Schwarz, G.E.; Stinson, J.T.; Jonas, M.M.; Randle, T.J.; Webb, J.W.

2010-01-01

The importance of dependable, long-term water supplies, coupled with the need to quantify rates of capacity loss of the Nation’s reservoirs due to sediment deposition, were the most compelling reasons for developing the REServoir-SEDimentation survey information (RESSED) database and website. Created under the auspices of the Advisory Committee on Water Information’s Subcommittee on Sedimentation by the U.S. Geological Survey and the Natural Resources Conservation Service, the RESSED database is the most comprehensive compilation of data from reservoir bathymetric and dry-basin surveys in the United States. As of March 2010, the database, which contains data compiled on the 1950s vintage Soil Conservation Service’s Form SCS-34 data sheets, contained results from 6,616 surveys on 1,823 reservoirs in the United States and two surveys on one reservoir in Puerto Rico. The data span the period 1755–1997, with 95 percent of the surveys performed from 1930–1990. The reservoir surface areas range from sub-hectare-scale farm ponds to 658 km² Lake Powell. The data in the RESSED database can be useful for a number of purposes, including calculating changes in reservoir-storage characteristics, quantifying sediment budgets, and estimating erosion rates in a reservoir’s watershed. The March 2010 version of the RESSED database has a number of deficiencies, including a cryptic and out-of-date database architecture; some geospatial inaccuracies (although most have been corrected); other data errors; an inability to store all data in a readily retrievable manner; and an inability to store all data types that currently exist. Perhaps most importantly, the March 2010 version of RESSED database provides no publicly available means to submit new data and corrections to existing data. To address these and other deficiencies, the Subcommittee on Sedimentation, through the U.S. Geological Survey and the U.S. Army Corps of Engineers, began a collaborative project in
GigaDB: announcing the GigaScience database

Directory of Open Access Journals (Sweden)

Sneddon Tam P

2012-07-01

Full Text Available Abstract With the launch of GigaScience journal, here we provide insight into the accompanying database GigaDB, which allows the integration of manuscript publication with supporting data and tools. Reinforcing and upholding GigaScience’s goals to promote open-data and reproducibility of research, GigaDB also aims to provide a home, when a suitable public repository does not exist, for the supporting data or tools featured in the journal and beyond.
Open TG-GATEs: a large-scale toxicogenomics database

Science.gov (United States)

Igarashi, Yoshinobu; Nakatsu, Noriyuki; Yamashita, Tomoya; Ono, Atsushi; Ohno, Yasuo; Urushidani, Tetsuro; Yamada, Hiroshi

2015-01-01

Toxicogenomics focuses on assessing the safety of compounds using gene expression profiles. Gene expression signatures from large toxicogenomics databases are expected to perform better than small databases in identifying biomarkers for the prediction and evaluation of drug safety based on a compound's toxicological mechanisms in animal target organs. Over the past 10 years, the Japanese Toxicogenomics Project consortium (TGP) has been developing a large-scale toxicogenomics database consisting of data from 170 compounds (mostly drugs) with the aim of improving and enhancing drug safety assessment. Most of the data generated by the project (e.g. gene expression, pathology, lot number) are freely available to the public via Open TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System). Here, we provide a comprehensive overview of the database, including both gene expression data and metadata, with a description of experimental conditions and procedures used to generate the database. Open TG-GATEs is available from http://toxico.nibio.go.jp/english/index.html. PMID:25313160
RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database

Directory of Open Access Journals (Sweden)

Andronescu Mirela

2008-08-01

Full Text Available Abstract Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to which extent and with which probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.
Database for environmental monitoring at nuclear facilities

International Nuclear Information System (INIS)

Raceanu, M.; Varlam, C.; Enache, A.; Faurescu, I.

2006-01-01

To ensure that the impact of nuclear facilities on the local environment can be assessed, a program of environmental monitoring must be established well in advance of nuclear facility operation. An enormous amount of data must be stored and correlated, starting with location, meteorology, and sample type characterization (from water to different kinds of food), and extending to radioactivity and isotopic measurements (e.g. for C-14 determination, C-13 isotopic correction is a must). Data modelling is a well-known mechanism for describing data structures at a high level of abstraction. Such models are often used to automatically create database structures and to generate the code structures used to access databases. This has the disadvantage of losing data constraints that might be specified in data models for data checking. An embodiment of the system of the present application includes a computer-readable memory for storing a definitional data table that defines variable symbols representing respective measurable physical phenomena. The definitional data table uniquely defines the variable symbols by relating them to respective data domains for the respective phenomena represented by the symbols. Well-established rules for how the data should be stored and accessed are given in relational database theory. The theory comprises guidelines such as avoiding duplicated data using a technique called normalization, and how to identify the unique identifier for a database record. (author)
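The idea of a definitional table that ties each variable symbol to a data domain, combined with normalization, can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, symbol, unit and range are all hypothetical and are not taken from the authors' system.

```python
import sqlite3

# Hypothetical normalized schema: each fact is stored once, and the
# definitional table "variable" carries the data domain for every symbol.
SCHEMA = """
CREATE TABLE variable (
    symbol     TEXT PRIMARY KEY,
    phenomenon TEXT NOT NULL,
    unit       TEXT NOT NULL,
    min_value  REAL NOT NULL,
    max_value  REAL NOT NULL,
    CHECK (min_value <= max_value)
);
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    location_id    INTEGER NOT NULL REFERENCES location(location_id),
    symbol         TEXT NOT NULL REFERENCES variable(symbol),
    sampled_on     TEXT NOT NULL,
    value          REAL NOT NULL
);
-- Keep the data-domain constraint inside the database so it is not lost
-- when access code is generated elsewhere.
CREATE TRIGGER measurement_domain_check
BEFORE INSERT ON measurement
BEGIN
    SELECT RAISE(ABORT, 'value outside data domain')
    WHERE NOT EXISTS (
        SELECT 1 FROM variable v
        WHERE v.symbol = NEW.symbol
          AND NEW.value BETWEEN v.min_value AND v.max_value
    );
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, location_id, symbol, sampled_on, value) -> bool:
    """Insert one measurement; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO measurement (location_id, symbol, sampled_on, value) "
                "VALUES (?, ?, ?, ?)",
                (location_id, symbol, sampled_on, value),
            )
        return True
    except sqlite3.Error:
        return False


conn = open_db()
with conn:
    # Illustrative values only.
    conn.execute("INSERT INTO variable VALUES ('H3', 'tritium activity', 'Bq/L', 0, 1000000)")
    conn.execute("INSERT INTO location (location_id, name) VALUES (1, 'site A')")
```

Placing the range check in a trigger is one possible design choice; the same rule could instead be enforced in application code, at the cost of the constraint loss the abstract warns about.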
Relative aggregation operator in database fuzzy querying

Directory of Open Access Journals (Sweden)

Luminita DUMITRIU

2005-12-01

Full Text Available Fuzzy selection criteria querying relational databases include vague terms; they usually refer to linguistic values from the attribute linguistic domains, defined as fuzzy sets. Generally, when a vague query is processed, the definitions of vague terms must already exist in a knowledge base. But there are also cases when vague terms must be dynamically defined, when a particular operation is used to aggregate simple criteria in a complex selection. The paper presents a new aggregation operator and the corresponding algorithm to evaluate the fuzzy query.
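As background for how simple fuzzy criteria are aggregated into a complex selection, here is a minimal sketch. It uses the standard minimum operator for conjunction, not the new relative aggregation operator the paper introduces (which the abstract does not define), and the linguistic terms, attribute names and breakpoints are invented for illustration.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical linguistic values defined as fuzzy sets on attribute domains.
def young(age):
    return trapezoid(age, 0, 0, 25, 35)


def well_paid(salary):
    return trapezoid(salary, 3000, 5000, 1e9, 1e9)


def fuzzy_select(rows, criteria, threshold):
    """Return (row, degree) pairs whose aggregated degree meets the threshold.

    criteria maps an attribute name to a membership function; the simple
    criteria are aggregated with min, the classical fuzzy AND.
    """
    results = []
    for row in rows:
        degree = min(mu(row[attr]) for attr, mu in criteria.items())
        if degree >= threshold:
            results.append((row, degree))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


rows = [
    {"name": "A", "age": 22, "salary": 6000},
    {"name": "B", "age": 30, "salary": 4000},
    {"name": "C", "age": 50, "salary": 9000},
]
matches = fuzzy_select(rows, {"age": young, "salary": well_paid}, threshold=0.4)
```

Each row receives a satisfaction degree in [0, 1] rather than a yes/no answer, so the result can be ranked as well as filtered.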
Handbook of video databases design and applications

CERN Document Server

Furht, Borko

2003-01-01

INTRODUCTION: Introduction to Video Databases (Oge Marques and Borko Furht). VIDEO MODELING AND REPRESENTATION: Modeling Video Using Input/Output Markov Models with Application to Multi-Modal Event Detection (Ashutosh Garg, Milind R. Naphade, and Thomas S. Huang); Statistical Models of Video Structure and Semantics (Nuno Vasconcelos); Flavor: A Language for Media Representation (Alexandros Eleftheriadis and Danny Hong); Integrating Domain Knowledge and Visual Evidence to Support Highlight Detection in Sports Videos (Juergen Assfalg, Marco Bertini, Carlo Colombo, and Alberto Del Bimbo); A Generic Event Model and Sports Vid
Update History of This Database - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us Trypanosomes Database Update History of This Database Date Update contents 2014/05/07 The co...ntact information is corrected. The features and manner of utilization of the database are corrected. 2014/02/04 Trypanosomes Databas...e English archive site is opened. 2011/04/04 Trypanosomes Database ( http://www.tan...paku.org/tdb/ ) is opened. About This Database Database Description Download Lice...nse Update History of This Database Site Policy | Contact Us Update History of This Database - Trypanosomes Database | LSDB Archive ...
The PREDICTS database: a global database of how local terrestrial biodiversity responds to human impacts

Science.gov (United States)

Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Senior, Rebecca A; Bennett, Dominic J; Booth, Hollie; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; White, Hannah J; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Ancrenaz, Marc; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Báldi, András; Banks, John E; Barlow, Jos; Batáry, Péter; Bates, Adam J; Bayne, Erin M; Beja, Pedro; Berg, Åke; Berry, Nicholas J; Bicknell, Jake E; Bihn, Jochen H; Böhning-Gaese, Katrin; Boekhout, Teun; Boutin, Céline; Bouyer, Jérémy; Brearley, Francis Q; Brito, Isabel; Brunet, Jörg; Buczkowski, Grzegorz; Buscardo, Erika; Cabra-García, Jimmy; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Carrijo, Tiago F; Carvalho, Anelena L; Castro, Helena; Castro-Luna, Alejandro A; Cerda, Rolando; Cerezo, Alexis; Chauvat, Matthieu; Clarke, Frank M; Cleary, Daniel F R; Connop, Stuart P; D'Aniello, Biagio; da Silva, Pedro Giovâni; Darvill, Ben; Dauber, Jens; Dejean, Alain; Diekötter, Tim; Dominguez-Haydar, Yamileth; Dormann, Carsten F; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Elek, Zoltán; Entling, Martin H; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Ficetola, Gentile F; Filgueiras, Bruno K C; Fonte, Steven J; Fraser, Lauchlan H; Fukuda, Daisuke; Furlani, Dario; Ganzhorn, Jörg U; Garden, Jenni G; Gheler-Costa, Carla; Giordani, Paolo; Giordano, Simonetta; Gottschalk, Marco S; Goulson, Dave; Gove, Aaron D; Grogan, James; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hawes, Joseph E; Hébert, Christian; Helden, Alvin J; Henden, John-André; Hernández, Lionel; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Horgan, 
Finbarr G; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Jonsell, Mats; Jung, Thomas S; Kapoor, Vena; Kati, Vassiliki; Katovai, Eric; Kessler, Michael; Knop, Eva; Kolb, Annette; Kőrösi, Ádám; Lachat, Thibault; Lantschner, Victoria; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Letcher, Susan G; Littlewood, Nick A; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Marin-Spiotta, Erika; Marshall, E J P; Martínez, Eliana; Mayfield, Margaret M; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Naidoo, Robin; Nakamura, Akihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Neuschulz, Eike L; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Nöske, Nicole M; O'Dea, Niall; Oduro, William; Ofori-Boateng, Caleb; Oke, Chris O; Osgathorpe, Lynne M; Paritsis, Juan; Parra-H, Alejandro; Pelegrin, Nicolás; Peres, Carlos A; Persson, Anna S; Petanidou, Theodora; Phalan, Ben; Philips, T Keith; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; Ribeiro, Danilo B; Richardson, Barbara A; Richardson, Michael J; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rosselli, Loreta; Rossiter, Stephen J; Roulston, T'ai H; Rousseau, Laurent; Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Samnegård, Ulrika; Schüepp, Christof; Schweiger, Oliver; Sedlock, Jodi L; Shahabuddin, Ghazala; Sheil, Douglas; Silva, Fernando A B; Slade, Eleanor M; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Stout, Jane C; Struebig, Matthew J; Sung, Yik-Hei; Threlfall, Caragh G; Tonietto, Rebecca; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; 
Tylianakis, Jason M; Vanbergen, Adam J; Vassilev, Kiril; Verboven, Hans A F; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Walker, Tony R; Wang, Yanping; Watling, James I; Wells, Konstans; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Woodcock, Ben A; Yu, Douglas W; Zaitsev, Andrey S; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy

2014-01-01

Biodiversity continues to decline in the face of increasing anthropogenic pressures such as habitat destruction, exploitation, pollution and introduction of alien species. Existing global databases of species’ threat status or population time series are dominated by charismatic species. The collation of datasets with broad taxonomic and biogeographic extents, and that support computation of a range of biodiversity indicators, is necessary to enable better understanding of historical declines and to project – and avert – future declines. We describe and assess a new database of more than 1.6 million samples from 78 countries representing over 28,000 species, collated from existing spatial comparisons of local-scale biodiversity exposed to different intensities and types of anthropogenic pressures, from terrestrial sites around the world. The database contains measurements taken in 208 (of 814) ecoregions, 13 (of 14) biomes, 25 (of 35) biodiversity hotspots and 16 (of 17) megadiverse countries. The database contains more than 1% of the total number of all species described, and more than 1% of the described species within many taxonomic groups – including flowering plants, gymnosperms, birds, mammals, reptiles, amphibians, beetles, lepidopterans and hymenopterans. The dataset, which is still being added to, is therefore already considerably larger and more representative than those used by previous quantitative models of biodiversity trends and responses. The database is being assembled as part of the PREDICTS project (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems – http://www.predicts.org.uk). We make site-level summary data available alongside this article. The full database will be publicly available in 2015. PMID:25558364
Native Pig and Chicken Breed Database: NPCDB

Directory of Open Access Journals (Sweden)

Hyeon-Soo Jeong

2014-10-01

Full Text Available Indigenous (native) breeds of livestock have higher disease resistance and adaptation to the environment due to high genetic diversity. Even though their extinction rate is accelerated due to the increase of commercial breeds, natural disaster, and civil war, there is a lack of well-established databases for the native breeds. Thus, we constructed the native pig and chicken breed database (NPCDB) which integrates available information on the breeds from around the world. It is a nonprofit public database aimed to provide information on the genetic resources of indigenous pig and chicken breeds for their conservation. The NPCDB (http://npcdb.snu.ac.kr/) provides the phenotypic information and population size of each breed as well as its specific habitat. In addition, it provides information on the distribution of genetic resources across the country. The database will contribute to understanding of the breed’s characteristics such as disease resistance and adaptation to environmental changes as well as the conservation of indigenous genetic resources.
Database Perspectives on Blockchains

OpenAIRE

Cohen, Sara; Zohar, Aviv

2018-01-01

Modern blockchain systems are a fresh look at the paradigm of distributed computing, applied under assumptions of large-scale public networks. They can be used to store and share information without a trusted central party. There has been much effort to develop blockchain systems for a myriad of uses, ranging from cryptocurrencies to identity control, supply chain management, etc. None of this work has directly studied the fundamental database issues that arise when using blockchains as the u...
CORE: a phylogenetically-curated 16S rDNA database of the core oral microbiome.

Directory of Open Access Journals (Sweden)

Ann L Griffen

2011-04-01

Full Text Available Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically curated 16S rDNA database of the core oral microbiome, CORE, was developed. The goal was to include a comprehensive and minimally redundant representation of the bacteria that regularly reside in the human oral cavity with computationally robust classification at the level of species and genus. Clades of cultivated and uncultivated taxa were formed based on sequence analyses using multiple criteria, including maximum-likelihood-based topology and bootstrap support, genetic distance, and previous naming. A number of classification inconsistencies for previously named species, especially at the level of genus, were resolved. The performance of the CORE database for identifying clinical sequences was compared to that of three publicly available databases, GenBank nr/nt, RDP and HOMD, using a set of sequencing reads that had not been used in creation of the database. CORE offered improved performance compared to other public databases for identification of human oral bacterial 16S sequences by a number of criteria. In addition, the CORE database and phylogenetic tree provide a framework for measures of community divergence, and the focused size of the database offers advantages of efficiency for BLAST searching of large datasets. The CORE database is available as a searchable interface and for download at http://microbiome.osu.edu.
A conserved gene family encodes transmembrane proteins with fibronectin, immunoglobulin and leucine-rich repeat domains (FIGLER)

Directory of Open Access Journals (Sweden)

Haga Christopher L

2007-09-01

Full Text Available Abstract Background In mouse the cytokine interleukin-7 (IL-7) is required for generation of B lymphocytes, but human IL-7 does not appear to have this function. A bioinformatics approach was therefore used to identify IL-7 receptor related genes in the hope of identifying the elusive human cytokine. Results Our database search identified a family of nine gene candidates, which we have provisionally named fibronectin immunoglobulin leucine-rich repeat (FIGLER). The FIGLER 1–9 genes are predicted to encode type I transmembrane glycoproteins with 6–12 leucine-rich repeats (LRR), a C2 type Ig domain, a fibronectin type III domain, a hydrophobic transmembrane domain, and a cytoplasmic domain containing one to four tyrosine residues. Members of this multichromosomal gene family possess 20–47% overall amino acid identity and are differentially expressed in cell lines and primary hematopoietic lineage cells. Genes for FIGLER homologs were identified in macaque, orangutan, chimpanzee, mouse, rat, dog, chicken, toad, and puffer fish databases. The non-human FIGLER homologs share 38–99% overall amino acid identity with their human counterpart. Conclusion The extracellular domain structure and absence of recognizable cytoplasmic signaling motifs in members of the highly conserved FIGLER gene family suggest a trophic or cell adhesion function for these molecules.

ARTI refrigerant database

Energy Technology Data Exchange (ETDEWEB)

Calm, J.M. [Calm (James M.), Great Falls, VA (United States)

1996-04-15

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates. Citations in this report are divided into the following topics: thermophysical properties; materials compatibility; lubricants and tribology; application data; safety; test and analysis methods; impacts; regulatory actions; substitute refrigerants; identification; absorption and adsorption; research programs; and miscellaneous documents. Information is also presented on ordering instructions for the computerized version.
Interactive Exploration for Continuously Expanding Neuron Databases.

Science.gov (United States)

Li, Zhongyu; Metaxas, Dimitris N; Lu, Aidong; Zhang, Shaoting

2017-02-15

This paper proposes a novel framework to help biologists explore and analyze neurons based on retrieval of data from neuron morphological databases. In recent years, continuously expanding neuron databases have provided a rich source of information for associating neuronal morphologies with their functional properties. We design a coarse-to-fine framework for efficient and effective data retrieval from large-scale neuron databases. At the coarse level, for efficiency at large scale, we employ a binary coding method to compress morphological features into binary codes of tens of bits. Short binary codes allow for real-time similarity searching in Hamming space. Because the neuron databases are continuously expanding, it is inefficient to re-train the binary coding model from scratch when adding new neurons. To solve this problem, we extend binary coding with online updating schemes, which consider only the newly added neurons and update the model on the fly, without accessing the whole neuron database. At the fine-grained level, we bring domain experts/users into the framework, who can give relevance feedback on the binary-coding-based retrieval results. This interactive strategy can improve retrieval performance by re-ranking the coarse results, for which we design a new similarity measure that takes the feedback into account. Our framework is validated on more than 17,000 neuron cells, showing promising retrieval accuracy and efficiency. Moreover, we demonstrate its use in assisting biologists to identify and explore unknown neurons. Copyright © 2017 Elsevier Inc. All rights reserved.
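The coarse retrieval step described in this abstract, similarity search over short binary codes in Hamming space, can be illustrated with a minimal sketch. This is not the authors' implementation: the 8-bit codes, the function names, and the tie-breaking by index are assumptions made only for illustration, and the learning of the codes themselves is not shown.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, codes: List[int], top_k: int = 3) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the top_k codes closest to the query."""
    scored = [(i, hamming_distance(query, c)) for i, c in enumerate(codes)]
    # Sort by distance first, then by index so that ties are deterministic.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical 8-bit codes standing in for compressed morphological features.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110110
print(rank_by_hamming(query, database, top_k=2))
```

Because each comparison is an XOR followed by a bit count, a linear scan of this kind stays cheap even for large collections, which is the property the abstract relies on for real-time search.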
Compressing DNA sequence databases with coil

Directory of Open Access Journals (Sweden)

Hendy Michael D

2008-05-01

Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
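The baseline this abstract compares against, gzip compression of a flat file, can be measured with a few lines of Python. The sketch below only shows how a compression ratio might be computed. It does not implement coil's edit-tree coding, and the synthetic random sequences are a hypothetical stand-in for real EST records, which share far more content with one another.

```python
import gzip
import random


def gzip_ratio(data: bytes) -> float:
    """Original size divided by gzip-compressed size (higher is better)."""
    return len(data) / len(gzip.compress(data))


random.seed(0)
# Hypothetical flat file: 50 random sequences of 400 bases, one per line.
records = ["".join(random.choice("ACGT") for _ in range(400)) for _ in range(50)]
flat_file = "\n".join(records).encode("ascii")

ratio = gzip_ratio(flat_file)
print(f"gzip ratio: {ratio:.2f}")

# Compression must be lossless: decompressing returns the original bytes.
assert gzip.decompress(gzip.compress(flat_file)) == flat_file
```

With four equally likely bases each symbol carries about two bits, so a general-purpose compressor cannot do much better than a ratio of four on such input. Gains beyond that have to come from exploiting similarity between sequences.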
Brassica ASTRA: an integrated database for Brassica genomic research.

Science.gov (United States)

Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

2005-01-01

Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
The Danish Nonmelanoma Skin Cancer Dermatology Database.

Science.gov (United States)

Lamberg, Anna Lei; Sølvsten, Henrik; Lei, Ulrikke; Vinding, Gabrielle Randskov; Stender, Ida Marie; Jemec, Gregor Borut Ernst; Vestergaard, Tine; Thormann, Henrik; Hædersdal, Merete; Dam, Tomas Norman; Olesen, Anne Braae

2016-01-01

The Danish Nonmelanoma Skin Cancer Dermatology Database was established in 2008. The aim of this database was to collect data on nonmelanoma skin cancer (NMSC) treatment and improve its treatment in Denmark. NMSC is the most common malignancy in the western countries and represents a significant challenge in terms of public health management and health care costs. However, high-quality epidemiological and treatment data on NMSC are sparse. The NMSC database includes patients with the following skin tumors: basal cell carcinoma (BCC), squamous cell carcinoma, Bowen's disease, and keratoacanthoma diagnosed by the participating office-based dermatologists in Denmark. Clinical and histological diagnoses, BCC subtype, localization, size, skin cancer history, skin phototype, and evidence of metastases and treatment modality are the main variables in the NMSC database. Information on recurrence, cosmetic results, and complications is registered at two follow-up visits at 3 months (between 0 and 6 months) and 12 months (between 6 and 15 months) after treatment. In 2014, 11,522 patients with 17,575 tumors were registered in the database. Of tumors with a histological diagnosis, 13,571 were BCCs, 840 squamous cell carcinomas, 504 Bowen's disease, and 173 keratoacanthomas. The NMSC database encompasses detailed information on the type of tumor, a variety of prognostic factors, treatment modalities, and outcomes after treatment. The database has revealed that overall, the quality of care of NMSC in Danish dermatological clinics is high, and the database provides the necessary data for continuous quality assurance.
Modelling antibody side chain conformations using heuristic database search.

Science.gov (United States)

Ritchie, D W; Kemp, G J

1997-01-01

We have developed a knowledge-based system which models the side chain conformations of residues in the variable domains of antibody Fv fragments. The system is written in Prolog and uses an object-oriented database of aligned antibody structures in conjunction with a side chain rotamer library. The antibody database provides 3-dimensional clusters of side chain conformations which can be copied en masse into the model structure. The object-oriented database architecture facilitates a navigational style of database access, necessary to assemble side chain clusters. Around 60% of the model is built using side chain clusters and this eliminates much of the combinatorial complexity associated with many other side chain placement algorithms. Construction and placement of side chain clusters is guided by a heuristic cost function based on a simple model of side chain packing interactions. Even with a simple model, we find that a large proportion of side chain conformations are modelled accurately. We expect our approach could be used with other homologous protein families, in addition to antibodies, both to improve the quality of model structures and to give a "smart start" to the side chain placement problem.
Status and perspective of detector databases in the CMS experiment at the LHC

NARCIS (Netherlands)

Aerts, A.T.M.; Glege, F.; Liendl, M.; Vorobiev, I.; Willers, I.M.; Wynhoff, S.

2004-01-01

This note gives an overview at a high conceptual level of the various databases that capture the information concerning the CMS detector. The detector domain has been split up into four, partly overlapping parts that cover phases in the detector life cycle: construction, integration, configuration
Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

Science.gov (United States)

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
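The abstract mentions procedures for computing intersections between regulatory features. As a rough illustration of what such an intersection means, the sketch below overlaps two lists of genomic intervals in plain Python. It is not the database's own stored procedure: the `Feature` tuple layout, the inclusive end coordinates, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Feature = Tuple[str, int, int]  # (chromosome, start, end), end inclusive (assumed)


def intersect_features(a: List[Feature], b: List[Feature]) -> List[Feature]:
    """Return the overlapping segments between two lists of genomic intervals."""
    # Group the second list by chromosome so only same-chromosome pairs are compared.
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))

    overlaps: List[Feature] = []
    for chrom, start, end in a:
        for other_start, other_end in by_chrom.get(chrom, []):
            lo, hi = max(start, other_start), min(end, other_end)
            if lo <= hi:  # the two intervals share at least one position
                overlaps.append((chrom, lo, hi))
    return sorted(overlaps)


# Hypothetical coordinates, for illustration only.
regulatory_regions = [("chr1", 100, 200), ("chr2", 50, 80)]
motif_hits = [("chr1", 150, 160), ("chr1", 190, 250), ("chr2", 90, 100)]
print(intersect_features(regulatory_regions, motif_hits))
```

A pairwise scan like this is quadratic per chromosome. A production system would more likely rely on sorted sweeps or indexed range queries, which fits the abstract's approach of pre-computing intersections for the public data.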
Quality of Service: a study in databases bibliometric international

Directory of Open Access Journals (Sweden)

Deosir Flávio Lobo de Castro Junior

2013-08-01

Full Text Available The purpose of this article is to serve as a source of references on Quality of Service for future research. After surveying the international databases EBSCO and ProQuest, the results on the state of the art in this field are presented. The method used was bibliometrics, and 132 items from a universe of 13,427 were investigated. The analyzed works cover the period from 1985 to 2011. The contributions, results and conclusions presented for future research are: (i) most cited authors; (ii) most used methodology, dimensions and questionnaire; (iii) most referenced publications; (iv) international journals with most publications on the subject; (v) distribution of the number of publications per year; (vi) author networks; (vii) educational institution networks; (viii) terms used in the search in international databases; (ix) the relationships studied in the 132 articles; (x) criteria for choice of methodology in research on quality of services; (xi) most often used paradigm; and (xii) 160 high impact references.
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

Science.gov (United States)

Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2017-01-04

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Current status of system development to provide databases of nuclides migration

International Nuclear Information System (INIS)

Sasamoto, Hiroshi; Yoshida, Yasushi; Isogai, Takeshi; Suyama, Tadahiro; Shibata, Masahiro; Yui, Mikazu; Jintoku, Takashi

2005-01-01

JNC has developed databases of nuclide migration for the safety assessment of high-level radioactive waste (HLW) repositories, and they were used in the second progress report to demonstrate the technical reliability of the HLW geological disposal system in Japan. The technical level and applicability of the databases have been highly rated, including overseas. To make the databases widely available worldwide and to promote their use, we have done the following: 1) developed tools to convert the database format from the geochemical code PHREEQE to PHREEQC, GWB and EQ3/6, and 2) set up a web site (http://migrationdb.jnc.go.jp) which enables the public to access the databases. As a result, the number of database users has increased significantly. Additionally, a number of useful comments from users can be applied to the modification and/or update of the databases. (author)
Psychometric characteristics of a public-domain self-report measure of vocational interests: the Oregon Vocational Interest Scales.

Science.gov (United States)

Pozzebon, Julie A; Visser, Beth A; Ashton, Michael C; Lee, Kibeom; Goldberg, Lewis R

2010-03-01

We investigated the psychometric properties of the Oregon Vocational Interest Scales (ORVIS), a brief public-domain alternative to commercial inventories, in a large community sample and in a college sample. In both samples, we examined the factor structure, scale intercorrelations, and personality correlates of the ORVIS, and in the community sample, we also examined the correlations of the ORVIS scales with cognitive abilities and with the scales of a longer, proprietary interest survey. In both samples, all 8 scales-Leadership, Organization, Altruism, Creativity, Analysis, Producing, Adventuring, and Erudition-showed wide variation in scores, high internal-consistency reliabilities, and a pattern of high convergent and low discriminant correlations with the scales of the proprietary interest survey. Overall, the results support the construct validity of the scales, which are recommended for use in research on vocational interests and other individual differences.
Domain-based small molecule binding site annotation

Directory of Open Access Journals (Sweden)

Dumontier Michel

2006-03-01

Full Text Available Abstract Background Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but small molecule binding site protein sequence annotation is sparse. The Small Molecule Interaction Database (SMID), a database of protein domain-small molecule interactions, was created using structural data from the Protein Data Bank (PDB). More importantly it provides a means to predict small molecule binding sites on proteins with a known or unknown structure and unlike prior approaches, removes large numbers of false positive hits arising from transitive alignment errors, non-biologically significant small molecules and crystallographic conditions that overpredict ion binding sites. Description Using a set of co-crystallized protein-small molecule structures as a starting point, SMID interactions were generated by identifying protein domains that bind to small molecules, using NCBI's Reverse Position Specific BLAST (RPS-BLAST) algorithm. SMID records are available for viewing at http://smid.blueprint.org. The SMID-BLAST tool provides accurate transitive annotation of small-molecule binding sites for proteins not found in the PDB. Given a protein sequence, SMID-BLAST identifies domains using RPS-BLAST and then lists potential small molecule ligands based on SMID records, as well as their aligned binding sites. A heuristic ligand score is calculated based on E-value, ligand residue identity and domain entropy to assign a level of confidence to hits found. SMID-BLAST predictions were validated against a set of 793 experimental small molecule interactions from the PDB, of which 472 (60%) of predicted interactions identically matched the experimental small molecule and of these, 344 had greater than 80% of the binding site residues correctly identified. Further, we estimate that 45% of predictions which were not observed in the PDB validation set may be true positives. Conclusion By
La apropiación del dominio público y las posibilidades de acceso a los bienes culturales | The appropriation of the public domain and the possibilities of access to cultural goods

Directory of Open Access Journals (Sweden)

Joan Ramos Toledano

2017-06-01

Full Text Available Abstract: Continental intellectual property and copyright legislation provides for a period of protection, granting exclusive and temporary economic rights. After a certain period, protected works enter what is called the public domain. This is often considered the moment at which cultural goods come under the control and domain of society as a whole. The present paper aims to argue that, given our current economic system, the public domain actually functions more as a business opportunity for certain companies than as a real option for the public to access artistic and intellectual works.
A domain-based approach to predict protein-protein interactions

Directory of Open Access Journals (Sweden)

Resat Haluk

2007-06-01

Full Text Available Abstract Background Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for the understanding of biological processes at the whole cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses the information about the domain-domain interactions to predict the interactions between proteins. Results DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. The resulting domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion We envision the DomainGA as a first step of a multiple tier approach to constructing organism specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed
Update History of This Database - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available Update History of This Database. 2017/02/27: Arabidopsis Phenome Database English archive site is opened. Arabidopsis Phenome Database (http://jphenom...e.info/?page_id=95) is opened.
Construction of database server system for fuel thermo-physical properties

International Nuclear Information System (INIS)

Park, Chang Je; Kang, Kwon Ho; Song, Kee Chan

2003-12-01

Evaluating the various fuels used in nuclear reactors requires not only mechanical properties but also thermo-physical properties, which are among the most important inputs to a fuel performance code system. The main objective of this study is to build a database system for fuel thermo-physical properties; a PC-based hardware system has been constructed as a web-based server with visualization so that the public can use it easily. This report describes the hardware and software used in the database server system for nuclear fuel thermo-physical properties. Opening the fuel property database to the public is expected to make nuclear fuel data much easier to obtain and to support research and development of various fuels in the nuclear industry. Furthermore, the proposed models of nuclear fuel thermo-physical properties are expected to be used in the fuel performance code system
Evolution of the RNase P RNA structural domain in Leptospira spp

NARCIS (Netherlands)

Ravishankar, Vigneshwaran; Ahmed, Ahmed; Sivagnanam, Ulaganathan; Muthuraman, Krishnaraja; Karthikaichamy, Anbarasu; Wilson, Herald A.; Devendran, Ajay; Hartskeerl, Rudy A.; Raj, Stephen M. L.

2014-01-01

We have employed the RNase P RNA (RPR) gene, which is present as single copy in chromosome I of Leptospira spp. to investigate the phylogeny of structural domains present in the RNA subunit of the tRNA processing enzyme, RNase P. RPR gene sequences of 150 strains derived from NCBI database along
Inference Attacks and Control on Database Structures

Directory of Open Access Journals (Sweden)

Muhamed Turkanovic

2015-02-01

Full Text Available Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy arising from inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since access control models are unable to protect against indirect data access. Furthermore, it covers new inference problems which arise from the dimensions of new technologies like XML, semantics, etc.
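The claim that access control alone cannot prevent indirect data access can be illustrated with a classic differencing attack. The sketch below is not taken from the paper: the `staff` table, the names and salaries, and the `aggregate_only` gate are hypothetical, and the gate is a deliberately naive stand-in for an access control policy that permits only aggregate queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ana", "R&D", 5200), ("Ben", "R&D", 4800), ("Cai", "R&D", 6100)],
)


def aggregate_only(sql: str, params: tuple = ()) -> int:
    """Naive access control stand-in: only SUM queries are allowed through."""
    if not sql.lstrip().upper().startswith("SELECT SUM("):
        raise PermissionError("only aggregate queries are permitted")
    return conn.execute(sql, params).fetchone()[0]


# Both queries are individually permitted by the policy.
total = aggregate_only("SELECT SUM(salary) FROM staff WHERE dept = ?", ("R&D",))
without_cai = aggregate_only(
    "SELECT SUM(salary) FROM staff WHERE dept = ? AND name <> ?", ("R&D", "Cai")
)

# Their difference reveals one person's salary, which was never queried directly.
inferred_salary = total - without_cai
print(inferred_salary)
```

Inference control would have to reason about combinations of permitted queries, for example by refusing query sets whose results differ by a single record. A per-query access check cannot express that.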
Molecular signatures database (MSigDB) 3.0.

Science.gov (United States)

Liberzon, Arthur; Subramanian, Aravind; Pinchback, Reid; Thorvaldsdóttir, Helga; Tamayo, Pablo; Mesirov, Jill P

2011-06-15

Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.

DETERMINING WEBSITE FEATURES FOR TOURISM AND CULTURE USING THE FEATURE-ORIENTED DOMAIN ANALYSIS (FODA) METHOD

Directory of Open Access Journals (Sweden)

Muhammad Iqbal

2016-10-01

Determining features when building a website for tourism and culture is needed in order to know which features can be implemented. To help determine these features, domain analysis with the Feature-Oriented Domain Analysis (FODA) method is used. The method proceeds in stages, starting with an application review of three websites taken as samples from which features are drawn. Next comes the context analysis stage, which yields a structure diagram and a context diagram. Then comes the domain modelling stage, which is divided into two steps. The first is feature analysis, which obtains the features of the web application through a feature diagram, explained by means of a domain terminology dictionary. The second is entity-relationship modelling, in which an entity-relationship diagram is created for building the database. Finally, architecture modelling creates a domain architecture for application development that focuses only on features. The feature analysis yielded 38 mandatory features, meaning features that must be implemented in a web application for tourism and culture. Keywords: Tourism, Culture, Website, Features, Feature-Oriented Domain Analysis
Selected ICAR Data from the SAPA-Project: Development and Initial Validation of a Public-Domain Measure

Directory of Open Access Journals (Sweden)

David M. Condon

2016-01-01

Full Text Available These data were collected during the initial evaluation of the International Cognitive Ability Resource (ICAR) project. ICAR is an international collaborative effort to develop open-source public-domain tools for cognitive ability assessment, including tools that can be administered in non-proctored environments (e.g., online administration) and those which are based on automatic item generation algorithms. These data provide initial validation of the first four ICAR item types as reported in Condon & Revelle [1]. The 4 item types contain a total of 60 items: 9 Letter and Number Series items, 11 Matrix Reasoning items, 16 Verbal Reasoning items and 24 Three-dimensional Rotation items. Approximately 97,000 individuals were administered random subsets of these 60 items using the Synthetic Aperture Personality Assessment method between August 18, 2010 and May 20, 2013. The data are available in rdata and csv formats and are accompanied by documentation stored as a text file. Re-use potential includes a wide range of structural and item-level analyses.
World-wide ocean optics database WOOD (NODC Accession 0092528)

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — WOOD was developed to be a comprehensive publicly-available oceanographic bio-optical database providing global coverage. It includes nearly 250 major data sources...
SolCyc: a database hub at the Sol Genomics Network (SGN) for the manual curation of metabolic networks in Solanum and Nicotiana specific databases

Science.gov (United States)

Foerster, Hartmut; Bombarely, Aureliano; Battey, James N D; Sierro, Nicolas; Ivanov, Nikolai V; Mueller, Lukas A

2018-01-01

Abstract SolCyc is the entry portal to pathway/genome databases (PGDBs) for major species of the Solanaceae family hosted at the Sol Genomics Network. Currently, SolCyc comprises six organism-specific PGDBs for tomato, potato, pepper, petunia, tobacco and one Rubiaceae, coffee. The metabolic networks of those PGDBs have been computationally predicted by the PathoLogic component of the Pathway Tools software using the manually curated multi-domain database MetaCyc (http://www.metacyc.org/) as reference. SolCyc has been recently extended by taxon-specific databases, i.e. the family-specific SolanaCyc database, containing only curated data pertinent to species of the nightshade family, and NicotianaCyc, a genus-specific database that stores all relevant metabolic data of the Nicotiana genus. Through manual curation of the published literature, new metabolic pathways have been created in those databases, which are complemented by the continuously updated, relevant species-specific pathways from MetaCyc. At present, SolanaCyc comprises 199 pathways and 29 superpathways and NicotianaCyc accounts for 72 pathways and 13 superpathways. Curator-maintained, taxon-specific databases such as SolanaCyc and NicotianaCyc are enriched in data specific to these taxa and free of falsely predicted pathways. Both databases have been used to update recently created Nicotiana-specific databases for Nicotiana tabacum, Nicotiana benthamiana, Nicotiana sylvestris and Nicotiana tomentosiformis by propagating verifiable data into those PGDBs. In addition, in-depth curation of the pathways in N. tabacum has been carried out which resulted in the elimination of 156 pathways from the 569 pathways predicted by Pathway Tools. Together, in-depth curation of the predicted pathway network and the supplementation with curated data from taxon-specific databases have substantially improved the curation status of the species-specific N. tabacum PGDB. The implementation of this
BioCarian: search engine for exploratory searches in heterogeneous biological databases.

Science.gov (United States)

Zaki, Nazar; Tennakoon, Chandana

2017-10-02

There are a large number of biological databases publicly available to scientists on the web. There are also many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and has additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search
Refactoring databases: evolutionary database design

CERN Document Server

Ambler, Scott W

2006-01-01

Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems. Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies. This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone databas...
Mapping the Iranian Research Literature in the Field of Traditional Medicine in Scopus Database 2010-2014.

Science.gov (United States)

GhaedAmini, Hossein; Okhovati, Maryam; Zare, Morteza; Saghafi, Zahra; Bazrafshan, Azam; GhaedAmini, Alireza; GhaedAmini, Mohammadreza

2016-05-01

The aim of this study was to provide an overview of Iranian research and collaboration in the field of traditional medicine during 2010-2014. This is a bibliometric study using the Scopus database as the data source, with affiliation addresses relevant to traditional medicine and Iran as the search strategy. Subject and geographical overlay maps were also applied to visualize the network activities of the Iranian authors. Highly cited articles (citations >10) were further explored to highlight the impact of research domains more specifically. About 3,683 articles were published by Iranian authors in the Scopus database. The compound annual growth rate of Iranian publications was 0.14% during 2010-2014. Tehran University of Medical Sciences (932 articles), Shiraz University of Medical Sciences (404 articles) and Tabriz Islamic Medical University (391 articles) were the leading institutions in the field of traditional medicine. Medicinal plants (72%), digestive system diseases (21%), basics of traditional medicine (13%) and mental disorders (8%) were the major research topics. The United States (7%), the Netherlands (3%), and Canada (2.6%) were the most important collaborators of Iranian authors. Iranian research efforts in the field of traditional medicine have increased slightly over recent years. Yet joint multi-disciplinary collaborations are needed to cover inadequately described areas of traditional medicine in the country.
Update History of This Database - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available SKIP Stemcell Database: Update History of This Database. 2017/03/13: SKIP Stemcell Database... English archive site is opened. 2013/03/29: SKIP Stemcell Database ( https://www.skip.med.k...eio.ac.jp/SKIPSearch/top?lang=en ) is opened.
The effects of public health policies on population health and health inequalities in European welfare states: protocol for an umbrella review.

Science.gov (United States)

Thomson, Katie; Bambra, Clare; McNamara, Courtney; Huijts, Tim; Todd, Adam

2016-04-08

The welfare state is potentially an important macro-level determinant of health that also moderates the extent, and impact, of socio-economic inequalities in exposure to the social determinants of health. The welfare state has three main policy domains: health care, social policy (e.g. social transfers and education) and public health policy. This is the protocol for an umbrella review to examine the latter; its aim is to assess how European welfare states influence the social determinants of health inequalities institutionally through public health policies. A systematic review methodology will be used to identify systematic reviews from high-income countries (including additional EU-28 members) that describe the health and health equity effects of upstream public health interventions. Interventions will focus on primary and secondary prevention policies including fiscal measures, regulation, education, preventative treatment and screening across ten public health domains (tobacco; alcohol; food and nutrition; reproductive health services; the control of infectious diseases; screening; mental health; road traffic injuries; air, land and water pollution; and workplace regulations). Twenty databases will be searched using a pre-determined search strategy to evaluate population-level public health interventions. Understanding the impact of specific public health policy interventions will help to establish causality in terms of the effects of welfare states on population health and health inequalities. The review will document contextual information on how population-level public health interventions are organised, implemented and delivered. This information can be used to identify effective interventions that could be implemented to reduce health inequalities between and within European countries. PROSPERO CRD42016025283.
DOE's Public Database for Green Building Case Studies: Preprint

Energy Technology Data Exchange (ETDEWEB)

Torcellini, P. A.; Crawley, D. B.

2003-11-01

To help capture valuable information on "green building" case studies, the U.S. Department of Energy has created an online database for collecting, standardizing, and disseminating information about high-performance, green projects. The information collected includes green features, design processes, energy performance, and comparisons to other high-performance, green buildings.
The National Land Cover Database

Science.gov (United States)

Homer, Collin G.; Fry, Joyce A.; Barnes, Christopher A.

2012-01-01

The National Land Cover Database (NLCD) serves as the definitive Landsat-based, 30-meter resolution, land cover database for the Nation. NLCD provides spatial reference and descriptive data for characteristics of the land surface such as thematic class (for example, urban, agriculture, and forest), percent impervious surface, and percent tree canopy cover. NLCD supports a wide variety of Federal, State, local, and nongovernmental applications that seek to assess ecosystem status and health, understand the spatial patterns of biodiversity, predict effects of climate change, and develop land management policy. NLCD products are created by the Multi-Resolution Land Characteristics (MRLC) Consortium, a partnership of Federal agencies led by the U.S. Geological Survey. All NLCD data products are available for download at no charge to the public from the MRLC Web site: http://www.mrlc.gov.
CRIMINAL LAW PROTECTION OF DATABASE AT A GLANCE

Directory of Open Access Journals (Sweden)

LUCIAN T. POENARU

2012-05-01

Full Text Available Database protection is provided in Romania by the general law on copyright no. 8/1996. According to the law, it is a crime to make available to the public, by any means, the special rights attributed to database owners or copies thereof. This paper will focus, on the one hand, on presenting the way databases and database-related products can be subject to general copyright protection and, on the other, on revealing the special sui generis right attributed to database owners. In such a context, criminal instruments for protecting such rights seem to be quite annoying for the perpetrator, but less effective when it comes to proper enforcement by the criminal bodies. This paper will therefore try to compare the way guilty actions of the culprit are effectively sanctioned by the criminal instruments provided by the law. And because the Romanian law on copyright does follow at least the letter of the European Directives on copyright and the protection of databases, this paper will also search the spirit of the relevant European case-law and its applicability by the Romanian authorities.
Rosette Assay: Highly Customizable Dot-Blot for SH2 Domain Screening.

Science.gov (United States)

Ng, Khong Y; Machida, Kazuya

2017-01-01

With a growing number of high-throughput studies, structural analyses, and availability of protein-protein interaction databases, it is now possible to apply web-based prediction tools to SH2 domain-interactions. However, in silico prediction is not always reliable and requires experimental validation. Rosette assay is a dot blot-based reverse-phase assay developed for the assessment of binding between SH2 domains and their ligands. It is conveniently customizable, allowing for low- to high-throughput analysis of interactions between various numbers of SH2 domains and their ligands, e.g., short peptides, purified proteins, and cell lysates. The binding assay is performed in a 96-well plate (MBA or MWA apparatus) in which a sample spotted membrane is incubated with up to 96 labeled SH2 domains. Bound domains are detected and quantified using a chemiluminescence or near-infrared fluorescence (IR) imaging system. In this chapter, we describe a practical protocol for rosette assay to assess interactions between synthesized tyrosine phosphorylated peptides and a library of GST-tagged SH2 domains. Since the methodology is not confined to assessment of SH2-pTyr interactions, rosette assay can be broadly utilized for ligand and drug screening using different protein interaction domains or antibodies.
Database design and database administration for a kindergarten

OpenAIRE

Vítek, Daniel

2009-01-01

The bachelor thesis deals with the creation of a database design for a standard kindergarten, the installation of the designed database in the Oracle Database 10g Express Edition database system, and a demonstration of administration tasks in this database system. The database was verified using a purpose-built access application.
Analysis and Design of Web-Based Database Application for Culinary Community

Directory of Open Access Journals (Sweden)

Choirul Huda

2017-03-01

Full Text Available This research is motivated by the rapid development of the culinary field and of information technology. The difficulties of communicating with culinary experts and of documenting recipes make proper media support very important. Therefore, a web-based database application for the public is important to help the culinary community with communication, searching and recipe management. The aim of the research was to design a web-based database application that could be used as social media for the culinary community. This research used a literature review, user interviews, and questionnaires. Moreover, the database system development life cycle was used as a guide for designing the database, especially for conceptual database design, logical database design, and physical database design. The web-based application design used the eight golden rules of user interface design. The result of this research is a web-based database application that can fulfill the needs of users in the culinary field related to communication and recipe management.
Respiratory infections research in Afghanistan: bibliometric analysis with the database PubMed

International Nuclear Information System (INIS)

Pilsczek, F.H.

2015-01-01

Infectious diseases research in a low-income country like Afghanistan is important. Methods: In this study the internet-based database PubMed was used for bibliometric analysis of infectious diseases research activity. Publication entries in PubMed were analysed according to number of publications, topic, publication type, and country of investigators. Results: Between 2002 and 2011, 226 (77.7%) publications with the following research topics were identified: respiratory infections 3 (1.3%); parasites 8 (3.5%); diarrhoea 10 (4.4%); tuberculosis 10 (4.4%); human immunodeficiency virus (HIV) 11 (4.9%); multi-drug resistant bacteria (MDR) 18 (8.0%); polio 31 (13.7%); leishmania 31 (13.7%); malaria 46 (20.4%). From 2002 to 2011, 11 (4.9%) publications were basic science laboratory-based research studies, and 8 (3.5%) publications from Afghan institutions were identified. Conclusion: The internet-based database PubMed can be consulted to collect data for guidance of infectious diseases research activity of low-income countries. The presented data suggest that infectious diseases research in Afghanistan includes little respiratory infections research, few studies conducted by Afghan institutions, and limited laboratory-based research contributions. (author)
RESPIRATORY INFECTIONS RESEARCH IN AFGHANISTAN: BIBLIOMETRIC ANALYSIS WITH THE DATABASE PUBMED.

Science.gov (United States)

Pilsczek, Florian H

2015-01-01

Infectious diseases research in a low-income country like Afghanistan is important. In this study the internet-based database PubMed was used for bibliometric analysis of infectious diseases research activity. Publication entries in PubMed were analysed according to number of publications, topic, publication type, and country of investigators. Between 2002 and 2011, 226 (77.7%) publications with the following research topics were identified: respiratory infections 3 (1.3%); parasites 8 (3.5%); diarrhoea 10 (4.4%); tuberculosis 10 (4.4%); human immunodeficiency virus (HIV) 11 (4.9%); multi-drug resistant bacteria (MDR) 18 (8.0%); polio 31 (13.7%); leishmania 31 (13.7%); malaria 46 (20.4%). From 2002 to 2011, 11 (4.9%) publications were basic science laboratory-based research studies, and 8 (3.5%) publications from Afghan institutions were identified. In conclusion, the internet-based database PubMed can be consulted to collect data for guidance of infectious diseases research activity of low-income countries. The presented data suggest that infectious diseases research in Afghanistan includes little respiratory infections research, few studies conducted by Afghan institutions, and limited laboratory-based research contributions.
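The per-topic percentages in this record can be recomputed from the listed counts. The short Python sketch below assumes that every percentage is taken relative to the 226 identified publications; the abstract does not state the denominator, but this assumption reproduces the rounded figures it reports.

```python
# Counts per research topic as listed in the abstract (2002-2011).
TOTAL_PUBLICATIONS = 226  # assumed denominator for every percentage

topic_counts = {
    "respiratory infections": 3,
    "parasites": 8,
    "diarrhoea": 10,
    "tuberculosis": 10,
    "HIV": 11,
    "multi-drug resistant bacteria": 18,
    "polio": 31,
    "leishmania": 31,
    "malaria": 46,
}


def share(count, total=TOTAL_PUBLICATIONS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {topic: share(n) for topic, n in topic_counts.items()}
for topic, pct in shares.items():
    print(f"{topic}: {topic_counts[topic]} ({pct}%)")
```

The same helper reproduces the other two figures in the abstract: `share(11)` gives 4.9 for laboratory-based studies and `share(8)` gives 3.5 for publications from Afghan institutions.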
Exploring Human Cognition Using Large Image Databases.

Science.gov (United States)

Griffiths, Thomas L; Abbott, Joshua T; Hsu, Anne S

2016-07-01

Most cognitive psychology experiments evaluate models of human cognition using a relatively small, well-controlled set of stimuli. This approach stands in contrast to current work in neuroscience, perception, and computer vision, which have begun to focus on using large databases of natural images. We argue that natural images provide a powerful tool for characterizing the statistical environment in which people operate, for better evaluating psychological theories, and for bringing the insights of cognitive science closer to real applications. We discuss how some of the challenges of using natural images as stimuli in experiments can be addressed through increased sample sizes, using representations from computer vision, and developing new experimental methods. Finally, we illustrate these points by summarizing recent work using large image databases to explore questions about human cognition in four different domains: modeling subjective randomness, defining a quantitative measure of representativeness, identifying prior knowledge used in word learning, and determining the structure of natural categories. Copyright © 2016 Cognitive Science Society, Inc.
Database Description - Open TG-GATEs Pathological Image Database | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available Open TG-GATEs Pathological Image Database Database Description General information of database Database... name Open TG-GATEs Pathological Image Database Alternative name - DOI 10.18908/lsdba.nbdc00954-0...iomedical Innovation 7-6-8, Saito-asagi, Ibaraki-city, Osaka 567-0085, Japan TEL:81-72-641-9826 Email: Database... classification Toxicogenomics Database Organism Taxonomy Name: Rattus norvegi...
SWEETLEAD: an in silico database of approved drugs, regulated chemicals, and herbal isolates for computer-aided drug discovery.

Directory of Open Access Journals (Sweden)

Paul A Novick

Full Text Available In the face of drastically rising drug discovery costs, strategies promising to reduce development timelines and expenditures are being pursued. Computer-aided virtual screening and repurposing approved drugs are two such strategies that have shown recent success. Herein, we report the creation of a highly curated in silico database of chemical structures representing approved drugs, chemical isolates from traditional medicinal herbs, and regulated chemicals, termed the SWEETLEAD database. The motivation for SWEETLEAD stems from the observation of conflicting information in publicly available chemical databases and the lack of a highly curated database of chemical structures for the globally approved drugs. A consensus-building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. Resulting structures were filtered for the active pharmaceutical ingredient and standardized, and differing formulations of the same drug were combined in the final database. The publicly available release of SWEETLEAD (https://simtk.org/home/sweetlead) provides an important tool to enable the successful completion of computer-aided repurposing and drug discovery campaigns.

Data mining and visualization of the Alabama accident database

Science.gov (United States)

2000-08-01

The Alabama Department of Public Safety has developed and maintains a centralized database that contains traffic accident data collected from crash reports completed by local police officers and state troopers. The Critical Analysis Reporting Environme...
Medicago truncatula transporter database: a comprehensive database resource for M. truncatula transporters

Directory of Open Access Journals (Sweden)

Miao Zhenyan

2012-02-01

Full Text Available Abstract. Background: Medicago truncatula has been chosen as a model species for genomic studies. It is closely related to an important legume, alfalfa. Transporters are a large group of membrane-spanning proteins. They deliver essential nutrients, eject waste products, and assist the cell in sensing environmental conditions by forming a complex system of pumps and channels. Although studies have effectively characterized individual M. truncatula transporters in several databases, until now there has been no available systematic database that includes all transporters in M. truncatula. Description: The M. truncatula transporter database (MTDB) contains comprehensive information on the transporters in M. truncatula. Based on the TransportTP method, we have presented a novel prediction pipeline. A total of 3,665 putative transporters have been annotated based on International Medicago Genome Annotated Group (IMGAG) V3.5 V3 and the M. truncatula Gene Index (MTGI) V10.0 releases and assigned to 162 families according to the transporter classification system. These families were further classified into seven types according to their transport mode and energy coupling mechanism. Extensive annotations referring to each protein were generated, including basic protein function, expressed sequence tag (EST) mapping, genome locus, three-dimensional template prediction, transmembrane segment, and domain annotation. A chromosome distribution map and text-based Basic Local Alignment Search Tools were also created. In addition, we have provided a way to explore the expression of putative M. truncatula transporter genes under stress treatments. Conclusions: In summary, the MTDB enables the exploration and comparative analysis of putative transporters in M. truncatula. A user-friendly web interface and regular updates make MTDB valuable to researchers in related fields. The MTDB is freely available now to all users at http://bioinformatics.cau.edu.cn/MtTransporter/.
Privacy and Data-Based Research

OpenAIRE

Ori Heffetz; Katrina Ligett

2013-01-01

What can we, as users of microdata, formally guarantee to the individuals (or firms) in our dataset, regarding their privacy? We retell a few stories, well-known in data-privacy circles, of failed anonymization attempts in publicly released datasets. We then provide a mostly informal introduction to several ideas from the literature on differential privacy, an active literature in computer science that studies formal approaches to preserving the privacy of individuals in statistical databases...
Improving pairwise comparison of protein sequences with domain co-occurrence

Science.gov (United States)

Gascuel, Olivier

2018-01-01

Comparing and aligning protein sequences is an essential task in bioinformatics. More specifically, local alignment tools like BLAST are widely used for identifying conserved protein sub-sequences, which likely correspond to protein domains or functional motifs. However, to limit the number of false positives, these tools are used with stringent sequence-similarity thresholds and hence can miss several hits, especially for species that are phylogenetically distant from reference organisms. A solution to this problem is to integrate additional contextual information into the procedure. Here, we propose to use domain co-occurrence to increase the sensitivity of pairwise sequence comparisons. Domain co-occurrence is a strong feature of proteins, since most protein domains tend to appear with a limited number of other domains on the same protein. We propose a method to take this information into account in a typical BLAST analysis and to construct new domain families on the basis of these results. We used Plasmodium falciparum as a case study to evaluate our method. The experimental findings showed a 14% increase in the number of significant BLAST hits and a 25% increase in the proteome area that can be covered with a domain. Our method identified 2240 new domains for which, in most cases, no model of the Pfam database could be linked. Moreover, our study of the quality of the new domains in terms of alignment and physicochemical properties shows that they are close to those of standard Pfam domains. Source code of the proposed approach and supplementary data are available at: https://gite.lirmm.fr/menichelli/pairwise-comparison-with-cooccurrence. PMID: 29293498
e-MIR2: a public online inventory of medical informatics resources.

Science.gov (United States)

de la Calle, Guillermo; García-Remesal, Miguel; Nkumu-Mbomio, Nelida; Kulikowski, Casimir; Maojo, Victor

2012-08-02

Over the past years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for the Medical Informatics (MI) field, so that locating and accessing them currently remains a difficult and time-consuming task. We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. We define informatics resources as all those elements that constitute, serve to define or are used by informatics systems, ranging from architectures or development methodologies to terminologies, vocabularies, databases or tools. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources' names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different classification schemas by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the classification schemas. The classification algorithm identifies the categories associated with resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contains 609 resources at the time of writing and is available at http://www.gib.fi.upm.es/eMIR2. We are continuing to expand the number of available resources by taking into account further
e-MIR2: a public online inventory of medical informatics resources

Directory of Open Access Journals (Sweden)

de la Calle Guillermo

2012-08-01

Full Text Available Abstract. Background: Over the past years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for the Medical Informatics (MI) field, so that locating and accessing them currently remains a difficult and time-consuming task. Description: We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. We define informatics resources as all those elements that constitute, serve to define or are used by informatics systems, ranging from architectures or development methodologies to terminologies, vocabularies, databases or tools. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources’ names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different classification schemas by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the classification schemas. The classification algorithm identifies the categories associated with resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. Conclusions: We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contains 609 resources at the time of writing and is available at http://www.gib.fi.upm.es/eMIR2. We are continuing to expand the number
Ontology to relational database transformation for web application development and maintenance

Science.gov (United States)

Mahmudi, Kamal; Inggriani Liem, M. M.; Akbar, Saiful

2018-03-01

In a KMS (Knowledge Management System), an ontology is used for knowledge representation while a database is used to record facts. In most applications, data are managed in a database system, updated through the application, and then transformed into knowledge as needed. Once a domain conceptor defines the knowledge in the ontology, the application and database can be generated from the ontology. Most existing frameworks generate the application from its database. In this research, the ontology is used to generate the application. As the data are updated through the application, a mechanism is designed to trigger an update to the ontology so that the application can be rebuilt based on the newest ontology. With this approach, a knowledge engineer has full flexibility to renew the application based on the latest ontology without depending on a software developer. In many cases, the concept needs to be updated when the data change. The framework was built and tested in a Spring Java environment. A case study was conducted to prove the concept.
Comparative analysis of cloud cover databases for CORDEX-AFRICA

Science.gov (United States)

Enríquez, A.; Taima-Hernández, D.; González, A.; Pérez, J. C.; Díaz, J. P.; Expósito, F. J.

2012-04-01

The main objective of the CORDEX program (COordinated Regional climate Downscaling Experiment) [1] is the production of regional climate change scenarios at a global scale, as a contribution to the IPCC (Intergovernmental Panel on Climate Change) AR5 (5th Assessment Report). Within this project, Africa is the key region because of the current lack of data. In this study, the cloud cover information obtained from five well-known databases (ERA-40, ERA-Interim, ISCCP, NCEP and CRU) over the CORDEX-AFRICA domain is analyzed for the period 1984-2000 in order to determine the similarity between them. To analyze the accuracy and consistency of the climate databases, statistical techniques such as the correlation coefficient (r), root mean square (RMS) differences and a defined skill score (SS), based on the difference between the areas of the probability density functions (PDFs) associated with the study parameters [2], were applied. This determines which databases agree well in which regions and which do not, establishing an appropriate framework that could be used to validate the AR5 models in historical simulations.
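The three comparison statistics named in this abstract can be sketched in plain Python. The correlation coefficient and RMS difference follow their standard definitions. The skill score is an assumption: the abstract does not give its formula, so the sketch uses the common histogram-overlap form (the sum of bin-wise minima of the two empirical PDFs, which equals one minus half the summed absolute difference between them). The bin count, the [0, 1] cloud-fraction range, and the sample series are likewise hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rms_difference(x, y):
    """Root mean square of the point-wise differences between two series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def histogram_pdf(values, bins, low=0.0, high=1.0):
    """Empirical PDF as relative frequencies over equal-width bins."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / width), bins - 1)  # clamp v == high
        counts[index] += 1
    return [c / len(values) for c in counts]


def pdf_overlap_score(x, y, bins=10):
    """Assumed skill score: shared area of the two empirical PDFs
    (1.0 for identical distributions, 0.0 for disjoint ones)."""
    pdf_x, pdf_y = histogram_pdf(x, bins), histogram_pdf(y, bins)
    return sum(min(a, b) for a, b in zip(pdf_x, pdf_y))


# Hypothetical cloud-cover fractions from two databases at the same grid cell.
database_a = [0.15, 0.35, 0.55, 0.75, 0.95]
database_b = [0.25, 0.45, 0.65, 0.85, 0.95]

print(correlation(database_a, database_b))
print(rms_difference(database_a, database_b))
print(pdf_overlap_score(database_a, database_b))
```

Applied per region and per pair of databases, the three numbers give complementary views: r captures co-variation in time, the RMS difference captures absolute disagreement, and the overlap score compares the distributions regardless of timing.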
SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthrospira) platensis C1.

Science.gov (United States)

Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee

2015-07-29

Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web
Data Preparation Process for the Buildings Performance Database

Energy Technology Data Exchange (ETDEWEB)

Walter, Travis; Dunn, Laurel; Mercado, Andrea; Brown, Richard E.; Mathew, Paul

2014-06-30

The Buildings Performance Database (BPD) includes empirically measured data from a variety of data sources with varying degrees of data quality and data availability. The purpose of the data preparation process is to maintain data quality within the database and to ensure that all database entries have sufficient data for meaningful analysis and for the database API. Data preparation is a systematic process of mapping data into the Building Energy Data Exchange Specification (BEDES), cleansing data using a set of criteria and rules of thumb, and deriving values such as energy totals and dominant asset types. The data preparation process takes the most effort and time; therefore, most of the cleansing process has been automated. The process also needs to adapt as more data is contributed to the BPD and as building technologies change over time. The data preparation process is an essential step between data contributed by providers and data published to the public in the BPD.
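The three stages named in the abstract above (mapping, cleansing, deriving) can be sketched roughly as follows. This is a minimal illustration only: the field names, the mapping table and the cleansing rules are hypothetical and are not taken from BEDES or from the actual BPD pipeline.

```python
from typing import Optional

# Hypothetical mapping from one contributor's column names to standardized
# field names. These are NOT real BEDES terms, only placeholders.
FIELD_MAP = {
    "sqft": "floor_area",
    "elec_kwh": "electricity_kwh",
    "gas_kwh": "gas_kwh",
}


def map_fields(raw: dict) -> dict:
    """Stage 1: rename known fields and drop anything unmapped."""
    return {FIELD_MAP[k]: v for k, v in raw.items() if k in FIELD_MAP}


def cleanse(record: dict) -> Optional[dict]:
    """Stage 2: apply simple rule-of-thumb checks (assumed rules)."""
    area = record.get("floor_area")
    if area is None or area <= 0:
        return None  # a record without a usable floor area is rejected
    fuels = [record.get("electricity_kwh"), record.get("gas_kwh")]
    if all(v is None for v in fuels):
        return None  # no energy data at all
    if any(v is not None and v < 0 for v in fuels):
        return None  # negative consumption is treated as an error
    return record


def derive(record: dict) -> dict:
    """Stage 3: add a derived value, here the energy total."""
    total = sum(record.get(k) or 0.0 for k in ("electricity_kwh", "gas_kwh"))
    return {**record, "total_energy_kwh": total}


def prepare(raw_records: list) -> list:
    """Run all three stages and keep only records that pass cleansing."""
    prepared = []
    for raw in raw_records:
        cleaned = cleanse(map_fields(raw))
        if cleaned is not None:
            prepared.append(derive(cleaned))
    return prepared
```

Keeping each stage as a separate function makes it straightforward to add or change cleansing rules as new data sources are contributed.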
From idea to publication: Publication rates of theses in neurosurgery from Turkey.

Science.gov (United States)

Öğrenci, Ahmet; Ekşi, Murat Şakir; Özcan-Ekşi, Emel Ece; Koban, Orkun

2016-01-01

A thesis at the end of residency is considered a complementary component of postgraduate training. It helps residents learn how to ask structured questions, set up the most appropriate study design, conduct the study, retrieve study results and write conclusions with clinical implications. To the best of our knowledge, the publication rates of theses in the field of neurosurgery have not been reported before. The aim of this descriptive study was to determine the publication rates of theses in the neurosurgery specialty. The database of the Higher Education Council of Turkey, which includes the theses of residents in university hospitals only, was screened for the years 2004 to 2013. After retrieving the theses from the database, we used search engines to identify the theses published in any SCI/SCI-E-indexed journals, using the thesis titles and author names as search keywords. Data were presented in descriptive form as absolute numbers and percentages. We retrieved 164 theses written by former residents in neurosurgery using the database. Among the 164 theses, 18% (national journals: 9; international journals: 21) were published in SCI/SCI-E-indexed journals. Publication rates of theses in neurosurgery are low, as they are in the other specialties of medicine. Our study is descriptive and is intended to give an idea of the publication rates of theses in neurosurgery. Further studies are required to understand the underlying factors responsible for the limited success in publishing neurosurgery theses. Copyright © 2015 Polish Neurological Society. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.
Expediting topology data gathering for the TOPDB database.

Science.gov (United States)

Dobson, László; Langó, Tamás; Reményi, István; Tusnády, Gábor E

2015-01-01

The Topology Data Bank of Transmembrane Proteins (TOPDB, http://topdb.enzim.ttk.mta.hu) contains experimentally determined topology data of transmembrane proteins. Recently, we have updated TOPDB from several sources and utilized a newly developed topology prediction algorithm to determine the most reliable topology using the results of experiments as constraints. In addition to collecting the experimentally determined topology data published in the last couple of years, we gathered topographies defined by the TMDET algorithm using 3D structures from the PDBTM. Results of global topology analysis of various organisms, as well as topology data generated by high-throughput techniques, such as the sequential positions of N- or O-glycosylations, were incorporated into the TOPDB database. Moreover, a new algorithm was developed to integrate scattered topology data from various publicly available databases, and a new method was introduced to measure the reliability of predicted topologies. We show that reliability values correlate highly with the per-protein topology accuracy of the utilized prediction method. Altogether, more than 52,000 new topology data and more than 2600 new transmembrane proteins have been collected since the last public release of the TOPDB database. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Update History of This Database - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Update History of This Database Date Update contents 201...0/03/29 Yeast Interacting Proteins Database English archive site is opened. 2000/12/4 Yeast Interacting Proteins Database...( http://itolab.cb.k.u-tokyo.ac.jp/Y2H/ ) is released. About This Database Database Description... Download License Update History of This Database Site Policy | Contact Us Update History of This Database... - Yeast Interacting Proteins Database | LSDB Archive ...
Harvesting Covert Networks: The Case Study of the iMiner Database

DEFF Research Database (Denmark)

Memon, Nasrullah; Wiil, Uffe Kock; Alhajj, Reda

2011-01-01

was incorporated in the iMiner prototype tool, which makes use of investigative data mining techniques to analyse data. This paper will present the developed framework along with the form and structure of the terrorist data in the database. Selected cases will be referenced to highlight the effectiveness of the i...... collected by intelligence agencies and government organisations is inaccessible to researchers. To counter the information scarcity, we designed and built a database of terrorist-related data and information by harvesting such data from publicly available authenticated websites. The database...
BioModels Database: a repository of mathematical models of biological processes.

Science.gov (United States)

Chelliah, Vijayalakshmi; Laibe, Camille; Le Novère, Nicolas

2013-01-01

BioModels Database is a public online resource that allows storing and sharing of published, peer-reviewed quantitative, dynamic models of biological processes. The model components and behaviour are thoroughly checked to correspond to the original publication and manually curated to ensure reliability. Furthermore, the model elements are annotated with terms from controlled vocabularies as well as linked to relevant external data resources. This greatly helps in model interpretation and reuse. Models are stored in SBML format, accepted in SBML and CellML formats, and are available for download in various other common formats such as BioPAX, Octave, SciLab, VCML, XPP and PDF, in addition to SBML. The reaction network diagram of the models is also available in several formats. BioModels Database features a search engine, which provides simple and more advanced searches. Features such as online simulation and creation of smaller models (submodels) from the selected model elements of a larger one are provided. BioModels Database can be accessed both via a web interface and programmatically via web services. New models are made available in BioModels Database at regular releases, about every 4 months.
Defining new criteria for selection of cell-based intestinal models using publicly available databases

Directory of Open Access Journals (Sweden)

Christensen Jon

2012-06-01

Full Text Available Abstract Background: The criteria for choosing relevant cell lines among a vast panel of available intestinal-derived lines exhibiting a wide range of functional properties are still ill-defined. The objective of this study was, therefore, to establish objective criteria for choosing relevant cell lines to assess their appropriateness as tumor models as well as for drug absorption studies. Results: We made use of publicly available expression signatures and cell-based functional assays to delineate differences between various intestinal colon carcinoma cell lines and normal intestinal epithelium. We compared a panel of intestinal cell lines with patient-derived normal and tumor epithelium and classified them according to traits relating to oncogenic pathway activity, epithelial-mesenchymal transition (EMT) and stemness, migratory properties, proliferative activity, transporter expression profiles and chemosensitivity. For example, SW480 cells represent an EMT-high, migratory phenotype and scored highest in terms of signatures associated with worse overall survival and higher risk of recurrence based on patient-derived databases. On the other hand, differentiated HT29 and T84 cells showed gene expression patterns closest to tumor bulk derived cells. Regarding drug absorption, we confirmed that differentiated Caco-2 cells are the model of choice for active uptake studies in the small intestine. Regarding chemosensitivity, we were unable to confirm a recently proposed association of chemo-resistance with EMT traits. However, a novel signature was identified through mining of NCI60 GI50 values that allowed the panel of intestinal cell lines to be ranked according to their responsiveness to commonly used chemotherapeutics. Conclusions: This study presents a straightforward strategy to exploit publicly available gene expression data to guide the choice of cell-based models. While this approach does not overcome the major limitations of such models
Development of a GIS Snowstorm Database

Science.gov (United States)

Squires, M. F.

2010-12-01

This paper describes the development of a GIS Snowstorm Database (GSDB) at NOAA’s National Climatic Data Center. The snowstorm database is a collection of GIS layers and tabular information for 471 snowstorms between 1900 and 2010. Each snowstorm has undergone automated and manual quality control. The beginning and ending date of each snowstorm is specified. The original purpose of this data was to serve as input for NCDC’s new Regional Snowfall Impact Scale (ReSIS). However, this data is being preserved and used to investigate the impacts of snowstorms on society. GSDB is used to summarize the impact of snowstorms on transportation (interstates) and various classes of facilities (roads, schools, hospitals, etc.). GSDB can also be linked to other sources of impacts such as insurance loss information and Storm Data. Thus the snowstorm database is suited for many different types of users including the general public, decision makers, and researchers. This paper summarizes quality control issues associated with using snowfall data, methods used to identify the starting and ending dates of a storm, and examples of the tables that combine snowfall and societal data.
Database Description - RMOS | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name RMOS Alternative nam...arch Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Microarray Data and other Gene Expression Database...s Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The Ric...19&lang=en Whole data download - Referenced database Rice Expression Database (RED) Rice full-length cDNA Database... (KOME) Rice Genome Integrated Map Database (INE) Rice Mutant Panel Database (Tos17) Rice Genome Annotation Database
The BDNYC database of low-mass stars, brown dwarfs, and planetary mass companions

Science.gov (United States)

Cruz, Kelle; Rodriguez, David; Filippazzo, Joseph; Gonzales, Eileen; Faherty, Jacqueline K.; Rice, Emily; BDNYC

2018-01-01

We present a web-interface to a database of low-mass stars, brown dwarfs, and planetary mass companions. Users can send SELECT SQL queries to the database, perform searches by coordinates or name, check the database inventory on specified objects, and even plot spectra interactively. The initial version of this database contains information for 198 objects and version 2 will contain over 1000 objects. The database currently includes photometric data from 2MASS, WISE, and Spitzer and version 2 will include a significant portion of the publicly available optical and NIR spectra for brown dwarfs. The database is maintained and curated by the BDNYC research group and we welcome contributions from other researchers via GitHub.
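As a rough illustration of the two access patterns mentioned above (free-form SELECT queries and searches by coordinates), the sketch below uses an in-memory SQLite database. The table layout, column names and all values are invented placeholders; they are assumptions for illustration and do not reflect the real BDNYC schema or data.

```python
import sqlite3


def build_toy_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a made-up schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sources (id INTEGER PRIMARY KEY, name TEXT,
                              ra_deg REAL, dec_deg REAL);
        CREATE TABLE photometry (source_id INTEGER, band TEXT, magnitude REAL);
        INSERT INTO sources VALUES (1, 'toy-object-A', 10.0, -5.0);
        INSERT INTO sources VALUES (2, 'toy-object-B', 200.0, 30.0);
        INSERT INTO photometry VALUES (1, 'J', 15.2);
        INSERT INTO photometry VALUES (1, 'W1', 13.9);
        INSERT INTO photometry VALUES (2, 'J', 16.8);
        """
    )
    return conn


def run_select(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Run a user query, accepting SELECT statements only."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT queries are accepted")
    return conn.execute(sql, params).fetchall()


def search_by_coordinates(conn, ra_deg: float, dec_deg: float,
                          half_width: float) -> list:
    """Simple box search around a position (ignores spherical geometry)."""
    return run_select(
        conn,
        "SELECT name FROM sources "
        "WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ? "
        "ORDER BY name",
        (ra_deg - half_width, ra_deg + half_width,
         dec_deg - half_width, dec_deg + half_width),
    )
```

The prefix check in `run_select` is only a toy guard; a production service would more likely rely on a read-only database connection.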
KALIMER database development (database configuration and design methodology)

International Nuclear Information System (INIS)

Jeong, Kwan Seong; Kwon, Young Min; Lee, Young Bum; Chang, Won Pyo; Hahn, Do Hee

2001-10-01

KALIMER Database is an advanced database for integrated management of Liquid Metal Reactor Design Technology Development using Web applications. The KALIMER Design database consists of a Results Database, Inter-Office Communication (IOC), a 3D CAD database, a Team Cooperation System, and Reserved Documents. The Results Database holds the research results from phase II of Liquid Metal Reactor Design Technology Development under the mid-term and long-term nuclear R and D program. IOC is a linkage control system between sub-projects for sharing and integrating the research results for KALIMER. The 3D CAD Database is a schematic design overview for KALIMER. The Team Cooperation System informs team members of research cooperation and meetings. Finally, KALIMER Reserved Documents was developed to manage collected data and documents after project completion. This report describes the features of the hardware and software and the database design methodology for KALIMER

Post-processing of Deep Web Information Extraction Based on Domain Ontology

Directory of Open Access Journals (Sweden)

PENG, T.

2013-11-01

Full Text Available Many methods are utilized to extract and process query results in the deep Web, which rely on the different structures of Web pages and various designing modes of databases. However, some semantic meanings and relations are ignored. So, in this paper, we present an approach for post-processing deep Web query results based on domain ontology which can utilize the semantic meanings and relations. A block identification model (BIM) based on node similarity is defined to extract data blocks that are relevant to a specific domain after reducing noisy nodes. The feature vector of domain books is obtained by a result set extraction model (RSEM) based on the vector space model (VSM). RSEM, in combination with BIM, builds the domain ontology on books, which can not only remove the limit of Web page structures when extracting data information, but also make use of the semantic meanings of the domain ontology. After extracting basic information from Web pages, a ranking algorithm is adopted to offer an ordered list of data records to users. Experimental results show that BIM and RSEM extract data blocks and build the domain ontology accurately. In addition, relevant data records and basic information are extracted and ranked. The precision and recall results show that our proposed method is feasible and efficient.
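To make the vector space model idea concrete, the sketch below ranks extracted records by cosine similarity to a domain term vector. It is a generic VSM illustration under simple assumptions (whitespace tokenization, raw term counts); it is not the BIM or RSEM algorithm from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Represent a text as a bag of lower-cased, whitespace-split terms."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_records(domain_terms: str, records: list) -> list:
    """Order records from most to least similar to the domain vector."""
    domain_vec = term_vector(domain_terms)
    scored = [(cosine_similarity(domain_vec, term_vector(r)), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]
```

For a book domain, a term vector such as `"title author publisher isbn price"` would score a record containing those labels higher than navigation text from the same page.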
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature.

Science.gov (United States)

McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry B F; Tipton, Keith F

2007-07-27

We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List.
HEND: A Database for High Energy Nuclear Data

International Nuclear Information System (INIS)

Brown, D; Vogt, R

2007-01-01

We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. The database will be searchable and cross-indexed with relevant publications, including published detector descriptions. It should eventually contain all published data from older heavy-ion programs such as the Bevalac, AGS, SPS and FNAL fixed-target programs, as well as published data from current programs at RHIC and new facilities at GSI (FAIR), KEK/Tsukuba and the LHC collider. This data includes all proton-proton, proton-nucleus to nucleus-nucleus collisions as well as other relevant systems and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of experiments. To enhance the utility of the database, we propose periodic data evaluations and topical reviews. These reviews would provide an alternative and impartial mechanism to resolve discrepancies between published data from rival experiments and between theory and experiment. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support
Database Description - SAHG | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name SAHG Alternative nam...h: Contact address Chie Motono Tel : +81-3-3599-8067 E-mail : Database classification Structure Databases - ...e databases - Protein properties Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description... Links: Original website information Database maintenance site The Molecular Profiling Research Center for D...stration Not available About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Database Description - SAHG | LSDB Archive ...
Response to a widespread, unauthorized dispersal of radioactive waste in the public domain

International Nuclear Information System (INIS)

Wenslawski, F.A.; North, H.S.

1979-01-01

In March 1976 State of Nevada radiological health officials became aware that radioactive items destined for disposal at a radioactive waste burial facility near Beatty, Nevada had instead been distributed to wide segments of the public domain. Because the facility was jointly licensed by the State of Nevada and the Federal Nuclear Regulatory Commission, both agencies quickly responded. It was learned that over a period of several years a practice existed at the disposal facility of opening containers, removing contents and allowing employees to take items of worth or fancy. Numerous items such as hand tools, electric motors, laboratory instruments, shipping containers, etc., had received widespread and uncontrolled distribution in the town of Beatty as well as lesser distributions to other locations. Because the situation might have had the potential for a significant health and safety impact, a comprehensive recovery operation was conducted. During the course of seven days of intense effort, thirty-five individuals became involved in a comprehensive door by door survey and search of the town. Aerial surveys were performed using a helicopter equipped with sensitive radiation detectors, while ground level scans were conducted using a van containing similar instrumentation. Aerial reconnaissance photographs were taken, a special town meeting was held and numerous persons were interviewed. The recovery effort resulted in a retrieval of an estimated 20 to 25 pickup truck loads of radioactively contaminated equipment as well as several loads of large items returned on a 40-foot flatbed trailer
Use of media and public-domain Internet sources for detection and assessment of plant health threats.

Science.gov (United States)

Thomas, Carla S; Nelson, Noele P; Jahn, Gary C; Niu, Tianchan; Hartley, David M

2011-09-05

Event-based biosurveillance is a recognized approach to early warning and situational awareness of emerging health threats. In this study, we build upon previous human and animal health work to develop a new approach to plant pest and pathogen surveillance. We show that monitoring public domain electronic media for indications and warning of epidemics and associated social disruption can provide information about the emergence and progression of plant pest infestation or disease outbreak. The approach is illustrated using a case study, which describes a plant pest and pathogen epidemic in China and Vietnam from February 2006 to December 2007, and the role of ducks in contributing to zoonotic virus spread in birds and humans. This approach could be used as a complementary method to traditional plant pest and pathogen surveillance to aid global and national plant protection officials and political leaders in early detection and timely response to significant biological threats to plant health, economic vitality, and social stability. This study documents the inter-relatedness of health in human, animal, and plant populations and emphasizes the importance of plant health surveillance.
Nuclear Criticality Information System. Database examples

Energy Technology Data Exchange (ETDEWEB)

Foret, C.A.

1984-06-01

The purpose of this publication is to provide our users with a guide to using the Nuclear Criticality Information System (NCIS). It is comprised of an introduction, an information and resources section, a how-to-use section, and several useful appendices. The main objective of this report is to present a clear picture of the NCIS project and its available resources as well as assisting our users in accessing the database and using the TIS computer to process data. The introduction gives a brief description of the NCIS project, the Technology Information System (TIS), online user information, future plans and lists individuals to contact for additional information about the NCIS project. The information and resources section outlines the NCIS database and describes the resources that are available. The how-to-use section illustrates access to the NCIS database as well as searching datafiles for general or specific data. It also shows how to access and read the NCIS news section as well as connecting to other information centers through the TIS computer.
Nuclear Criticality Information System. Database examples

International Nuclear Information System (INIS)

Foret, C.A.

1984-06-01

The purpose of this publication is to provide our users with a guide to using the Nuclear Criticality Information System (NCIS). It is comprised of an introduction, an information and resources section, a how-to-use section, and several useful appendices. The main objective of this report is to present a clear picture of the NCIS project and its available resources as well as assisting our users in accessing the database and using the TIS computer to process data. The introduction gives a brief description of the NCIS project, the Technology Information System (TIS), online user information, future plans and lists individuals to contact for additional information about the NCIS project. The information and resources section outlines the NCIS database and describes the resources that are available. The how-to-use section illustrates access to the NCIS database as well as searching datafiles for general or specific data. It also shows how to access and read the NCIS news section as well as connecting to other information centers through the TIS computer
A feature dictionary supporting a multi-domain medical knowledge base.

Science.gov (United States)

Naeymi-Rad, F

1989-01-01

Because different terminology is used by physicians of different specialties in different locations to refer to the same feature (signs, symptoms, test results), it is essential that our knowledge development tools provide a means to access a common pool of terms. This paper discusses the design of an online medical dictionary that provides a solution to this problem for developers of multi-domain knowledge bases for MEDAS (Medical Emergency Decision Assistance System). Our Feature Dictionary supports phrase equivalents for features, feature interactions, feature classifications, and translations to the binary features generated by the expert during knowledge creation. It is also used in the conversion of a domain knowledge to the database used by the MEDAS inference diagnostic sessions. The Feature Dictionary also provides capabilities for complex queries across multiple domains using the supported relations. The Feature Dictionary supports three methods for feature representation: (1) for binary features, (2) for continuous valued features, and (3) for derived features.
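One way to picture such a dictionary is sketched below: phrase equivalents resolve to a canonical feature, and each feature is binary, continuous or derived from other features. The class names and the example clinical terms are hypothetical illustrations and are not drawn from MEDAS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Feature:
    name: str
    kind: str  # one of "binary", "continuous", "derived"
    derive: Optional[Callable[[Dict[str, float]], float]] = None


class FeatureDictionary:
    """Maps phrase equivalents to canonical features (illustrative only)."""

    def __init__(self) -> None:
        self._features: Dict[str, Feature] = {}
        self._phrases: Dict[str, str] = {}

    def add(self, feature: Feature, equivalents: tuple = ()) -> None:
        self._features[feature.name] = feature
        for phrase in (feature.name, *equivalents):
            self._phrases[phrase.lower()] = feature.name

    def lookup(self, phrase: str) -> Optional[Feature]:
        """Resolve any known phrase to its canonical feature."""
        canonical = self._phrases.get(phrase.lower())
        return self._features[canonical] if canonical else None

    def evaluate(self, name: str, observations: Dict[str, float]) -> float:
        """Return an observed value, or compute it for a derived feature."""
        feature = self._features[name]
        if feature.kind == "derived":
            return feature.derive(observations)
        return observations[name]


def build_example() -> FeatureDictionary:
    """Small example dictionary with one feature of each kind."""
    fd = FeatureDictionary()
    fd.add(Feature("dyspnea", "binary"), ("shortness of breath", "SOB"))
    fd.add(Feature("systolic_bp", "continuous"), ("systolic pressure",))
    fd.add(Feature("diastolic_bp", "continuous"))
    fd.add(Feature("pulse_pressure", "derived",
                   lambda obs: obs["systolic_bp"] - obs["diastolic_bp"]))
    return fd
```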
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

Science.gov (United States)

Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

2016-01-01

Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf.
The global rock art database: developing a rock art reference model for the RADB system using the CIDOC CRM and Australian heritage examples

Science.gov (United States)

Haubt, R. A.

2015-08-01

The Rock Art Database (RADB) is a virtual organisation that aims to build a global rock art community. It brings together rock art enthusiasts and professionals from around the world in one centralized location through the deployed publicly available RADB Management System. This online platform allows users to share, manage and discuss rock art information and offers a new look at rock art data through the use of new technologies in rich media formats. Full access to the growing platform is currently only available for a selected group of users but it already links over 200 rock art projects around the globe. This paper forms a part of the larger Rock Art Database (RADB) project. It discusses the design stage of the RADB System and the development of a conceptual RADB Reference Model (RARM) that is used to inform the design of the Rock Art Database Management System. It examines the success and failure of international and national systems and uses the Australian heritage sector and Australian rock art as a test model to develop a method for the RADB System design. The system aims to help improve rock art management by introducing the CIDOC CRM in conjunction with a rock art specific domain model. It seeks to improve data compatibility and data sharing to help with the integration of a variety of resources to create the global Rock Art Database Management System.
A NoSQL–SQL Hybrid Organization and Management Approach for Real-Time Geospatial Data: A Case Study of Public Security Video Surveillance

Directory of Open Access Journals (Sweden)

Chen Wu

2017-01-01

Full Text Available With the widespread deployment of ground, air and space sensor sources (internet of things or IoT, social networks, sensor networks), the integrated applications of real-time geospatial data from ubiquitous sensors, especially in public security and smart city domains, are becoming challenging issues. The traditional geographic information system (GIS) mostly manages time-discretized geospatial data by means of the Structured Query Language (SQL) database management system (DBMS) and emphasizes query and retrieval of massive historical geospatial data on disk. This limits its capability for on-the-fly access of real-time geospatial data for online analysis in real time. This paper proposes a hybrid database organization and management approach with SQL relational databases (RDB) and not only SQL (NoSQL) databases (including the main memory database, MMDB, and distributed file system, DFS). This hybrid approach makes full use of the advantages of NoSQL and SQL DBMS for the real-time access of input data and structured on-the-fly analysis results, which can meet the requirements of increased spatio-temporal big data linking analysis. The MMDB facilitates real-time access of the latest input data such as the sensor web and IoT, and supports the real-time query for online geospatial analysis. The RDB stores change information such as multi-modal features and abnormal events extracted from real-time input data. The DFS on disk manages the massive geospatial data, and the extensible storage architecture and distributed scheduling of a NoSQL database satisfy the performance requirements of incremental storage and multi-user concurrent access. A case study of geographic video (GeoVideo) surveillance of public security is presented to prove the feasibility of this hybrid organization and management approach.
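The division of labour described above can be sketched in a few lines. In this toy version a plain dictionary stands in for the MMDB, an in-memory SQLite table for the RDB, and an append-only list per sensor for the DFS; the class name, the schema and the threshold-based "abnormal event" rule are all assumptions made for illustration, not the system from the paper.

```python
import sqlite3
from collections import defaultdict
from typing import Optional, Tuple


class HybridStore:
    """Toy three-tier store: latest values, extracted events, full archive."""

    def __init__(self) -> None:
        self.latest = {}                    # MMDB stand-in: newest reading per sensor
        self.archive = defaultdict(list)    # DFS stand-in: full append-only history
        self.rdb = sqlite3.connect(":memory:")  # RDB stand-in: structured events
        self.rdb.execute(
            "CREATE TABLE events (sensor_id TEXT, ts REAL, kind TEXT)"
        )

    def ingest(self, sensor_id: str, ts: float, value: float,
               threshold: float = 100.0) -> None:
        """Route one reading to all tiers; record an event if it looks abnormal."""
        self.latest[sensor_id] = (ts, value)
        self.archive[sensor_id].append((ts, value))
        if value > threshold:  # hypothetical abnormal-event rule
            self.rdb.execute(
                "INSERT INTO events VALUES (?, ?, ?)",
                (sensor_id, ts, "abnormal"),
            )

    def current(self, sensor_id: str) -> Optional[Tuple[float, float]]:
        """Real-time lookup served from memory."""
        return self.latest.get(sensor_id)

    def events(self, sensor_id: str) -> list:
        """Structured query over extracted change information."""
        return self.rdb.execute(
            "SELECT ts, kind FROM events WHERE sensor_id = ? ORDER BY ts",
            (sensor_id,),
        ).fetchall()
```

The point of the split is that the real-time path (`current`) never touches the archive, while historical analysis can read the archive without slowing ingestion.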
Latin American contributions to the GEM’s Earthquake Consequences Database

OpenAIRE

Cardona Arboleda, Omar Dario; Ordaz Schroeder, Mario Gustavo; Salgado Gálvez, Mario Andrés; Carreño Tibaduiza, Martha Liliana; Barbat Barbat, Horia Alejandro

2016-01-01

One of the projects of the Global Earthquake Model (GEM) was to develop a global earthquake consequences database (GEMECD) which served both to be an open and public repository of damages and losses on different types of elements at global level and also as a benchmark for the development of vulnerability models that could capture specific characteristics of the affected countries. The online earthquakes consequences database has information on 71 events where 14 correspond to events that occ...
Database Description - PSCDB | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name PSCDB Alternative n...rial Science and Technology (AIST) Takayuki Amemiya E-mail: Database classification Structure Databases - Protein structure Database...554-D558. External Links: Original website information Database maintenance site Graduate School of Informat...available URL of Web services - Need for user registration Not available About This Database Database Descri...ption Download License Update History of This Database Site Policy | Contact Us Database Description - PSCDB | LSDB Archive ...
Spatial Digital Database for the Geologic Map of Oregon

Science.gov (United States)

Walker, George W.; MacLeod, Norman S.; Miller, Robert J.; Raines, Gary L.; Connors, Katherine A.

2003-01-01

Introduction This report describes and makes available a geologic digital spatial database (orgeo) representing the geologic map of Oregon (Walker and MacLeod, 1991). The original paper publication was printed as a single map sheet at a scale of 1:500,000, accompanied by a second sheet containing map unit descriptions and ancillary data. A digital version of the Walker and MacLeod (1991) map was included in Raines and others (1996). The dataset provided by this open-file report supersedes the earlier published digital version (Raines and others, 1996). This digital spatial database is one of many being created by the U.S. Geological Survey as an ongoing effort to provide geologic information for use in spatial analysis in a geographic information system (GIS). This database can be queried in many ways to produce a variety of geologic maps. This database is not meant to be used or displayed at any scale larger than 1:500,000 (for example, 1:100,000). This report describes the methods used to convert the geologic map data into a digital format, describes the ArcInfo GIS file structures and relationships, and explains how to download the digital files from the U.S. Geological Survey public access World Wide Web site on the Internet. Scanned images of the printed map (Walker and MacLeod, 1991), their correlation of map units, and their explanation of map symbols are also available for download.
A role for chromatin topology in imprinted domain regulation.

Science.gov (United States)

MacDonald, William A; Sachani, Saqib S; White, Carlee R; Mann, Mellissa R W

2016-02-01

Recently, many advancements in genome-wide chromatin topology and nuclear architecture have unveiled the complex and hidden world of the nucleus, where chromatin is organized into discrete neighbourhoods with coordinated gene expression. This includes the active and inactive X chromosomes. Using X chromosome inactivation as a working model, we utilized publicly available datasets together with a literature review to gain insight into topologically associated domains, lamin-associated domains, nucleolar-associating domains, scaffold/matrix attachment regions, and nucleoporin-associated chromatin and their role in regulating monoallelic expression. Furthermore, we comprehensively review for the first time the role of chromatin topology and nuclear architecture in the regulation of genomic imprinting. We propose that chromatin topology and nuclear architecture are important regulatory mechanisms for directing gene expression within imprinted domains. Furthermore, we predict that dynamic changes in chromatin topology and nuclear architecture play roles in tissue-specific imprint domain regulation during early development and differentiation.
Database Description - ASTRA | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name ASTRA Alternative n...tics Journal Search: Contact address Database classification Nucleotide Sequence Databases - Gene structure,...3702 Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The database represents classified p...(10):1211-6. External Links: Original website information Database maintenance site National Institute of Ad... for user registration Not available About This Database Database Description Dow
Databases in the fields of toxicology, occupational and environmental health at DIMDI

International Nuclear Information System (INIS)

Bystrich, E.

1993-01-01

DIMDI, the German Institute for Medical Documentation and Information, is a governmental institute affiliated with the Federal Ministry for Health. It was founded in 1969 in Cologne. At present DIMDI hosts about seventy international and national bibliographic and factual databases in the field of biosciences, such as medicine, public health, pharmacology, toxicology, occupational and environmental health, nutrition, biology, psychology, sociology, sports, and agricultural sciences. The most important databases with toxicological and ecotoxicological information, which contain data useful for managers of chemical and nuclear power plants, are the factual databases HSDB, ECDIN, SIGEDA, RTECS, and CCRIS, and the bibliographic databases TOXALL, ENVIROLINE, SCISEARCH, MEDLINE, EMBASE, and BIOSIS PREVIEWS. (orig.)
A new database sub-system for grain-size analysis

Science.gov (United States)

Suckow, Axel

2013-04-01

content, sand content, etc., which always only displays part of the available information at each depth. Alternatively, full spectra were displayed at one depth. The new software now allows the whole grain-size spectrum at each depth to be shown in a three-dimensional display. LabData and the grain-size subsystem are based on MS Access as front-end and MS SQL Server as back-end database systems. The SQL code for the data model, SQL server procedures and triggers and the MS Access basic code for the front end are public domain code, published under the GNU GPL license agreement and are available free of charge. References: Novothny, Á., Frechen, M., Horváth, E., Wacha, L., Rolf, C., 2011. Investigating the penultimate and last glacial cycles of the Sütt dating, high-resolution grain size, and magnetic susceptibility data. Quaternary International 234, 75-85. Suckow, A., Dumke, I., 2001. A database system for geochemical, isotope hydrological and geochronological laboratories. Radiocarbon 43, 325-337.
Advanced SPARQL querying in small molecule databases.

Science.gov (United States)

Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří

2016-01-01

In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.

UbiProt: a database of ubiquitylated proteins

Directory of Open Access Journals (Sweden)

Kondratieva Ekaterina V

2007-04-01

Full Text Available Abstract Background Post-translational protein modification with ubiquitin, or ubiquitylation, is one of the hottest topics in modern biology due to its dramatic impact on diverse metabolic pathways and its involvement in the pathogenesis of severe human diseases. A great number of eukaryotic proteins have been found to be ubiquitylated. However, data about particular ubiquitylated proteins are rather scattered. Description To fill a general need for collecting and systematizing experimental data concerning ubiquitylation we have developed a new resource, UbiProt Database, a knowledgebase of ubiquitylated proteins. The database contains retrievable information about overall characteristics of a particular protein, ubiquitylation features, related ubiquitylation and de-ubiquitylation machinery and literature references reflecting experimental evidence of ubiquitylation. UbiProt is available at http://ubiprot.org.ru for free. Conclusion UbiProt Database is a public resource offering comprehensive information on ubiquitylated proteins. The resource can serve as a general reference source both for researchers in the ubiquitin field and for those who deal with particular ubiquitylated proteins of interest to them. Further development of the UbiProt Database is expected to be of common interest for research groups involved in studies of the ubiquitin system.
Evaluation report on research and development of a database system for mutual computer operation; Denshi keisanki sogo un'yo database system no kenkyu kaihatsu ni kansuru hyoka hokokusho

Energy Technology Data Exchange (ETDEWEB)

NONE

1992-03-01

This paper evaluates the research and development of a database system for mutual computer operation, with respect to discrete database technology, multi-media technology, high reliability technology, and mutual operation network system technology. A large number of forward-looking research results were obtained, covering the issues of discretion and utilization patterns of the discrete database, structuring of data for multi-media information, retrieval systems, flexible and high-level utilization of the network, and the issues in database protection. These achievements have been widely disclosed to the public. The most distinctive feature of this project is its aim of forming a network system that can be operated mutually in a multi-vendor environment. The research and development has therefore been carried out in a spirit of openness to the public and international cooperation. These efforts are represented by the organization of the rule establishment committee, the execution of a mutual interconnection experiment (including demonstration evaluation), and the development of the mounting rules based on the ISO's 'open system interconnection (OSI)'. These results are compiled in the JIS as the basic reference model for open system interconnection, and the targets shown in the basic plan have been sufficiently achieved. (NEDO)
Database Description - RPD | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RPD Alternative name Rice Proteome Database...titute of Crop Science, National Agriculture and Food Research Organization Setsuko Komatsu E-mail: Database... classification Proteomics Resources Plant databases - Rice Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database... description Rice Proteome Database contains information on protei...and entered in the Rice Proteome Database. The database is searchable by keyword,
Database Description - PLACE | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name PLACE Alternative name A Database...Kannondai, Tsukuba, Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Databas...e classification Plant databases Organism Taxonomy Name: Tracheophyta Taxonomy ID: 58023 Database...99, Vol.27, No.1 :297-300 External Links: Original website information Database maintenance site National In...- Need for user registration Not available About This Database Database Descripti
Beyond cross-domain learning: Multiple-domain nonnegative matrix factorization

KAUST Repository

Wang, Jim Jing-Yan; Gao, Xin

2014-01-01

Traditional cross-domain learning methods transfer learning from a source domain to a target domain. In this paper, we propose the multiple-domain learning problem for several equally treated domains. The multiple-domain learning problem assumes that samples from different domains have different distributions, but share the same feature and class label spaces. Each domain could be a target domain, while also be a source domain for other domains. A novel multiple-domain representation method is proposed for the multiple-domain learning problem. This method is based on nonnegative matrix factorization (NMF), and tries to learn a basis matrix and coding vectors for samples, so that the domain distribution mismatch among different domains will be reduced under an extended variation of the maximum mean discrepancy (MMD) criterion. The novel algorithm - multiple-domain NMF (MDNMF) - was evaluated on two challenging multiple-domain learning problems - multiple user spam email detection and multiple-domain glioma diagnosis. The effectiveness of the proposed algorithm is experimentally verified. © 2013 Elsevier Ltd. All rights reserved.
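The abstract above measures domain mismatch with a maximum mean discrepancy (MMD) criterion but does not give the extended form it uses. As a rough illustration only, the sketch below computes the standard empirical squared MMD under a linear kernel, which reduces to the squared Euclidean distance between the sample means of two domains. Summing it over all pairs of domains is an assumption made here for illustration, and the function names are hypothetical; this is not the MDNMF algorithm.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(samples: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[j] for s in samples) / n for j in range(dim)]


def linear_mmd_squared(domain_a: List[Vector], domain_b: List[Vector]) -> float:
    """Squared MMD with a linear kernel: ||mean(a) - mean(b)||^2."""
    mu_a = mean_vector(domain_a)
    mu_b = mean_vector(domain_b)
    return sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))


def total_pairwise_mmd(domains: List[List[Vector]]) -> float:
    """Sum of squared linear MMD over every unordered pair of domains.

    Treating all domains symmetrically mirrors the 'equally treated
    domains' setting, but the pairwise sum itself is an assumption.
    """
    total = 0.0
    for i in range(len(domains)):
        for j in range(i + 1, len(domains)):
            total += linear_mmd_squared(domains[i], domains[j])
    return total
```

A value of zero means the domains have identical sample means; under this simplified criterion, a representation that lowers the total brings the domain distributions closer in their first moment.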
CORAL: aligning conserved core regions across domain families.

Science.gov (United States)

Fong, Jessica H; Marchler-Bauer, Aron

2009-08-01

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
Development of 2010 national land cover database for the Nepal.

Science.gov (United States)

Uddin, Kabir; Shrestha, Him Lal; Murthy, M S R; Bajracharya, Birendra; Shrestha, Basanta; Gilani, Hammad; Pradhan, Sudip; Dangol, Bikash

2015-01-15

Land cover and its change analysis across the Hindu Kush Himalayan (HKH) region is recognized as an urgent need to support diverse issues of environmental conservation. This study presents the first and most complete national land cover database of Nepal, prepared using public domain Landsat TM data of 2010 and a replicable methodology. The study estimated that 39.1% of Nepal is covered by forests and 29.83% by agriculture. Patch and edge forests, constituting 23.4% of national forest cover, revealed proximate biotic interferences over the forests. Core forests constituted 79.3% of forests in protected areas, whereas 63% of the area outside protected areas was under core forests. Forest fragmentation analysis by physiographic region revealed specific conservation requirements for productive hill and mid mountain regions. Comparative analysis with a Landsat TM based global land cover product showed differences of the order of 30-60% among different land cover classes, stressing the need for significant improvements for national level adoption. An online web-based land cover validation tool has been developed for continual improvement of the land cover product. The potential use of the data set for national and regional level sustainable land use planning strategies and for meeting several global commitments is also highlighted. Copyright © 2014 Elsevier Ltd. All rights reserved.
Building Vietnamese Herbal Database Towards Big Data Science in Nature-Based Medicine

Science.gov (United States)

2018-01-04

online and hard-copied references). Text mining is planned before DISTRIBUTION A. Approved for public release: distribution unlimited. hand in the...many types of diseases. Poor hand-writing records and current text-based databases, however, perplex the conventionalizing and evaluating process of...remedy for many types of diseases. Poor hand-writing records and current text-based databases, however, perplex the conventionalizing and evaluating
Developing a taxonomy for the science of improvement in public health.

Science.gov (United States)

Riley, William; Lownik, Beth; Halverson, Paul; Parrotta, Carmen; Godsall, Jonathan R; Gyllstrom, Elizabeth; Gearin, Kimberly J; Mays, Glen

2012-11-01

Quality improvement (QI) methods have been used for almost a decade in public health departments to increase effectiveness and efficiency. Although results are rapidly accumulating, the evidence for the science of improvement is shallow and limited. To advance the use and effectiveness of QI in public health, it is important to develop a science of improvement using practice-based research to build an evidence base for QI projects. The purpose of this study is to advance the science of improvement in public health departments with 3 objectives: (1) establish a taxonomy of QI projects in public health, (2) categorize QI projects undertaken in health departments using the taxonomy, and (3) create an opportunity modes and effects analysis. This study is a qualitative analysis of archival data from 2 separate large databases consisting of 51 QI projects undertaken in public health departments over the last 5 years. The study involves 2 separate QI collaboratives. One includes Minnesota health departments; the other is a national collaborative. We propose a standardized case definition, common metrics, and a taxonomy of QI projects to begin building the evidence base for QI in public health and to advance the science of continuous quality improvement. All projects created an aim statement and used metrics, while 53% used a specific QI model with an average of 3.25 QI techniques per project. Approximately 40% of the projects incorporated a process control methodology, and 60% of the projects identified the process from beginning to end, while 11 of 12 PHAB (Public Health Accreditation Board) domains were included. The findings provide a baseline for QI taxonomy to operationalize a science of improvement for public health departments.
Database Description - JSNP | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name JSNP Alternative nam...n Science and Technology Agency Creator Affiliation: Contact address E-mail : Database...sapiens Taxonomy ID: 9606 Database description A database of about 197,000 polymorphisms in Japanese populat...1):605-610 External Links: Original website information Database maintenance site Institute of Medical Scien...er registration Not available About This Database Database Description Download License Update History of This Database
Database Description - RED | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RED Alternative name Rice Expression Database...enome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Database classifi...cation Microarray, Gene Expression Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database descripti... Article title: Rice Expression Database: the gateway to rice functional genomics...nt Science (2002) Dec 7 (12):563-564 External Links: Original website information Database maintenance site
A database structure for radiological optimization analyses of decommissioning operations

International Nuclear Information System (INIS)

Zeevaert, T.; Van de Walle, B.

1995-09-01

The structure of a database for decommissioning experiences is described. Radiological optimization is a major radiation protection principle in practices and interventions, involving radiological protection factors, economic costs, and social factors. An important lack of knowledge with respect to these factors exists in the domain of the decommissioning of nuclear power plants, due to the low number of decommissioning operations already performed. Moreover, decommissioning takes place only once for an installation. Tasks, techniques, and procedures are in most cases rather specific, limiting the use of past experiences in the radiological optimization analyses of new decommissioning operations. Therefore, it is important that relevant data or information be acquired from decommissioning experiences. These data have to be stored in a database in such a way that they can be used efficiently in ALARA analyses of future decommissioning activities.
Expression analysis of the Toll-like receptor and TIR domain adaptor families of zebrafish.

NARCIS (Netherlands)

Meijer, A.H.; Krens, SF Gabby; Rodriguez, IA Medina; He, S; Bitter, W.; Snaar-Jagalska, B Ewa; Spaink, H.P.

2004-01-01

The zebrafish genomic sequence database was analysed for the presence of genes encoding members of the Toll-like receptors (TLR) and interleukin receptors (IL-R) and associated adaptor proteins containing a TIR domain. The resulting predictions show the presence of one or more counterparts for the
Ultra-Structure database design methodology for managing systems biology data and analyses

Directory of Open Access Journals (Sweden)

Hemminger Bradley M

2009-08-01

Full Text Available Abstract Background Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogeneous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping). Results We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research. Conclusion We find
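The central idea in the abstract above, that rules live in database tables and a general procedure reads them at run time, can be illustrated with a very small sketch. It uses Python's standard sqlite3 module; the table name is_a_rule, its columns, and the example category names are all hypothetical and are not taken from the Ultra-Structure system itself.

```python
import sqlite3
from typing import Set


def make_rule_store() -> sqlite3.Connection:
    """Create an in-memory database holding one hypothetical rule table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE is_a_rule ("
        " child TEXT NOT NULL,"
        " parent TEXT NOT NULL,"
        " PRIMARY KEY (child, parent))"
    )
    return conn


def add_rule(conn: sqlite3.Connection, child: str, parent: str) -> None:
    """Store the rule 'child is a kind of parent' as an ordinary row."""
    conn.execute(
        "INSERT OR IGNORE INTO is_a_rule (child, parent) VALUES (?, ?)",
        (child, parent),
    )


def deduce_ancestors(conn: sqlite3.Connection, item: str) -> Set[str]:
    """Generic deduction: follow is_a_rule rows transitively from item.

    The procedure knows nothing about the domain; its result depends
    only on the rows currently stored in the rule table.
    """
    found: Set[str] = set()
    frontier = [item]
    while frontier:
        current = frontier.pop()
        rows = conn.execute(
            "SELECT parent FROM is_a_rule WHERE child = ?", (current,)
        ).fetchall()
        for (parent,) in rows:
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found
```

Calling add_rule with a new row changes what deduce_ancestors returns without any change to the code or the schema, which is the kind of flexibility the abstract attributes to rule-driven procedures.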
The Politics of Information: Building a Relational Database To Support Decision-Making at a Public University.

Science.gov (United States)

Friedman, Debra; Hoffman, Phillip

2001-01-01

Describes creation of a relational database at the University of Washington supporting ongoing academic planning at several levels and affecting the culture of decision making. Addresses getting started; sharing the database; questions, worries, and issues; improving access to high-demand courses; the advising function; management of instructional…
Database Description - ConfC | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name ConfC Alternative name Database...amotsu Noguchi Tel: 042-495-8736 E-mail: Database classification Structure Database...s - Protein structure Structure Databases - Small molecules Structure Databases - Nucleic acid structure Database... services - Need for user registration - About This Database Database Description Download License Update History of This Database... Site Policy | Contact Us Database Description - ConfC | LSDB Archive ...
Regular periodical public disclosure obligations of public companies

Directory of Open Access Journals (Sweden)

Marjanski Vladimir

2011-01-01

Full Text Available Public companies, in their capacity as capital market participants, have the obligation to inform the public on their legal and financial status, their general business operations, as well as on the issuance of securities and other financial instruments. Such obligations may be divided into two groups. The first group consists of regular periodical public disclosures, such as the publication of financial reports (annual, semi-annual and quarterly) and the management's reports on the public company's business operations. The second group comprises the obligation of occasional (ad hoc) public disclosure. The thesis analyses the obligation of public companies to inform the public in the course of their regular reporting. The new Capital Market Law, based on two EU Directives (the Transparency Directive and the Directive on Public Disclosure of Inside Information and the Definition of Market Manipulation), regulates this obligation of public companies in substantially more detail than the prior Law on the Market of Securities and Other Financial Instruments (hereinafter: ZTHV). Accordingly, the ZTHV's provisions are compared to the new solutions of the Capital Market Law within the domain of regular periodical disclosure.
[A relational database to store Poison Centers calls].

Science.gov (United States)

Barelli, Alessandro; Biondi, Immacolata; Tafani, Chiara; Pellegrini, Aristide; Soave, Maurizio; Gaspari, Rita; Annetta, Maria Giuseppina

2006-01-01

Italian Poison Centers answer approximately 100,000 calls per year. Potentially, this activity is a huge source of data for toxicovigilance and for syndromic surveillance. During the last decade, surveillance systems for early detection of outbreaks have drawn the attention of public health institutions due to the threat of terrorism and high-profile disease outbreaks. Poisoning surveillance needs the ongoing, systematic collection, analysis, interpretation, and dissemination of harmonised data about poisonings from all Poison Centers for use in public health action to reduce morbidity and mortality and to improve health. The entity-relationship model for a Poison Center relational database is extremely complex and has not been studied in detail. For this reason, data collection is not harmonised among Italian Poison Centers. Entities are recognizable concepts, either concrete or abstract, such as patients and poisons, or events which have relevance to the database, such as calls. Connectivity and cardinality of relationships are complex as well. A one-to-many relationship exists between calls and patients: for one instance of entity calls, there are zero, one, or many instances of entity patients. At the same time, a one-to-many relationship exists between patients and poisons: for one instance of entity patients, there are zero, one, or many instances of entity poisons. This paper presents a relational model for a poison center database which allows the harmonised collection of data on poison center calls.
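The two one-to-many relationships described above (calls to patients, patients to poisons) can be sketched as a three-table schema. The example below uses Python's standard sqlite3 module; every table and column name is a hypothetical choice made for illustration, and the model in the paper is described as considerably more complex than this.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical minimal schema: each patient row points at one call,
# and each poison row points at one patient.
SCHEMA = """
CREATE TABLE calls (
    call_id     INTEGER PRIMARY KEY,
    received_at TEXT NOT NULL
);
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    call_id    INTEGER NOT NULL REFERENCES calls(call_id),
    age_years  INTEGER
);
CREATE TABLE poisons (
    poison_id  INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
    agent_name TEXT NOT NULL
);
"""


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def counts_per_call(conn: sqlite3.Connection) -> List[Tuple[int, int, int]]:
    """Return (call_id, number of patients, number of poisons) per call.

    LEFT JOINs keep calls with zero patients and patients with zero
    poisons, matching the 'zero, one, or many' cardinality.
    """
    return conn.execute(
        """
        SELECT c.call_id,
               COUNT(DISTINCT p.patient_id),
               COUNT(x.poison_id)
        FROM calls AS c
        LEFT JOIN patients AS p ON p.call_id = c.call_id
        LEFT JOIN poisons  AS x ON x.patient_id = p.patient_id
        GROUP BY c.call_id
        ORDER BY c.call_id
        """
    ).fetchall()
```

Placing the foreign key on the "many" side is the usual relational encoding of a one-to-many relationship; with foreign keys switched on, SQLite rejects a patient row that refers to a call that does not exist.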
Database management systems understanding and applying database technology

CERN Document Server

Gorman, Michael M

1991-01-01

Database Management Systems: Understanding and Applying Database Technology focuses on the processes, methodologies, techniques, and approaches involved in database management systems (DBMSs). The book first takes a look at ANSI database standards and DBMS applications and components. Discussions focus on application components and DBMS components, implementing the dynamic relationship application, problems and benefits of dynamic relationship DBMSs, nature of a dynamic relationship application, ANSI/NDL, and DBMS standards. The manuscript then ponders on logical database, interrogation, and phy
