WorldWideScience

Sample records for nonredundant protein database

  1. Ion pairs in non-redundant protein structures

    Indian Academy of Sciences (India)

    Ion pairs contribute to several functions including the activity of catalytic triads, fusion of viral membranes, stability in thermophilic proteins and solvent–protein interactions. Furthermore, they have the ability to affect the stability of protein structures and are also a part of the forces that act to hold monomers together.

  2. An essential nonredundant role for mycobacterial DnaK in native protein folding.

    Directory of Open Access Journals (Sweden)

    Allison Fay

    2014-07-01

    Full Text Available Protein chaperones are essential in all domains of life to prevent and resolve protein misfolding during translation and proteotoxic stress. HSP70 family chaperones, including E. coli DnaK, function in stress induced protein refolding and degradation, but are dispensable for cellular viability due to redundant chaperone systems that prevent global nascent peptide insolubility. However, the function of HSP70 chaperones in mycobacteria, a genus that includes multiple human pathogens, has not been examined. We find that mycobacterial DnaK is essential for cell growth and required for native protein folding in Mycobacterium smegmatis. Loss of DnaK is accompanied by proteotoxic collapse characterized by the accumulation of insoluble newly synthesized proteins. DnaK is required for solubility of large multimodular lipid synthases, including the essential lipid synthase FASI, and DnaK loss is accompanied by disruption of membrane structure and increased cell permeability. Trigger Factor is nonessential and has a minor role in native protein folding that is only evident in the absence of DnaK. In unstressed cells, DnaK localizes to multiple, dynamic foci, but relocalizes to focal protein aggregates during stationary phase or upon expression of aggregating peptides. Mycobacterial cells restart cell growth after proteotoxic stress by isolating persistent DnaK containing protein aggregates away from daughter cells. These results reveal unanticipated essential nonredunant roles for mycobacterial DnaK in mycobacteria and indicate that DnaK defines a unique susceptibility point in the mycobacterial proteostasis network.

  3. Protein-Protein Interaction Databases

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Jensen, Lars Juhl

    2015-01-01

    Years of meticulous curation of scientific literature and increasingly reliable computational predictions have resulted in creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions...

  4. The Drosophila melanogaster DmCK2beta transcription unit encodes for functionally non-redundant protein isoforms.

    Science.gov (United States)

    Jauch, Eike; Wecklein, Heike; Stark, Felix; Jauch, Mandy; Raabe, Thomas

    2006-06-07

    Genes encoding for the two evolutionary highly conserved subunits of a heterotetrameric protein kinase CK2 holoenzyme are present in all examined eukaryotic genomes. Depending on the organism, multiple transcription units encoding for a catalytically active CK2alpha subunit and/or a regulatory CK2beta subunit may exist. The phosphotransferase activity of members of the protein kinase CK2alpha family is thought to be independent of second messengers but is modulated by interaction with CK2beta-like proteins. In the genome of Drosophila melanogaster, one gene encoding for a CK2alpha subunit and three genes encoding for CK2beta-like proteins are present. The X-linked DmCK2beta transcription unit encodes for several CK2beta protein isoforms due to alternative splicing of its primary transcript. We addressed the question whether CK2beta-like proteins are redundant in function. Our in vivo experiments show that variations of the very C-terminal tail of CK2beta isoforms encoded by the X-linked DmCK2beta transcription unit influence their functional properties. In addition, we find that CK2beta-like proteins encoded by the autosomal D. melanogaster genes CK2betates and CK2beta' cannot fully substitute for a loss of CK2beta isoforms encoded by DmCK2beta.

  5. The cellulose synthase companion proteins act non-redundantly with CELLULOSE SYNTHASE INTERACTING1/POM2 and CELLULOSE SYNTHASE 6

    OpenAIRE

    Endler, Anne; Schneider, Rene; Kesten, Christopher; Lampugnani, Edwin R.; Persson, Staffan

    2016-01-01

    Cellulose is a cell wall constituent that is essential for plant growth and development, and an important raw material for a range of industrial applications. Cellulose is synthesized at the plasma membrane by massive cellulose synthase (CesA) complexes that track along cortical microtubules in elongating cells of Arabidopsis through the activity of the protein CELLULOSE SYNTHASE INTERACTING1 (CSI1). In a recent study we identified another family of proteins that also are associated with the ...

  6. The PMDB Protein Model Database

    Science.gov (United States)

    Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

    2006-01-01

    The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873

  7. Database of Interacting Proteins (DIP)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent...

  8. Update History of This Database - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database Update History of This Database Date Update contents 201...0/03/29 Yeast Interacting Proteins Database English archive site is opened. 2000/12/4 Yeast Interacting Proteins Database...( http://itolab.cb.k.u-tokyo.ac.jp/Y2H/ ) is released. About This Database Database Description... Download License Update History of This Database Site Policy | Contact Us Update History of This Database... - Yeast Interacting Proteins Database | LSDB Archive ...

  9. Database Description - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database Database Description General information of database Database... name Yeast Interacting Proteins Database Alternative name - DOI 10.18908/lsdba.nbdc00742-000 Creator C...-ken 277-8561 Tel: +81-4-7136-3989 FAX: +81-4-7136-3979 E-mail : Database classif...s cerevisiae Taxonomy ID: 4932 Database description Information on interactions and related information obta...l Acad Sci U S A. 2001 Apr 10;98(8):4569-74. Epub 2001 Mar 13. External Links: Original website information Database

  10. Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database Full Data of Yeast Interacting Proteins Database (Origin...al Version) Data detail Data name Full Data of Yeast Interacting Proteins Database (Original Version) DOI 10....18908/lsdba.nbdc00742-004 Description of data contents The entire data in the Yeast Interacting Proteins Database...eir interactions are required. Several sources including YPD (Yeast Proteome Database, Costanzo, M. C., Hoga...ematic name in the SGD (Saccharomyces Genome Database; http://www.yeastgenome.org /). Bait gene name The gen

  11. Improving decoy databases for protein folding algorithms

    KAUST Repository

    Lindsey, Aaron

    2014-01-01

    Copyright © 2014 ACM. Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 17 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on a popular modern scoring function and show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.

  12. Gene Expression Responses to FUS, EWS, and TAF15 Reduction and Stress Granule Sequestration Analyses Identifies FET-Protein Non-Redundant Functions

    DEFF Research Database (Denmark)

    Blechingberg, Jenny; Luo, Yonglun; Bolund, Lars

    2012-01-01

    The FET family of proteins is composed of FUS/TLS, EWS/EWSR1, and TAF15 and possesses RNA- and DNA-binding capacities. The FET-proteins are involved in transcriptional regulation and RNA processing, and FET-gene deregulation is associated with development of cancer and protein granule formations...... in amyotrophic lateral sclerosis, frontotemporal lobar degeneration, and trinucleotide repeat expansion diseases. We here describe a comparative characterization of FET-protein localization and gene regulatory functions. We show that FUS and TAF15 locate to cellular stress granules to a larger extend than EWS....... FET-proteins have no major importance for stress granule formation and cellular stress responses, indicating that FET-protein stress granule association most likely is a downstream response to cellular stress. Gene expression analyses showed that the cellular response towards FUS and TAF15 reduction...

  13. HCVpro: Hepatitis C virus protein interaction database

    KAUST Repository

    Kwofie, Samuel K.

    2011-12-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers. In furtherance of these goals, we have developed the hepatitis C virus protein interaction database (HCVpro) by integrating manually verified hepatitis C virus-virus and virus-human protein interactions curated from literature and databases. HCVpro is a comprehensive and integrated HCV-specific knowledgebase housing consolidated information on PPIs, functional genomics and molecular data obtained from a variety of virus databases (VirHostNet, VirusMint, HCVdb and euHCVdb), and from BIND and other relevant biology repositories. HCVpro is further populated with information on hepatocellular carcinoma (HCC) related genes that are mapped onto their encoded cellular proteins. Incorporated proteins have been mapped onto Gene Ontologies, canonical pathways, Online Mendelian Inheritance in Man (OMIM) and extensively cross-referenced to other essential annotations. The database is enriched with exhaustive reviews on structure and functions of HCV proteins, current state of drug and vaccine development and links to recommended journal articles. Users can query the database using specific protein identifiers (IDs), chromosomal locations of a gene, interaction detection methods, indexed PubMed sources as well as HCVpro, BIND and VirusMint IDs. The use of HCVpro is free and the resource can be accessed via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. © 2011 Elsevier B.V.

  14. Medicago PhosphoProtein Database: a repository for Medicago truncatula phosphoprotein data

    Directory of Open Access Journals (Sweden)

    Christopher M. Rose

    2012-06-01

    Full Text Available The ability of legume crops to fix atmospheric nitrogen via a symbiotic association with soil rhizobia makes them an essential component of many agricultural systems. Initiation of this symbiosis requires protein phosphorylation-mediated signaling in response to rhizobial signals named Nod factors. Medicago truncatula (Medicago is the model system for studying legume biology, making the study of its phosphoproteome essential. Here, we describe the Medicago Phosphoprotein Database (http://phospho.medicago.wisc.edu, a repository built to house phosphoprotein, phosphopeptide, and phosphosite data specific to Medicago. Currently, the Medicago Phosphoprotein Database holds 3,457 unique phosphopeptides that contain 3,404 non-redundant sites of phosphorylation on 829 proteins. Through the web-based interface, users are allowed to browse identified proteins or search for proteins of interest. Furthermore, we allow users to conduct BLAST searches of the database using both peptide sequences and phosphorylation motifs as queries. The data contained within the database are available for download to be investigated at the user’s discretion. The Medicago Phosphoprotein Database will be updated continually with novel phosphoprotein and phosphopeptide identifications, with the intent of constructing an unparalleled compendium of large-scale Medicago phosphorylation data.

  15. Proteomics: Protein Identification Using Online Databases

    Science.gov (United States)

    Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

    2012-01-01

    Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…

  16. Core Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available y are in the reverse direction. *1 A comprehensive two-hybrid analysis to explore the yeast protein interact...s. 2000 Jan 1;28(1):73-6. *2 The yeast proteome database (YPD) and Caenorhabditis elegans proteome database (WormPD): comprehensive...000 Jan 1;28(1):73-6. *3 A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisia

  17. Protein structure database search and evolutionary classification.

    Science.gov (United States)

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  18. Thermodynamic database for proteins: features and applications.

    Science.gov (United States)

    Gromiha, M Michael; Sarai, Akinori

    2010-01-01

    We have developed a thermodynamic database for proteins and mutants, ProTherm, which is a collection of a large number of thermodynamic data on protein stability along with the sequence and structure information, experimental methods and conditions, and literature information. This is a valuable resource for understanding/predicting the stability of proteins, and it can be accessible at http://www.gibk26.bse.kyutech.ac.jp/jouhou/Protherm/protherm.html . ProTherm has several features including various search, display, and sorting options and visualization tools. We have analyzed the data in ProTherm to examine the relationship among thermodynamics, structure, and function of proteins. We describe the progress on the development of methods for understanding/predicting protein stability, such as (i) relationship between the stability of protein mutants and amino acid properties, (ii) average assignment method, (iii) empirical energy functions, (iv) torsion, distance, and contact potentials, and (v) machine learning techniques. The list of online resources for predicting protein stability has also been provided.

  19. A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

    Science.gov (United States)

    Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

    2000-06-30

    For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.

  20. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    Science.gov (United States)

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-08

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Protein - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ..._protein.zip File URL: ftp://ftp.biosciencedbc.jp/archive/at_atlas/LATEST/at_atla...About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Protein - AT Atlas | LSDB Archive ...

  2. License - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database License to Use This Database Last updated : 2010/02/15 You may use this database...nal License described below. The Standard License specifies the license terms regarding the use of this database... and the requirements you must follow in using this database. The Additional ...the Standard License. Standard License The Standard License for this database is the license specified in th...e Creative Commons Attribution-Share Alike 2.1 Japan . If you use data from this database

  3. Protein - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data List Contact us Trypanoso...nhibitor of the protein. Data file File name: trypanosome.zip File URL: ftp://ftp....biosciencedbc.jp/archive/trypanosome/LATEST/trypanosome.zip File size: 1.4 KB Simple search URL http://togo...db.biosciencedbc.jp/togodb/view/trypanosome#en Data acquisition method - Data analysis method - Number of da...ndelian inheritance in Man ) map Location of the gene on a chromosome or its chromosome number pdb PDB ID (P

  4. Improving decoy databases for protein folding algorithms

    KAUST Repository

    Lindsey, Aaron; Yeh, Hsin-Yi (Cindy); Wu, Chih-Peng; Thomas, Shawna; Amato, Nancy M.

    2014-01-01

    energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing

  5. MIPS: a database for genomes and protein sequences.

    Science.gov (United States)

    Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidospsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

  6. HCVpro: Hepatitis C virus protein interaction database

    KAUST Repository

    Kwofie, Samuel K.; Schaefer, Ulf; Sundararajan, Vijayaraghava Seshadri; Bajic, Vladimir B.; Christoffels, Alan G.

    2011-01-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers

  7. MIPS: a database for protein sequences and complete genomes.

    Science.gov (United States)

    Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

    1998-01-01

    The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de ) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795

  8. HCSD: the human cancer secretome database

    DEFF Research Database (Denmark)

    Feizi, Amir; Banaei-Esfahani, Amir; Nielsen, Jens

    2015-01-01

    The human cancer secretome database (HCSD) is a comprehensive database for human cancer secretome data. The cancer secretome describes proteins secreted by cancer cells and structuring information about the cancer secretome will enable further analysis of how this is related with tumor biology...... database is limiting the ability to query the increasing community knowledge. We therefore developed the Human Cancer Secretome Database (HCSD) to fulfil this gap. HCSD contains >80 000 measurements for about 7000 nonredundant human proteins collected from up to 35 high-throughput studies on 17 cancer...

  9. SwissPalm: Protein Palmitoylation database.

    Science.gov (United States)

    Blanc, Mathieu; David, Fabrice; Abrami, Laurence; Migliozzi, Daniel; Armand, Florence; Bürgi, Jérôme; van der Goot, Françoise Gisou

    2015-01-01

    Protein S-palmitoylation is a reversible post-translational modification that regulates many key biological processes, although the full extent and functions of protein S-palmitoylation remain largely unexplored. Recent developments of new chemical methods have allowed the establishment of palmitoyl-proteomes of a variety of cell lines and tissues from different species.  As the amount of information generated by these high-throughput studies is increasing, the field requires centralization and comparison of this information. Here we present SwissPalm ( http://swisspalm.epfl.ch), our open, comprehensive, manually curated resource to study protein S-palmitoylation. It currently encompasses more than 5000 S-palmitoylated protein hits from seven species, and contains more than 500 specific sites of S-palmitoylation. SwissPalm also provides curated information and filters that increase the confidence in true positive hits, and integrates predictions of S-palmitoylated cysteine scores, orthologs and isoform multiple alignments. Systems analysis of the palmitoyl-proteome screens indicate that 10% or more of the human proteome is susceptible to S-palmitoylation. Moreover, ontology and pathway analyses of the human palmitoyl-proteome reveal that key biological functions involve this reversible lipid modification. Comparative analysis finally shows a strong crosstalk between S-palmitoylation and other post-translational modifications. Through the compilation of data and continuous updates, SwissPalm will provide a powerful tool to unravel the global importance of protein S-palmitoylation.

  10. TOPDOM: database of conservatively located domains and motifs in proteins.

    Science.gov (United States)

    Varga, Julia; Dobson, László; Tusnády, Gábor E

    2016-09-01

    The TOPDOM database-originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins-has been updated and extended by taking into consideration consistently localized domains and motifs in globular proteins, too. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. TOPDOM database is available at http://topdom.enzim.hu The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. tusnady.gabor@ttk.mta.hu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  11. Protein - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available switchLanguage; BLAST Search Image Search Home About Archive Update History Data ...p_atlas_protein.zip File URL: ftp://ftp.biosciencedbc.jp/archive/tp_atlas/LATEST/...story of This Database Site Policy | Contact Us Protein - TP Atlas | LSDB Archive ...

  12. cuticleDB: a relational database of Arthropod cuticular proteins

    Directory of Open Access Journals (Sweden)

    Willis Judith H

    2004-09-01

    Full Text Available Abstract Background The insect exoskeleton or cuticle is a bi-partite composite of proteins and chitin that provides protective, skeletal and structural functions. Little information is available about the molecular structure of this important complex that exhibits a helicoidal architecture. Scores of sequences of cuticular proteins have been obtained from direct protein sequencing, from cDNAs, and from genomic analyses. Most of these cuticular protein sequences contain motifs found only in arthropod proteins. Description cuticleDB is a relational database containing all structural proteins of Arthropod cuticle identified to date. Many come from direct sequencing of proteins isolated from cuticle and from sequences from cDNAs that share common features with these authentic cuticular proteins. It also includes proteins from the Drosophila melanogaster and the Anopheles gambiae genomes, that have been predicted to be cuticular proteins, based on a Pfam motif (PF00379 responsible for chitin binding in Arthropod cuticle. The total number of the database entries is 445: 370 derive from insects, 60 from Crustacea and 15 from Chelicerata. The database can be accessed from our web server at http://bioinformatics.biol.uoa.gr/cuticleDB. Conclusions CuticleDB was primarily designed to contain correct and full annotation of cuticular protein data. The database will be of help to future genome annotators. Users will be able to test hypotheses for the existence of known and also of yet unknown motifs in cuticular proteins. An analysis of motifs may contribute to understanding how proteins contribute to the physical properties of cuticle as well as to the precise nature of their interaction with chitin.

  13. The DExH/D protein family database.

    Science.gov (United States)

    Jankowsky, E; Jankowsky, A

    2000-01-01

    DExH/D proteins are essential for all aspects of cellular RNA metabolism and processing, in the replication of many viruses and in DNA replication. DExH/D proteins are subject to current biological, biochemical and biophysical research which provides a continuous wealth of data. The DExH/D protein family database compiles this information and makes it available over the WWW (http://www.columbia.edu/ ej67/dbhome.htm ). The database can be fully searched by text based queries, facilitating fast access to specific information about this important class of enzymes.

  14. AMYPdb: A database dedicated to amyloid precursor proteins

    Directory of Open Access Journals (Sweden)

    Delamarche Christian

    2008-06-01

    Full Text Available Abstract Background Misfolding and aggregation of proteins into ordered fibrillar structures is associated with a number of severe pathologies, including Alzheimer's disease, prion diseases, and type II diabetes. The rapid accumulation of knowledge about the sequences and structures of these proteins allows using of in silico methods to investigate the molecular mechanisms of their abnormal conformational changes and assembly. However, such an approach requires the collection of accurate data, which are inconveniently dispersed among several generalist databases. Results We therefore created a free online knowledge database (AMYPdb dedicated to amyloid precursor proteins and we have performed large scale sequence analysis of the included data. Currently, AMYPdb integrates data on 31 families, including 1,705 proteins from nearly 600 organisms. It displays links to more than 2,300 bibliographic references and 1,200 3D-structures. A Wiki system is available to insert data into the database, providing a sharing and collaboration environment. We generated and analyzed 3,621 amino acid sequence patterns, reporting highly specific patterns for each amyloid family, along with patterns likely to be involved in protein misfolding and aggregation. Conclusion AMYPdb is a comprehensive online database aiming at the centralization of bioinformatic data regarding all amyloid proteins and their precursors. Our sequence pattern discovery and analysis approach unveiled protein regions of significant interest. AMYPdb is freely accessible 1.

  15. UbiProt: a database of ubiquitylated proteins

    Directory of Open Access Journals (Sweden)

    Kondratieva Ekaterina V

    2007-04-01

    Full Text Available Abstract Background Post-translational protein modification with ubiquitin, or ubiquitylation, is one of the hottest topics in a modern biology due to a dramatic impact on diverse metabolic pathways and involvement in pathogenesis of severe human diseases. A great number of eukaryotic proteins was found to be ubiquitylated. However, data about particular ubiquitylated proteins are rather disembodied. Description To fill a general need for collecting and systematizing experimental data concerning ubiquitylation we have developed a new resource, UbiProt Database, a knowledgebase of ubiquitylated proteins. The database contains retrievable information about overall characteristics of a particular protein, ubiquitylation features, related ubiquitylation and de-ubiquitylation machinery and literature references reflecting experimental evidence of ubiquitylation. UbiProt is available at http://ubiprot.org.ru for free. Conclusion UbiProt Database is a public resource offering comprehensive information on ubiquitylated proteins. The resource can serve as a general reference source both for researchers in ubiquitin field and those who deal with particular ubiquitylated proteins which are of their interest. Further development of the UbiProt Database is expected to be of common interest for research groups involved in studies of the ubiquitin system.

  16. Yeast Interacting Proteins Database: YNR006W, YHL002W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins destined for degradation; ..., as well as for recycling of Golgi proteins and formation of lumenal membranes Rows with this prey as prey ...1p; required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins dest...degradation, as well as for recycling of Golgi proteins and formation of lumenal membranes

  17. Yeast Interacting Proteins Database: YLR447C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp...; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act

  18. Yeast Interacting Proteins Database: YGR013W, YKL012W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tion U1 snRNP protein involved in splicing, interacts with the branchpoint-binding protein during the formation of the second commitm... PRP40 U1 snRNP protein involved in splicing, interacts with the branchpoint-binding protein during the form...ation of the second commitment complex Rows with this prey as prey (1) Rows with

  19. Artificial Intelligence in Prediction of Secondary Protein Structure Using CB513 Database

    Science.gov (United States)

    Avdagic, Zikrija; Purisevic, Elvir; Omanovic, Samir; Coralic, Zlatan

    2009-01-01

    In this paper we describe CB513 a non-redundant dataset, suitable for development of algorithms for prediction of secondary protein structure. A program was made in Borland Delphi for transforming data from our dataset to make it suitable for learning of neural network for prediction of secondary protein structure implemented in MATLAB Neural-Network Toolbox. Learning (training and testing) of neural network is researched with different sizes of windows, different number of neurons in the hidden layer and different number of training epochs, while using dataset CB513. PMID:21347158

  20. Yeast Interacting Proteins Database: YOR047C, YKL038W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available racts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a...Bait description Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose senso...rs Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator of the tra

  1. Yeast Interacting Proteins Database: YFR049W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator... (0) YOR047C STD1 Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sens...ors Snf3p and Rgt2p, and TATA-binding protein Spt15p; ac

  2. Yeast Interacting Proteins Database: YHL002W, YNR006W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ycling of Golgi proteins and formation of lumenal membranes Rows with this bait as bait (1) Rows with this b...required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins destined...on, as well as for recycling of Golgi proteins and formation of lumenal membranes...ith Hse1p; required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated protei

  3. MultitaskProtDB: a database of multitasking proteins.

    Science.gov (United States)

    Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, JosepAntoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique

    2014-01-01

    We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking or moonlighting is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are experimentally revealed by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there is no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a friendly user web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins such as frequencies of the different pairs of functions, phylogenetic conservation and so forth.

  4. DB-PABP: a database of polyanion-binding proteins.

    Science.gov (United States)

    Fang, Jianwen; Dong, Yinghua; Salamat-Miller, Nazila; Middaugh, C Russell

    2008-01-01

    The interactions between polyanions (PAs) and polyanion-binding proteins (PABPs) have been found to play significant roles in many essential biological processes including intracellular organization, transport and protein folding. Furthermore, many neurodegenerative disease-related proteins are PABPs. Thus, a better understanding of PA/PABP interactions may not only enhance our understandings of biological systems but also provide new clues to these deadly diseases. The literature in this field is widely scattered, suggesting the need for a comprehensive and searchable database of PABPs. The DB-PABP is a comprehensive, manually curated and searchable database of experimentally characterized PABPs. It is freely available and can be accessed online at http://pabp.bcf.ku.edu/DB_PABP/. The DB-PABP was implemented as a MySQL relational database. An interactive web interface was created using Java Server Pages (JSP). The search page of the database is organized into a main search form and a section for utilities. The main search form enables custom searches via four menus: protein names, polyanion names, the source species of the proteins and the methods used to discover the interactions. Available utilities include a commonality matrix, a function of listing PABPs by the number of interacting polyanions and a string search for author surnames. The DB-PABP is maintained at the University of Kansas. We encourage users to provide feedback and submit new data and references.

  5. Yeast Interacting Proteins Database: YPR103W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors...gulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf

  6. Yeast Interacting Proteins Database: YNL152W, YMR032W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YNL152W INN1 Essential protein that associates with the contractile actomyosin ring... Bait description Essential protein that associates with the contractile actomyosin ring, required for ingre

  7. Yeast Interacting Proteins Database: YGL145W, YNL258C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ripheral membrane protein required for Golgi-to-ER retrograde traffic; component ... membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interact

  8. Yeast Interacting Proteins Database: YNL258C, YGL145W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YNL258C DSL1 Peripheral membrane protein required for Golgi-to-ER retrograde traffi...t description Peripheral membrane protein required for Golgi-to-ER retrograde traffic; component of the ER t

  9. Yeast Interacting Proteins Database: YNL216W, YLR453C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YNL216W RAP1 DNA-binding protein involved in either activation or repression of transcription, depending...NA-binding protein involved in either activation or repression of transcription, depending on binding site c

  10. Yeast Interacting Proteins Database: YOL006C, YMR233W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available fusion protein localizes to the cytoplasm, nucleus and nucleolus Rows with this prey as prey (1) Rows with t...on protein localizes to the cytoplasm, nucleus and nucleolus Rows with this prey

  11. Yeast Interacting Proteins Database: YKL002W, YFL034C-B [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthes...ntegral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthesi

  12. Yeast Interacting Proteins Database: YJR091C, YKL002W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available g of integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly sy... integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthe

  13. Yeast Interacting Proteins Database: YCL046W, YGL115W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YCL046W - Dubious open reading frame unlikely to encode a protein, based on availab...ading frame unlikely to encode a protein, based on available experimental and comparative sequence data; par

  14. Yeast Interacting Proteins Database: YGL237C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote... expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein

  15. Yeast Interacting Proteins Database: YOR358W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act...rotein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator o

  16. Yeast Interacting Proteins Database: YKL002W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote...xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp

  17. Yeast Interacting Proteins Database: YGL127C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ith protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regula...rotein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors

  18. The Protein Identifier Cross-Referencing (PICR service: reconciling protein identifiers across multiple source databases

    Directory of Open Access Journals (Sweden)

    Leinonen Rasko

    2007-10-01

    Full Text Available Abstract Background Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. Results We have created the Protein Identifier Cross-Reference (PICR service, a web application that provides interactive and programmatic (SOAP and REST access to a mapping algorithm that uses the UniProt Archive (UniParc as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV or Microsoft Excel (XLS files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. Conclusion We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR

  19. Filling and mining the reactive metabolite target protein database.

    Science.gov (United States)

    Hanzlik, Robert P; Fang, Jianwen; Koen, Yakov M

    2009-04-15

    The post-translational modification of proteins is a well-known endogenous mechanism for regulating protein function and activity. Cellular proteins are also susceptible to post-translational modification by xenobiotic agents that possess, or whose metabolites possess, significant electrophilic character. Such non-physiological modifications to endogenous proteins are sometimes benign, but in other cases they are strongly associated with, and are presumed to cause, lethal cytotoxic consequences via necrosis and/or apoptosis. The Reactive Metabolite Target Protein Database (TPDB) is a searchable, freely web-accessible (http://tpdb.medchem.ku.edu:8080/protein_database/) resource that attempts to provide a comprehensive, up-to-date listing of known reactive metabolite target proteins. In this report we characterize the TPDB by reviewing briefly how the information it contains came to be known. We also compare its information to that provided by other types of "-omics" studies relevant to toxicology, and we illustrate how bioinformatic analysis of target proteins may help to elucidate mechanisms of cytotoxic responses to reactive metabolites.

  20. Yeast Interacting Proteins Database: YOR302W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available rol of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt...tein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt1

  1. Yeast Interacting Proteins Database: YMR280C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available olved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensor... glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, an

  2. Yeast Interacting Proteins Database: YNL258C, YKR022C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YNL258C DSL1 Peripheral membrane protein required for Golgi-to-ER retrograde traffi...equired for Golgi-to-ER retrograde traffic; component of the ER target site that interacts with coatomer, th...it ORF YNL258C Bait gene name DSL1 Bait description Peripheral membrane protein r

  3. Yeast Interacting Proteins Database: YDL239C, YDR273W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...it as prey (1) YDR273W DON1 Meiosis-specific component of the spindle pole body, part of the leading... edge protein (LEP) coat, forms a ring-like structure at the leading edge of the prospore...ption Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading...description Meiosis-specific component of the spindle pole body, part of the leading edge protein (LEP) coat

  4. Yeast Interacting Proteins Database: YOR117W, YJL184W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available c stress response, telomere uncapping and elongation, transcription; component of the EKC/KEOPS protein comp...n proposed to be involved in the modification of N-linked oligosaccharides, osmotic stress response, telomere uncap

  5. Yeast Interacting Proteins Database: YER081W, YDR105C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDR105C TMS1 Vacuolar membrane protein of unknown function that is conserved in mammals; predicted to contai...tion that is conserved in mammals; predicted to contain eleven transmembrane heli

  6. Yeast Interacting Proteins Database: YKL002W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthes... into lumenal vesicles of multivesicular bodies, and for delivery of newly synthesized vacuolar enzymes to t

  7. Yeast Interacting Proteins Database: YKL002W, YDL165W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthes...ins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthesized vacuolar enzymes t

  8. A protein domain interaction interface database: InterPare

    Directory of Open Access Journals (Sweden)

    Lee Jungsul

    2005-08-01

    Full Text Available Abstract Background Most proteins function by interacting with other molecules. Their interaction interfaces are highly conserved throughout evolution to avoid undesirable interactions that lead to fatal disorders in cells. Rational drug discovery includes computational methods to identify the interaction sites of lead compounds to the target molecules. Identifying and classifying protein interaction interfaces on a large scale can help researchers discover drug targets more efficiently. Description We introduce a large-scale protein domain interaction interface database called InterPare http://interpare.net. It contains both inter-chain (between chains interfaces and intra-chain (within chain interfaces. InterPare uses three methods to detect interfaces: 1 the geometric distance method for checking the distance between atoms that belong to different domains, 2 Accessible Surface Area (ASA, a method for detecting the buried region of a protein that is detached from a solvent when forming multimers or complexes, and 3 the Voronoi diagram, a computational geometry method that uses a mathematical definition of interface regions. InterPare includes visualization tools to display protein interior, surface, and interaction interfaces. It also provides statistics such as the amino acid propensities of queried protein according to its interior, surface, and interface region. The atom coordinates that belong to interface, surface, and interior regions can be downloaded from the website. Conclusion InterPare is an open and public database server for protein interaction interface information. It contains the large-scale interface data for proteins whose 3D-structures are known. As of November 2004, there were 10,583 (Geometric distance, 10,431 (ASA, and 11,010 (Voronoi diagram entries in the Protein Data Bank (PDB containing interfaces, according to the above three methods. In the case of the geometric distance method, there are 31,620 inter-chain domain

  9. Exploring Protein Function Using the Saccharomyces Genome Database.

    Science.gov (United States)

    Wong, Edith D

    2017-01-01

    Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org ) to predict the function of proteins and gain insight into their roles in various cellular processes.

  10. THPdb: Database of FDA-approved peptide and protein therapeutics.

    Directory of Open Access Journals (Sweden)

    Salman Sadullah Usmani

    Full Text Available THPdb (http://crdd.osdd.net/raghava/thpdb/ is a manually curated repository of Food and Drug Administration (FDA approved therapeutic peptides and proteins. The information in THPdb has been compiled from 985 research publications, 70 patents and other resources like DrugBank. The current version of the database holds a total of 852 entries, providing comprehensive information on 239 US-FDA approved therapeutic peptides and proteins and their 380 drug variants. The information on each peptide and protein includes their sequences, chemical properties, composition, disease area, mode of activity, physical appearance, category or pharmacological class, pharmacodynamics, route of administration, toxicity, target of activity, etc. In addition, we have annotated the structure of most of the protein and peptides. A number of user-friendly tools have been integrated to facilitate easy browsing and data analysis. To assist scientific community, a web interface and mobile App have also been developed.

  11. Yeast Interacting Proteins Database: YDL239C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...cription Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading

  12. Yeast Interacting Proteins Database: YDL239C, YPL070W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...cription Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading

  13. Yeast Interacting Proteins Database: YDL239C, YML042W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...iption Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading

  14. Yeast Interacting Proteins Database: YDR176W, YDL239C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle pole...ining structure at the leading edge of the prospore membrane via interaction with spindle pole body componen...DY3 Prey description Protein required for spore wall formation, thought to mediate assembly of a Don1p-conta

  15. Yeast Interacting Proteins Database: YDL239C, YKL103C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...ait description Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading

  16. A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

    Science.gov (United States)

    Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

    2010-08-01

    The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.

  17. ASAView: Database and tool for solvent accessibility representation in proteins

    Directory of Open Access Journals (Sweden)

    Fawareh Hamed

    2004-05-01

    Full Text Available Abstract Background Accessible surface area (ASA or solvent accessibility of amino acids in a protein has important implications. Knowledge of surface residues helps in locating potential candidates of active sites. Therefore, a method to quickly see the surface residues in a two dimensional model would help to immediately understand the population of amino acid residues on the surface and in the inner core of the proteins. Results ASAView is an algorithm, an application and a database of schematic representations of solvent accessibility of amino acid residues within proteins. A characteristic two-dimensional spiral plot of solvent accessibility provides a convenient graphical view of residues in terms of their exposed surface areas. In addition, sequential plots in the form of bar charts are also provided. Online plots of the proteins included in the entire Protein Data Bank (PDB, are provided for the entire protein as well as their chains separately. Conclusions These graphical plots of solvent accessibility are likely to provide a quick view of the overall topological distribution of residues in proteins. Chain-wise computation of solvent accessibility is also provided.

  18. Completion of autobuilt protein models using a database of protein fragments

    International Nuclear Information System (INIS)

    Cowtan, Kevin

    2012-01-01

    Two developments in the process of automated protein model building in the Buccaneer software are described: the use of a database of protein fragments in improving the model completeness and the assembly of disconnected chain fragments into complete molecules. Two developments in the process of automated protein model building in the Buccaneer software are presented. A general-purpose library for protein fragments of arbitrary size is described, with a highly optimized search method allowing the use of a larger database than in previous work. The problem of assembling an autobuilt model into complete chains is discussed. This involves the assembly of disconnected chain fragments into complete molecules and the use of the database of protein fragments in improving the model completeness. Assembly of fragments into molecules is a standard step in existing model-building software, but the methods have not received detailed discussion in the literature

  19. Yeast Interacting Proteins Database: YOL069W, YIL144W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available complex (Ndc80p-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...vity, kinetochore assembly and clustering Rows with this prey as prey (2) Rows with this prey as bait (0) 12...-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...d coiled-coil protein involved in chromosome segregation, spindle checkpoint activity, kinetochore assembly and clustering

  20. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants.

    Science.gov (United States)

    Gromiha, M Michael; Anoosha, P; Huang, Liang-Tsung

    2016-01-01

    Protein stability is the free energy difference between unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/ . ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting has also been provided.

  1. Ebolavirus Database: Gene and Protein Information Resource for Ebolaviruses

    Directory of Open Access Journals (Sweden)

    Rayapadi G. Swetha

    2016-01-01

    Full Text Available Ebola Virus Disease (EVD is a life-threatening haemorrhagic fever in humans. Even though there are many reports on EVD, the protein precursor functions and virulent factors of ebolaviruses remain poorly understood. Comparative analyses of Ebolavirus genomes will help in the identification of these important features. This prompted us to develop the Ebolavirus Database (EDB and we have provided links to various tools that will aid researchers to locate important regions in both the genomes and proteomes of Ebolavirus. The genomic analyses of ebolaviruses will provide important clues for locating the essential and core functional genes. The aim of EDB is to act as an integrated resource for ebolaviruses and we strongly believe that the database will be a useful tool for clinicians, microbiologists, health care workers, and bioscience researchers.

  2. Protein (Cyanobacteria) - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ut This Database Database Description Download License Update History of This Database Site Policy | Contact Us Protein (Cyanobacteria) - PGDBj - Ortholog DB | LSDB Archive ... ...List Contact us PGDBj - Ortholog DB Protein (Cyanobacteria) Data detail Data name Protein (Cyanobacteria) DO...switchLanguage; BLAST Search Image Search Home About Archive Update History Data

  3. Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins

    Directory of Open Access Journals (Sweden)

    Bradley Michael E

    2006-02-01

    Full Text Available Abstract Background When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. Results The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include 1 multiple sequence alignments, 2 mapping of alignment sites to crystal structure sites, 3 phylogenetic trees, 4 inferred ancestral sequences at internal tree nodes, and 5 amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. Conclusion We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural

  4. Development of human protein reference database as an initial platform for approaching systems biology in humans

    DEFF Research Database (Denmark)

    Peri, Suraj; Navarro, J Daniel; Amanchy, Ramars

    2003-01-01

    Human Protein Reference Database (HPRD) is an object database that integrates a wealth of information relevant to the function of human proteins in health and disease. Data pertaining to thousands of protein-protein interactions, posttranslational modifications, enzyme/substrate relationships...

  5. PROXiMATE: a database of mutant protein-protein complex thermodynamics and kinetics.

    Science.gov (United States)

    Jemimah, Sherlyn; Yugandhar, K; Michael Gromiha, M

    2017-09-01

    We have developed PROXiMATE, a database of thermodynamic data for more than 6000 missense mutations in 174 heterodimeric protein-protein complexes, supplemented with interaction network data from STRING database, solvent accessibility, sequence, structural and functional information, experimental conditions and literature information. Additional features include complex structure visualization, search and display options, download options and a provision for users to upload their data. The database is freely available at http://www.iitm.ac.in/bioinfo/PROXiMATE/ . The website is implemented in Python, and supports recent versions of major browsers such as IE10, Firefox, Chrome and Opera. gromiha@iitm.ac.in. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  6. Toxicological relationships between proteins obtained from protein target predictions of large toxicity databases

    International Nuclear Information System (INIS)

    Nigsch, Florian; Mitchell, John B.O.

    2008-01-01

    The combination of models for protein target prediction with large databases containing toxicological information for individual molecules allows the derivation of 'toxiclogical' profiles, i.e., to what extent are molecules of known toxicity predicted to interact with a set of protein targets. To predict protein targets of drug-like and toxic molecules, we built a computational multiclass model using the Winnow algorithm based on a dataset of protein targets derived from the MDL Drug Data Report. A 15-fold Monte Carlo cross-validation using 50% of each class for training, and the remaining 50% for testing, provided an assessment of the accuracy of that model. We retained the 3 top-ranking predictions and found that in 82% of all cases the correct target was predicted within these three predictions. The first prediction was the correct one in almost 70% of cases. A model built on the whole protein target dataset was then used to predict the protein targets for 150 000 molecules from the MDL Toxicity Database. We analysed the frequency of the predictions across the panel of protein targets for experimentally determined toxicity classes of all molecules. This allowed us to identify clusters of proteins related by their toxicological profiles, as well as toxicities that are related. Literature-based evidence is provided for some specific clusters to show the relevance of the relationships identified

  7. Protein Structural Change Data - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us PSCDB Protein Structural Change Data Data detail Data name Protein Structural Change Data DO...History of This Database Site Policy | Contact Us Protein Structural Change Data - PSCDB | LSDB Archive ...

  8. Efficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry

    OpenAIRE

    Pevzner, Pavel A.; Mulyukov, Zufar; Dancik, Vlado; Tang, Chris L

    2001-01-01

    Although protein identification by matching tandem mass spectra (MS/MS) against protein databases is a widespread tool in mass spectrometry, the question about reliability of such searches remains open. Absence of rigorous significance scores in MS/MS database search makes it difficult to discard random database hits and may lead to erroneous protein identification, particularly in the case of mutated or post-translationally modified peptides. This problem is especially important for high-thr...

  9. Databases

    Digital Repository Service at National Institute of Oceanography (India)

    Kunte, P.D.

    Information on bibliographic as well as numeric/textual databases relevant to coastal geomorphology has been included in a tabular form. Databases cover a broad spectrum of related subjects like coastal environment and population aspects, coastline...

  10. EKPD: a hierarchical database of eukaryotic protein kinases and protein phosphatases.

    Science.gov (United States)

    Wang, Yongbo; Liu, Zexian; Cheng, Han; Gao, Tianshun; Pan, Zhicheng; Yang, Qing; Guo, Anyuan; Xue, Yu

    2014-01-01

    We present here EKPD (http://ekpd.biocuckoo.org), a hierarchical database of eukaryotic protein kinases (PKs) and protein phosphatases (PPs), the key molecules responsible for the reversible phosphorylation of proteins that are involved in almost all aspects of biological processes. As extensive experimental and computational efforts have been carried out to identify PKs and PPs, an integrative resource with detailed classification and annotation information would be of great value for both experimentalists and computational biologists. In this work, we first collected 1855 PKs and 347 PPs from the scientific literature and various public databases. Based on previously established rationales, we classified all of the known PKs and PPs into a hierarchical structure with three levels, i.e. group, family and individual PK/PP. There are 10 groups with 149 families for the PKs and 10 groups with 33 families for the PPs. We constructed 139 and 27 Hidden Markov Model profiles for PK and PP families, respectively. Then we systematically characterized ∼50,000 PKs and >10,000 PPs in eukaryotes. In addition, >500 PKs and >400 PPs were computationally identified by ortholog search. Finally, the online service of the EKPD database was implemented in PHP + MySQL + JavaScript.

  11. SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803

    OpenAIRE

    Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

    2008-01-01

    Background Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. Description We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactio...

  12. Databases

    Directory of Open Access Journals (Sweden)

    Nick Ryan

    2004-01-01

    Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.

  13. Database of ligand-induced domain movements in enzymes

    Directory of Open Access Journals (Sweden)

    Hayward Steven

    2009-03-01

    Full Text Available Abstract Background Conformational change induced by the binding of a substrate or coenzyme is a poorly understood stage in the process of enzyme catalysed reactions. For enzymes that exhibit a domain movement, the conformational change can be clearly characterized and therefore the opportunity exists to gain an understanding of the mechanisms involved. The development of the non-redundant database of protein domain movements contains examples of ligand-induced domain movements in enzymes, but this valuable data has remained unexploited. Description The domain movements in the non-redundant database of protein domain movements are those found by applying the DynDom program to pairs of crystallographic structures contained in Protein Data Bank files. For each pair of structures cross-checking ligands in their Protein Data Bank files with the KEGG-LIGAND database and using methods that search for ligands that contact the enzyme in one conformation but not the other, the non-redundant database of protein domain movements was refined down to a set of 203 enzymes where a domain movement is apparently triggered by the binding of a functional ligand. For these cases, ligand binding information, including hydrogen bonds and salt-bridges between the ligand and specific residues on the enzyme is presented in the context of dynamical information such as the regions that form the dynamic domains, the hinge bending residues, and the hinge axes. Conclusion The presentation at a single website of data on interactions between a ligand and specific residues on the enzyme alongside data on the movement that these interactions induce, should lead to new insights into the mechanisms of these enzymes in particular, and help in trying to understand the general process of ligand-induced domain closure in enzymes. The website can be found at: http://www.cmp.uea.ac.uk/dyndom/enzymeList.do

  14. The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists.

    OpenAIRE

    Sven Heinicke; Michael S Livstone; Charles Lu; Rose Oughtred; Fan Kang; Samuel V Angiuoli; Owen White; David Botstein; Kara Dolinski

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic r...

  15. Conformational dynamics data bank: a database for conformational dynamics of proteins and supramolecular protein assemblies.

    Science.gov (United States)

    Kim, Do-Nyun; Altschuler, Josiah; Strong, Campbell; McGill, Gaël; Bathe, Mark

    2011-01-01

    The conformational dynamics data bank (CDDB, http://www.cdyn.org) is a database that aims to provide comprehensive results on the conformational dynamics of high molecular weight proteins and protein assemblies. Analysis is performed using a recently introduced coarse-grained computational approach that is applied to the majority of structures present in the electron microscopy data bank (EMDB). Results include equilibrium thermal fluctuations and elastic strain energy distributions that identify rigid versus flexible protein domains generally, as well as those associated with specific functional transitions, and correlations in molecular motions that identify molecular regions that are highly coupled dynamically, with implications for allosteric mechanisms. A practical web-based search interface enables users to easily collect conformational dynamics data in various formats. The data bank is maintained and updated automatically to include conformational dynamics results for new structural entries as they become available in the EMDB. The CDDB complements static structural information to facilitate the investigation and interpretation of the biological function of proteins and protein assemblies essential to cell function.

  16. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    Science.gov (United States)

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aide of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  17. Access to DNA and protein databases on the Internet.

    Science.gov (United States)

    Harper, R

    1994-02-01

    During the past year, the number of biological databases that can be queried via Internet has dramatically increased. This increase has resulted from the introduction of networking tools, such as Gopher and WAIS, that make it easy for research workers to index databases and make them available for on-line browsing. Biocomputing in the nineties will see the advent of more client/server options for the solution of problems in bioinformatics.

  18. MIPS: a database for protein sequences, homology data and yeast genome information.

    Science.gov (United States)

    Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

    1997-01-01

    The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database (,). MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/ ) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program () are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure () developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome (), the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser' (). A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498

  19. UNcleProt (Universal Nuclear Protein database of barley): The first nuclear protein database that distinguishes proteins from different phases of the cell cycle

    Czech Academy of Sciences Publication Activity Database

    Blavet, Nicolas; Uřinovská, J.; Jeřábková, Hana; Chamrád, I.; Vrána, Jan; Lenobel, R.; Beinhauer, D.; Šebela, M.; Doležel, Jaroslav; Petrovská, Beáta

    2017-01-01

    Roč. 8, č. 1 (2017), s. 70-80 ISSN 1949-1034 R&D Projects: GA ČR(CZ) GA14-28443S; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : cicer-arietinum l. * rice oryza-sativa * chromatin-associated protein s * proteomic analysis * mitotic chromosomes * dehydration * localization * chickpea * network * phosphoproteome * barley * cell cycle * database * flow-cytometry * localization * mass spectrometry * nuclear proteome * nucleus Subject RIV: CE - Biochemistry OBOR OECD: Cell biology Impact factor: 2.387, year: 2016

  20. PACSY, a relational database management system for protein structure and chemical shift analysis.

    Science.gov (United States)

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo; Lee, Weontae; Markley, John L

    2012-10-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.

  1. PACSY, a relational database management system for protein structure and chemical shift analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Woonghee, E-mail: whlee@nmrfam.wisc.edu [University of Wisconsin-Madison, National Magnetic Resonance Facility at Madison, and Biochemistry Department (United States); Yu, Wookyung [Center for Proteome Biophysics, Pusan National University, Department of Physics (Korea, Republic of); Kim, Suhkmann [Pusan National University, Department of Chemistry and Chemistry Institute for Functional Materials (Korea, Republic of); Chang, Iksoo [Center for Proteome Biophysics, Pusan National University, Department of Physics (Korea, Republic of); Lee, Weontae, E-mail: wlee@spin.yonsei.ac.kr [Yonsei University, Structural Biochemistry and Molecular Biophysics Laboratory, Department of Biochemistry (Korea, Republic of); Markley, John L., E-mail: markley@nmrfam.wisc.edu [University of Wisconsin-Madison, National Magnetic Resonance Facility at Madison, and Biochemistry Department (United States)

    2012-10-15

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.eduhttp://pacsy.nmrfam.wisc.edu.

  2. PACSY, a relational database management system for protein structure and chemical shift analysis

    Science.gov (United States)

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo

    2012-01-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu. PMID:22903636

  3. PACSY, a relational database management system for protein structure and chemical shift analysis

    International Nuclear Information System (INIS)

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo; Lee, Weontae; Markley, John L.

    2012-01-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.eduhttp://pacsy.nmrfam.wisc.edu.

  4. Functional nonredundancy of elephants in a disturbed tropical forest.

    Science.gov (United States)

    Sekar, Nitin; Lee, Chia-Lo; Sukumar, Raman

    2017-10-01

    Conservation efforts are often motivated by the threat of global extinction. Yet if conservationists had more information suggesting that extirpation of individual species could lead to undesirable ecological effects, they might more frequently attempt to protect or restore such species across their ranges even if they were not globally endangered. Scientists have seldom measured or quantitatively predicted the functional consequences of species loss, even for large, extinction-prone species that theory suggests should be functionally unique. We measured the contribution of Asian elephants (Elephas maximus) to the dispersal of 3 large-fruited species in a disturbed tropical moist forest and predicted the extent to which alternative dispersers could compensate for elephants in their absence. We created an empirical probability model with data on frugivory and seed dispersal from Buxa Tiger Reserve, India. These data were used to estimate the proportion of seeds consumed by elephants and other frugivores that survive handling and density-dependent processes (Janzen-Connell effects and conspecific intradung competition) and germinate. Without compensation, the number of seeds dispersed and surviving density-dependent effects decreased 26% (Artocarpus chaplasha), 42% (Careya arborea), and 72% (Dillenia indica) when elephants were absent from the ecosystem. Compensatory fruit removal by other animals substantially ameliorated these losses. For instance, reductions in successful dispersal of D. indica were as low as 23% when gaur (Bos gaurus) persisted, but median dispersal distance still declined from 30% (C. arborea) to 90% (A. chaplasha) without elephants. Our results support the theory that the largest animal species in an ecosystem have nonredundant ecological functionality and that their extirpation is likely to lead to the deterioration of ecosystem processes such as seed dispersal. This effect is likely accentuated by the overall defaunation of many tropical

  5. The PANTHER database of protein families, subfamilies, functions and pathways

    OpenAIRE

    Mi, Huaiyu; Lazareva-Ulitsky, Betty; Loo, Rozina; Kejariwal, Anish; Vandergriff, Jody; Rabkin, Steven; Guo, Nan; Muruganujan, Anushya; Doremieux, Olivier; Campbell, Michael J.; Kitano, Hiroaki; Thomas, Paul D.

    2004-01-01

    PANTHER is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise. These subfamilies model the divergence of specific functions within protein families, allowing more accurate association with function (ontology terms and pathways), as well as inference of amino acids important for functional specificity. Hidden Markov models (HMMs) are built for each family and subfamily for classifying additional protein sequences. The l...

  6. SHEETSPAIR: A Database of Amino Acid Pairs in Protein Sheet Structures

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    2007-10-01

    Full Text Available Within folded strands of a protein, amino acids (AAs on every adjacent two strands form a pair of AAs. To explore the interactions between strands in a protein sheet structure, we have established an Internet-accessible relational database named SheetsPairs based on SQL Server 2000. The database has collected AAs pairs in proteins with detailed information. Furthermore, it utilizes a non-freetext database structure to store protein sequences and a specific database table with a unique number to store strands, which provides more searching options and rapid and accurate access to data queries. An IIS web server has been set up for data retrieval through a custom web interface, which enables complex data queries. Also searchable are parallel or anti-parallel folded strands and the list of strands in a specified protein.

  7. Protein (Viridiplantae) - PGDBj - Ortholog DB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ase Description Download License Update History of This Database Site Policy | Contact Us Protein (Viridiplantae) - PGDBj - Ortholog DB | LSDB Archive ... ...List Contact us PGDBj - Ortholog DB Protein (Viridiplantae) Data detail Data name Protein (Viridiplantae) DO...switchLanguage; BLAST Search Image Search Home About Archive Update History Data

  8. ARCPHdb: A comprehensive protein database for SF1 and SF2 helicase from archaea.

    Science.gov (United States)

    Moukhtar, Mirna; Chaar, Wafi; Abdel-Razzak, Ziad; Khalil, Mohamad; Taha, Samir; Chamieh, Hala

    2017-01-01

    Superfamily 1 and Superfamily 2 helicases, two of the largest helicase protein families, play vital roles in many biological processes including replication, transcription and translation. Study of helicase proteins in the model microorganisms of archaea have largely contributed to the understanding of their function, architecture and assembly. Based on a large phylogenomics approach, we have identified and classified all SF1 and SF2 protein families in ninety five sequenced archaea genomes. Here we developed an online webserver linked to a specialized protein database named ARCPHdb to provide access for SF1 and SF2 helicase families from archaea. ARCPHdb was implemented using MySQL relational database. Web interfaces were developed using Netbeans. Data were stored according to UniProt accession numbers, NCBI Ref Seq ID, PDB IDs and Entrez Databases. A user-friendly interactive web interface has been developed to browse, search and download archaeal helicase protein sequences, their available 3D structure models, and related documentation available in the literature provided by ARCPHdb. The database provides direct links to matching external databases. The ARCPHdb is the first online database to compile all protein information on SF1 and SF2 helicase from archaea in one platform. This database provides essential resource information for all researchers interested in the field. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. SCOWLP: a web-based database for detailed characterization and visualization of protein interfaces

    Directory of Open Access Journals (Sweden)

    Schroeder Michael

    2006-03-01

    Full Text Available Abstract Background Currently there is a strong need for methods that help to obtain an accurate description of protein interfaces in order to be able to understand the principles that govern molecular recognition and protein function. Many of the recent efforts to computationally identify and characterize protein networks extract protein interaction information at atomic resolution from the PDB. However, they pay none or little attention to small protein ligands and solvent. They are key components and mediators of protein interactions and fundamental for a complete description of protein interfaces. Interactome profiling requires the development of computational tools to extract and analyze protein-protein, protein-ligand and detailed solvent interaction information from the PDB in an automatic and comparative fashion. Adding this information to the existing one on protein-protein interactions will allow us to better understand protein interaction networks and protein function. Description SCOWLP (Structural Characterization Of Water, Ligands and Proteins is a user-friendly and publicly accessible web-based relational database for detailed characterization and visualization of the PDB protein interfaces. The SCOWLP database includes proteins, peptidic-ligands and interface water molecules as descriptors of protein interfaces. It contains currently 74,907 protein interfaces and 2,093,976 residue-residue interactions formed by 60,664 structural units (protein domains and peptidic-ligands and their interacting solvent. The SCOWLP web-server allows detailed structural analysis and comparisons of protein interfaces at atomic level by text query of PDB codes and/or by navigating a SCOP-based tree. It includes a visualization tool to interactively display the interfaces and label interacting residues and interface solvent by atomic physicochemical properties. SCOWLP is automatically updated with every SCOP release. Conclusion SCOWLP enriches

  10. PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    Directory of Open Access Journals (Sweden)

    Picard-Cloutier Aude

    2007-12-01

    Full Text Available Abstract Background In the "post-genome" era, mass spectrometry (MS has become an important method for the analysis of proteins and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.

  11. O-GLYCOBASE version 4.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Gupta, Ramneek; Birch, Hanne; Rapacki, Krzysztof

    1999-01-01

    O-GLYCBASE is a database of glycoproteins with O-linked glycosylation sites. Entries with at least one experimentally verified O-glycosylation site have been complied from protein sequence databases and literature. Each entry contains information about the glycan involved, the species, sequence, ...

  12. Composition of Overlapping Protein-Protein and Protein-Ligand Interfaces.

    Directory of Open Access Journals (Sweden)

    Ruzianisra Mohamed

    Full Text Available Protein-protein interactions (PPIs play a major role in many biological processes and they represent an important class of targets for therapeutic intervention. However, targeting PPIs is challenging because often no convenient natural substrates are available as starting point for small-molecule design. Here, we explored the characteristics of protein interfaces in five non-redundant datasets of 174 protein-protein (PP complexes, and 161 protein-ligand (PL complexes from the ABC database, 436 PP complexes, and 196 PL complexes from the PIBASE database and a dataset of 89 PL complexes from the Timbal database. In all cases, the small molecule ligands must bind at the respective PP interface. We observed similar amino acid frequencies in all three datasets. Remarkably, also the characteristics of PP contacts and overlapping PL contacts are highly similar.

  13. Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    Science.gov (United States)

    Truong, Kevin; Ikura, Mitsuhiko

    2003-05-06

    Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.

  14. Integrated Controlling System and Unified Database for High Throughput Protein Crystallography Experiments

    International Nuclear Information System (INIS)

    Gaponov, Yu.A.; Igarashi, N.; Hiraki, M.; Sasajima, K.; Matsugaki, N.; Suzuki, M.; Kosuge, T.; Wakatsuki, S.

    2004-01-01

    An integrated controlling system and a unified database for high throughput protein crystallography experiments have been developed. Main features of protein crystallography experiments (purification, crystallization, crystal harvesting, data collection, data processing) were integrated into the software under development. All information necessary to perform protein crystallography experiments is stored (except raw X-ray data that are stored in a central data server) in a MySQL relational database. The database contains four mutually linked hierarchical trees describing protein crystals, data collection of protein crystal and experimental data processing. A database editor was designed and developed. The editor supports basic database functions to view, create, modify and delete user records in the database. Two search engines were realized: direct search of necessary information in the database and object oriented search. The system is based on TCP/IP secure UNIX sockets with four predefined sending and receiving behaviors, which support communications between all connected servers and clients with remote control functions (creating and modifying data for experimental conditions, data acquisition, viewing experimental data, and performing data processing). Two secure login schemes were designed and developed: a direct method (using the developed Linux clients with secure connection) and an indirect method (using the secure SSL connection using secure X11 support from any operating system with X-terminal and SSH support). A part of the system has been implemented on a new MAD beam line, NW12, at the Photon Factory Advanced Ring for general user experiments

  15. Social network ties beyond nonredundancy: an experimental investigation of the effect of knowledge content and tie strength on creativity.

    Science.gov (United States)

    Perry-Smith, Jill E

    2014-09-01

    Social network research emphasizes the access to nonredundant knowledge content that network ties provide. I suggest that some content is more beneficial than others and that tie strength may affect creativity for reasons other than the associated structure. That is, tie strength may affect how individuals process nonredundant knowledge. I investigate 2 types of knowledge content--information (i.e., facts or data) and frames (i.e., interpretations or impressions)--and explore whether tie strength influences their effect on creativity. Drawing on creativity theory, I employ an experimental design to provide greater theoretical clarity and to isolate causality. According to the results from 2 studies, distinct frames received from contacts facilitate creativity, but the effect of distinct information is more complex. When individuals receive distinct information from strong ties, it constrains creativity compared to distinct frames. Content from weak ties appears to facilitate creativity across all scenarios. The results of mediated moderation analysis indicate the effect of framing versus information for strong ties is driven by decision-making time, as an indicator of cognitive expansion. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  16. ARAMEMNON, a novel database for Arabidopsis integral membrane proteins

    DEFF Research Database (Denmark)

    Schwacke, Rainer; Schneider, Anja; van der Graaff, Eric

    2003-01-01

    spans and are possibly linked to transport functions. The ARAMEMNON DB enables direct comparison of the predictions of seven different TM span computation programs and the predictions of subcellular localization by eight signal peptide recognition programs. A special function displays the proteins...

  17. The reactive metabolite target protein database (TPDB)--a web-accessible resource.

    Science.gov (United States)

    Hanzlik, Robert P; Koen, Yakov M; Theertham, Bhargav; Dong, Yinghua; Fang, Jianwen

    2007-03-16

    The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common targets of multiple toxins. The literature of this field is widely scattered but expanding rapidly, suggesting the need for a comprehensive, searchable database of reactive metabolite target proteins. The Reactive Metabolite Target Protein Database (TPDB) is a comprehensive, curated, searchable, documented compilation of publicly available information on the protein targets of reactive metabolites of 18 well-studied chemicals and drugs of known toxicity. TPDB software enables i) string searches for author names and proteins names/synonyms, ii) more complex searches by selecting chemical compound, animal species, target tissue and protein names/synonyms from pull-down menus, and iii) commonality searches over multiple chemicals. Tabulated search results provide information, references and links to other databases. The TPDB is a unique on-line compilation of information on the covalent modification of cellular proteins by reactive metabolites of chemicals and drugs. Its comprehensiveness and searchability should facilitate the elucidation of mechanisms of reactive metabolite toxicity. The database is freely available at http://tpdb.medchem.ku.edu/tpdb.html.

  18. The reactive metabolite target protein database (TPDB – a web-accessible resource

    Directory of Open Access Journals (Sweden)

    Dong Yinghua

    2007-03-01

    Full Text Available Abstract Background The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common targets of multiple toxins. The literature of this field is widely scattered but expanding rapidly, suggesting the need for a comprehensive, searchable database of reactive metabolite target proteins. Description The Reactive Metabolite Target Protein Database (TPDB is a comprehensive, curated, searchable, documented compilation of publicly available information on the protein targets of reactive metabolites of 18 well-studied chemicals and drugs of known toxicity. TPDB software enables i string searches for author names and proteins names/synonyms, ii more complex searches by selecting chemical compound, animal species, target tissue and protein names/synonyms from pull-down menus, and iii commonality searches over multiple chemicals. Tabulated search results provide information, references and links to other databases. Conclusion The TPDB is a unique on-line compilation of information on the covalent modification of cellular proteins by reactive metabolites of chemicals and drugs. Its comprehensiveness and searchability should facilitate the elucidation of mechanisms of reactive metabolite toxicity. The database is freely available at http://tpdb.medchem.ku.edu/tpdb.html

  19. iPfam: a database of protein family and domain interactions found in the Protein Data Bank.

    Science.gov (United States)

    Finn, Robert D; Miller, Benjamin L; Clements, Jody; Bateman, Alex

    2014-01-01

    The database iPfam, available at http://ipfam.org, catalogues Pfam domain interactions based on known 3D structures that are found in the Protein Data Bank, providing interaction data at the molecular level. Previously, the iPfam domain-domain interaction data was integrated within the Pfam database and website, but it has now been migrated to a separate database. This allows for independent development, improving data access and giving clearer separation between the protein family and interactions datasets. In addition to domain-domain interactions, iPfam has been expanded to include interaction data for domain bound small molecule ligands. Functional annotations are provided from source databases, supplemented by the incorporation of Wikipedia articles where available. iPfam (version 1.0) contains >9500 domain-domain and 15 500 domain-ligand interactions. The new website provides access to this data in a variety of ways, including interactive visualizations of the interaction data.

  20. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    Science.gov (United States)

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on

  1. ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval.

    Science.gov (United States)

    Wang, Jingyan; Gao, Xin; Wang, Quanquan; Li, Yongping

    2012-05-08

    The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. This kind of pairwise measures suffer from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy of the retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning the contextual dissimilarity/similarity measures can improve the retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database. In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is, for a pair of proteins (i, j), if their context N(i) and N(j) is similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the context N(i) and N(j).Moreover, we divide the context to hierarchial sub-context and get the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update the Protein Hierarchial

  2. ProDis-ContSHC: Learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-05-08

    Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. This kind of pairwise measures suffer from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy of the retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning the contextual dissimilarity/similarity measures can improve the retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database.Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is, for a pair of proteins (i, j), if their context N (i) and N (j) is similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the context N (i) and N (j). Moreover, we divide the context to hierarchial sub-context and get the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update

  3. An update of the DEF database of protein fold class predictions

    DEFF Research Database (Denmark)

    Reczko, Martin; Karras, Dimitris; Bohr, Henrik

    1997-01-01

    An update is given on the Database of Expected Fold classes (DEF) that contains a collection of fold-class predictions made from protein sequences and a mail server that provides new predictions for new sequences. To any given sequence one of 49 fold-classes is chosen to classify the structure re...... related to the sequence with high accuracy. The updated predictions system is developed using data from the new version of the 3D-ALI database of aligned protein structures and thus is giving more reliable and more detailed predictions than the previous DEF system.......An update is given on the Database of Expected Fold classes (DEF) that contains a collection of fold-class predictions made from protein sequences and a mail server that provides new predictions for new sequences. To any given sequence one of 49 fold-classes is chosen to classify the structure...

  4. Regnase-1 and Roquin Nonredundantly Regulate Th1 Differentiation Causing Cardiac Inflammation and Fibrosis.

    Science.gov (United States)

    Cui, Xiaotong; Mino, Takashi; Yoshinaga, Masanori; Nakatsuka, Yoshinari; Hia, Fabian; Yamasoba, Daichi; Tsujimura, Tohru; Tomonaga, Keizo; Suzuki, Yutaka; Uehata, Takuya; Takeuchi, Osamu

    2017-12-15

    Regnase-1 and Roquin are RNA binding proteins that are essential for degradation of inflammatory mRNAs and maintenance of immune homeostasis. Although deficiency of either of the proteins leads to enhanced T cell activation, their functional relationship in T cells has yet to be clarified because of lethality upon mutation of both Regnase-1 and Roquin. By using a Regnase-1 conditional allele, we show that mutations of both Regnase-1 and Roquin in T cells leads to massive lymphocyte activation. In contrast, mutation of either Regnase-1 or Roquin affected T cell activation to a lesser extent than the double mutation, indicating that Regnase-1 and Roquin function nonredundantly in T cells. Interestingly, Regnase-1 and Roquin double-mutant mice suffered from severe inflammation and early formation of fibrosis, especially in the heart, along with the increased expression of Ifng , but not Il4 or Il17a Consistently, mutation of both Regnase-1 and Roquin leads to a huge increase in the Th1, but not the Th2 or Th17, population in spleens compared with T cells with a single Regnase-1 or Roquin deficiency. Regnase-1 and Roquin are capable of repressing the expression of a group of mRNAs encoding factors involved in Th1 differentiation, such as Furin and Il12rb1 , via their 3' untranslated regions. Moreover, Regnase-1 is capable of repressing Roquin mRNA. This cross-regulation may contribute to the synergistic control of T cell activation/polarization. Collectively, our results demonstrate that Regnase-1 and Roquin maintain T cell immune homeostasis and regulate Th1 polarization synergistically. Copyright © 2017 by The American Association of Immunologists, Inc.

  5. VaProS: a database-integration approach for protein/genome information retrieval

    KAUST Repository

    Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R.; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

    2016-01-01

    Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  6. VaProS: a database-integration approach for protein/genome information retrieval

    KAUST Repository

    Gojobori, Takashi

    2016-12-24

    Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  7. HIP2: An online database of human plasma proteins from healthy individuals

    Directory of Open Access Journals (Sweden)

    Shen Changyu

    2008-04-01

    Full Text Available Abstract Background With the introduction of increasingly powerful mass spectrometry (MS techniques for clinical research, several recent large-scale MS proteomics studies have sought to characterize the entire human plasma proteome with a general objective for identifying thousands of proteins leaked from tissues in the circulating blood. Understanding the basic constituents, diversity, and variability of the human plasma proteome is essential to the development of sensitive molecular diagnosis and treatment monitoring solutions for future biomedical applications. Biomedical researchers today, however, do not have an integrated online resource in which they can search for plasma proteins collected from different mass spectrometry platforms, experimental protocols, and search software for healthy individuals. The lack of such a resource for comparisons has made it difficult to interpret proteomics profile changes in patients' plasma and to design protein biomarker discovery experiments. Description To aid future protein biomarker studies of disease and health from human plasma, we developed an online database, HIP2 (Healthy Human Individual's Integrated Plasma Proteome. The current version contains 12,787 protein entries linked to 86,831 peptide entries identified using different MS platforms. Conclusion This web-based database will be useful to biomedical researchers involved in biomarker discovery research. This database has been developed to be the comprehensive collection of healthy human plasma proteins, and has protein data captured in a relational database schema built to contain mappings of supporting peptide evidence from several high-quality and high-throughput mass-spectrometry (MS experimental data sets. Users can search for plasma protein/peptide annotations, peptide/protein alignments, and experimental/sample conditions with options for filter-based retrieval to achieve greater analytical power for discovery and validation.

  8. CPAD, Curated Protein Aggregation Database: A Repository of Manually Curated Experimental Data on Protein and Peptide Aggregation.

    Science.gov (United States)

    Thangakani, A Mary; Nagarajan, R; Kumar, Sandeep; Sakthivel, R; Velmurugan, D; Gromiha, M Michael

    2016-01-01

    Accurate distinction between peptide sequences that can form amyloid-fibrils or amorphous β-aggregates, identification of potential aggregation prone regions in proteins, and prediction of change in aggregation rate of a protein upon mutation(s) are critical to research on protein misfolding diseases, such as Alzheimer's and Parkinson's, as well as biotechnological production of protein based therapeutics. We have developed a Curated Protein Aggregation Database (CPAD), which has collected results from experimental studies performed by scientific community aimed at understanding protein/peptide aggregation. CPAD contains more than 2300 experimentally observed aggregation rates upon mutations in known amyloidogenic proteins. Each entry includes numerical values for the following parameters: change in rate of aggregation as measured by fluorescence intensity or turbidity, name and source of the protein, Uniprot and Protein Data Bank codes, single point as well as multiple mutations, and literature citation. The data in CPAD has been supplemented with five different types of additional information: (i) Amyloid fibril forming hexa-peptides, (ii) Amorphous β-aggregating hexa-peptides, (iii) Amyloid fibril forming peptides of different lengths, (iv) Amyloid fibril forming hexa-peptides whose crystal structures are available in the Protein Data Bank (PDB) and (v) Experimentally validated aggregation prone regions found in amyloidogenic proteins. Furthermore, CPAD is linked to other related databases and resources, such as Uniprot, Protein Data Bank, PUBMED, GAP, TANGO, WALTZ etc. We have set up a web interface with different search and display options so that users have the ability to get the data in multiple ways. CPAD is freely available at http://www.iitm.ac.in/bioinfo/CPAD/. The potential applications of CPAD have also been discussed.

  9. BtoxDB: a comprehensive database of protein structural data on toxin-antitoxin systems.

    Science.gov (United States)

    Barbosa, Luiz Carlos Bertucci; Garrido, Saulo Santesso; Marchetto, Reinaldo

    2015-03-01

    Toxin-antitoxin (TA) systems are diverse and abundant genetic modules in prokaryotic cells that are typically formed by two genes encoding a stable toxin and a labile antitoxin. Because TA systems are able to repress growth or kill cells and are considered to be important actors in cell persistence (multidrug resistance without genetic change), these modules are considered potential targets for alternative drug design. In this scenario, structural information for the proteins in these systems is highly valuable. In this report, we describe the development of a web-based system, named BtoxDB, that stores all protein structural data on TA systems. The BtoxDB database was implemented as a MySQL relational database using PHP scripting language. Web interfaces were developed using HTML, CSS and JavaScript. The data were collected from the PDB, UniProt and Entrez databases. These data were appropriately filtered using specialized literature and our previous knowledge about toxin-antitoxin systems. The database provides three modules ("Search", "Browse" and "Statistics") that enable searches, acquisition of contents and access to statistical data. Direct links to matching external databases are also available. The compilation of all protein structural data on TA systems in one platform is highly useful for researchers interested in this content. BtoxDB is publicly available at http://www.gurupi.uft.edu.br/btoxdb. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. PDTD: a web-accessible protein database for drug target identification

    Directory of Open Access Journals (Sweden)

    Gao Zhenting

    2008-02-01

    Full Text Available Abstract Background Target identification is important for modern drug discovery. With the advances in the development of molecular docking, potential binding proteins may be discovered by docking a small molecule to a repository of proteins with three-dimensional (3D structures. To complete this task, a reverse docking program and a drug target database with 3D structures are necessary. To this end, we have developed a web server tool, TarFisDock (Target Fishing Docking http://www.dddc.ac.cn/tarfisdock, which has been used widely by others. Recently, we have constructed a protein target database, Potential Drug Target Database (PDTD, and have integrated PDTD with TarFisDock. This combination aims to assist target identification and validation. Description PDTD is a web-accessible protein database for in silico target identification. It currently contains >1100 protein entries with 3D structures presented in the Protein Data Bank. The data are extracted from the literatures and several online databases such as TTD, DrugBank and Thomson Pharma. The database covers diverse information of >830 known or potential drug targets, including protein and active sites structures in both PDB and mol2 formats, related diseases, biological functions as well as associated regulating (signaling pathways. Each target is categorized by both nosology and biochemical function. PDTD supports keyword search function, such as PDB ID, target name, and disease name. Data set generated by PDTD can be viewed with the plug-in of molecular visualization tools and also can be downloaded freely. Remarkably, PDTD is specially designed for target identification. In conjunction with TarFisDock, PDTD can be used to identify binding proteins for small molecules. The results can be downloaded in the form of mol2 file with the binding pose of the probe compound and a list of potential binding targets according to their ranking scores. Conclusion PDTD serves as a comprehensive and

  11. STITCH 2: an interaction network database for small molecules and proteins

    DEFF Research Database (Denmark)

    Kuhn, Michael; Szklarczyk, Damian; Franceschini, Andrea

    2010-01-01

    Over the last years, the publicly available knowledge on interactions between small molecules and proteins has been steadily increasing. To create a network of interactions, STITCH aims to integrate the data dispersed over the literature and various databases of biological pathways, drug......-target relationships and binding affinities. In STITCH 2, the number of relevant interactions is increased by incorporation of BindingDB, PharmGKB and the Comparative Toxicogenomics Database. The resulting network can be explored interactively or used as the basis for large-scale analyses. To facilitate links to other...... chemical databases, we adopt InChIKeys that allow identification of chemicals with a short, checksum-like string. STITCH 2.0 connects proteins from 630 organisms to over 74,000 different chemicals, including 2200 drugs. STITCH can be accessed at http://stitch.embl.de/....

  12. Interleukin-1beta induced changes in the protein expression of rat islets: a computerized database

    DEFF Research Database (Denmark)

    Andersen, H U; Fey, S J; Larsen, Peter Mose

    1997-01-01

    as well as the intracellular mechanisms of action of interleukin 1-mediated beta-cell cytotoxicity are unknown. However, previous studies have found an association of beta-cell destruction with alterations in protein synthesis. Thus, two-dimensional (2-D) gel electrophoresis of pancreatic islet proteins...... may be an important tool facilitating studies of the molecular pathogenesis of insulin-dependent diabetes mellitus. 2-D gel electrophoresis of islet proteins may lead to (i) the determination of qualitative and quantitative changes in specific islet proteins induced by cytokines, (ii......) the determination of the effects of agents modulating cytokine action, and (iii) the identification of primary islet protein antigen(s) initiating the immune destruction of the beta-cells. Therefore, the aim of this study was to create databases (DB) of all reproducibly detectable protein spots on 10% and 15...

  13. Protein backbone angle restraints from searching a database for chemical shift and sequence homology

    Energy Technology Data Exchange (ETDEWEB)

    Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

    1999-03-15

    Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13C{alpha}, 13C{beta}, 13C', 1H{alpha} and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar {phi} and {psi} backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15 deg. Approximately 3% of the predictions made by TALOS are found to be in error.

  14. UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

    Science.gov (United States)

    Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

    2016-01-04

    The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2007-10-01

    Full Text Available Abstract Background In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP, generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.

  16. PATtyFams: Protein families for the microbial genomes in the PATRIC database

    Directory of Open Access Journals (Sweden)

    James J Davis

    2016-02-01

    Full Text Available The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL. This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.

  17. muBLASTP: database-indexed protein sequence search on multicore CPUs.

    Science.gov (United States)

    Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

    2016-11-04

    The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to different challenges and characteristics between query indexing and database indexing, the existing techniques for query-indexed search cannot be used into database indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers identical hits returned to NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedups for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for protein database and associated optimizations in BLASTP algorithm, we re-factored BLASTP algorithm for modern multicore processors that achieves much higher throughput with acceptable memory footprint for the database index.

  18. Fly-DPI: database of protein interactomes for D. melanogaster in the approach of systems biology

    Directory of Open Access Journals (Sweden)

    Lin Chieh-Hua

    2006-12-01

    Full Text Available Abstract Background Proteins control and mediate many biological activities of cells by interacting with other protein partners. This work presents a statistical model to predict protein interaction networks of Drosophila melanogaster based on insight into domain interactions. Results Three high-throughput yeast two-hybrid experiments and the collection in FlyBase were used as our starting datasets. The co-occurrences of domains in these interactive events are converted into a probability score of domain-domain interaction. These scores are used to infer putative interaction among all available open reading frames (ORFs of fruit fly. Additionally, the likelihood function is used to estimate all potential protein-protein interactions. All parameters are successfully iterated and MLE is obtained for each pair of domains. Additionally, the maximized likelihood reaches its converged criteria and maintains the probability stable. The hybrid model achieves a high specificity with a loss of sensitivity, suggesting that the model may possess major features of protein-protein interactions. Several putative interactions predicted by the proposed hybrid model are supported by literatures, while experimental data with a low probability score indicate an uncertain reliability and require further proof of interaction. Fly-DPI is the online database used to present this work. It is an integrated proteomics tool with comprehensive protein annotation information from major databases as well as an effective means of predicting protein-protein interactions. As a novel search strategy, the ping-pong search is a naïve path map between two chosen proteins based on pre-computed shortest paths. Adopting effective filtering strategies will facilitate researchers in depicting the bird's eye view of the network of interest. Fly-DPI can be accessed at http://flydpi.nhri.org.tw. Conclusion This work provides two reference systems, statistical and biological, to evaluate

  19. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.; Dimitrakopoulos, Christos M.; Likothanassis, Spiridon D.; Kleftogiannis, Dimitrios A.; Moschopoulos, Charalampos N.; Alexakos, Christos; Papadimitriou, Stergios; Mavroudi, Seferina P.

    2013-01-01

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about

  20. CPLA 1.0: an integrated database of protein lysine acetylation.

    Science.gov (United States)

    Liu, Zexian; Cao, Jun; Gao, Xinjiao; Zhou, Yanhong; Wen, Longping; Yang, Xiangjiao; Yao, Xuebiao; Ren, Jian; Xue, Yu

    2011-01-01

    As a reversible post-translational modification (PTM) discovered decades ago, protein lysine acetylation was known for its regulation of transcription through the modification of histones. Recent studies discovered that lysine acetylation targets broad substrates and especially plays an essential role in cellular metabolic regulation. Although acetylation is comparable with other major PTMs such as phosphorylation, an integrated resource still remains to be developed. In this work, we presented the compendium of protein lysine acetylation (CPLA) database for lysine acetylated substrates with their sites. From the scientific literature, we manually collected 7151 experimentally identified acetylation sites in 3311 targets. We statistically studied the regulatory roles of lysine acetylation by analyzing the Gene Ontology (GO) and InterPro annotations. Combined with protein-protein interaction information, we systematically discovered a potential human lysine acetylation network (HLAN) among histone acetyltransferases (HATs), substrates and histone deacetylases (HDACs). In particular, there are 1862 triplet relationships of HAT-substrate-HDAC retrieved from the HLAN, at least 13 of which were previously experimentally verified. The online services of CPLA database was implemented in PHP + MySQL + JavaScript, while the local packages were developed in JAVA 1.5 (J2SE 5.0). The CPLA database is freely available for all users at: http://cpla.biocuckoo.org.

  1. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization

    Directory of Open Access Journals (Sweden)

    Kuczmarski Thomas A

    2006-10-01

    Full Text Available Abstract Background MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. Description MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. Conclusion MannDB comprises a large number of genomes and comprehensive protein

  2. Two-dimensional gel human protein databases offer a systematic approach to the study of cell proliferation and differentiation

    DEFF Research Database (Denmark)

    Celis, julio E.; Gesser, Borbala; Dejgaard, Kurt

    1989-01-01

    Human cellular protein databases have been established using computer-analyzed 2D gel electrophoresis. These databases, which include information on various properties of proteins, offer a global approach to the study of regulation of cell proliferation and differentiation. Furthermore, thanks...

  3. Two dimensional gel human protein databases offer a systematic approach to the study of cell proliferation and differentiation

    DEFF Research Database (Denmark)

    Celis, J E; Gesser, B; Dejgaard, K

    1989-01-01

    Human cellular protein databases have been established using computer-analyzed 2D gel electrophoresis. These databases, which include information on various properties of proteins, offer a global approach to the study of regulation of cell proliferation and differentiation. Furthermore, thanks to...

  4. ZifBASE: a database of zinc finger proteins and associated resources

    Directory of Open Access Journals (Sweden)

    Punetha Ankita

    2009-09-01

    databases like UniprotKB, PDB, ModBase and Protein Model Portal and PubMed for making it more informative. Conclusion A database is established to maintain the information of the sequence features, including the class, framework, number of fingers, residues, position, recognition site and physio-chemical properties (molecular weight, isoelectric point of both natural and engineered zinc finger proteins and dissociation constant of few. ZifBASE can provide more effective and efficient way of accessing the zinc finger protein sequences and their target binding sites with the links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface http://web.iitd.ac.in/~sundar/zifbase.

  5. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    Science.gov (United States)

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (&100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a mySQL database with a user-friendly JAVA GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .

  6. dbPAF: an integrative database of protein phosphorylation in animals and fungi.

    Science.gov (United States)

    Ullah, Shahid; Lin, Shaofeng; Xu, Yang; Deng, Wankun; Ma, Lili; Zhang, Ying; Liu, Zexian; Xue, Yu

    2016-03-24

    Protein phosphorylation is one of the most important post-translational modifications (PTMs) and regulates a broad spectrum of biological processes. Recent progresses in phosphoproteomic identifications have generated a flood of phosphorylation sites, while the integration of these sites is an urgent need. In this work, we developed a curated database of dbPAF, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae. From the scientific literature and public databases, we totally collected and integrated 54,148 phosphoproteins with 483,001 phosphorylation sites. Multiple options were provided for accessing the data, while original references and other annotations were also present for each phosphoprotein. Based on the new data set, we computationally detected significantly over-represented sequence motifs around phosphorylation sites, predicted potential kinases that are responsible for the modification of collected phospho-sites, and evolutionarily analyzed phosphorylation conservation states across different species. Besides to be largely consistent with previous reports, our results also proposed new features of phospho-regulation. Taken together, our database can be useful for further analyses of protein phosphorylation in human and other model organisms. The dbPAF database was implemented in PHP + MySQL and freely available at http://dbpaf.biocuckoo.org.

  7. Merging in-silico and in vitro salivary protein complex partners using the STRING database: A tutorial.

    Science.gov (United States)

    Crosara, Karla Tonelli Bicalho; Moffa, Eduardo Buozi; Xiao, Yizhi; Siqueira, Walter Luiz

    2018-01-16

    Protein-protein interaction is a common physiological mechanism for protection and actions of proteins in an organism. The identification and characterization of protein-protein interactions in different organisms is necessary to better understand their physiology and to determine their efficacy. In a previous in vitro study using mass spectrometry, we identified 43 proteins that interact with histatin 1. Six previously documented interactors were confirmed and 37 novel partners were identified. In this tutorial, we aimed to demonstrate the usefulness of the STRING database for studying protein-protein interactions. We used an in-silico approach along with the STRING database (http://string-db.org/) and successfully performed a fast simulation of a novel constructed histatin 1 protein-protein network, including both the previously known and the predicted interactors, along with our newly identified interactors. Our study highlights the advantages and importance of applying bioinformatics tools to merge in-silico tactics with experimental in vitro findings for rapid advancement of our knowledge about protein-protein interactions. Our findings also indicate that bioinformatics tools such as the STRING protein network database can help predict potential interactions between proteins and thus serve as a guide for future steps in our exploration of the Human Interactome. Our study highlights the usefulness of the STRING protein database for studying protein-protein interactions. The STRING database can collect and integrate data about known and predicted protein-protein associations from many organisms, including both direct (physical) and indirect (functional) interactions, in an easy-to-use interface. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. HippDB: a database of readily targeted helical protein-protein interactions.

    Science.gov (United States)

    Bergey, Christina M; Watkins, Andrew M; Arora, Paramjit S

    2013-11-01

    HippDB catalogs every protein-protein interaction whose structure is available in the Protein Data Bank and which exhibits one or more helices at the interface. The Web site accepts queries on variables such as helix length and sequence, and it provides computational alanine scanning and change in solvent-accessible surface area values for every interfacial residue. HippDB is intended to serve as a starting point for structure-based small molecule and peptidomimetic drug development. HippDB is freely available on the web at http://www.nyu.edu/projects/arora/hippdb. The Web site is implemented in PHP, MySQL and Apache. Source code freely available for download at http://code.google.com/p/helidb, implemented in Perl and supported on Linux. arora@nyu.edu.

  9. DSFL database: A hub of target proteins of Leishmania sp. to combat leishmaniasis

    Directory of Open Access Journals (Sweden)

    Ameer Khusro

    2017-07-01

    Full Text Available Leishmaniasis is a vector-borne chronic infectious tropical dermal disease caused by the protozoa parasite of the genus Leishmania that causes high mortality globally. Among three different clinical forms of leishmaniasis, visceral leishmaniasis (VL or kala-azar is a systemic public health disease with high morbidity and mortality in developing countries, caused by Leishmania donovani, Leishmania infantum or Leishmania chagasi. Unfortunately, there is no vaccine available till date for the treatment of leishmaniasis. On the other hand, the therapeutics approved to treat this fatal disease is expensive, toxic, and associated with serious side effects. Furthermore, the emergence of drug-resistant Leishmania parasites in most endemic countries due to the incessant utilization of existing drugs is a major concern at present. Drug Search for Leishmaniasis (DSFL is a unique database that involves 50 crystallized target proteins of varied Leishmania sp. in order to develop new drugs in future by interacting several antiparasitic compounds or molecules with specific protein through computational tools. The structure of target protein from different Leishmania sp. is available in this database. In this review, we spotlighted not only the current global status of leishmaniasis in brief but also detailed information about target proteins of various Leishmania sp. available in DSFL. DSFL has created a new expectation for mankind in order to combat leishmaniasis by targeting parasitic proteins and commence a new era to get rid of drug resistance parasites. The database will substantiate to be a worthwhile project for further development of new, non-toxic, and cost-effective antileishmanial drugs as targeted therapies using in vitro/in vivo assays.

  10. The master two-dimensional gel database of human AMA cell proteins: towards linking protein and genome sequence and mapping information (update 1991)

    DEFF Research Database (Denmark)

    Celis, J E; Leffers, H; Rasmussen, H H

    1991-01-01

    autoantigens" and "cDNAs". For convenience we have included an alphabetical list of all known proteins recorded in this database. In the long run, the main goal of this database is to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture......The master two-dimensional gel database of human AMA cells currently lists 3801 cellular and secreted proteins, of which 371 cellular polypeptides (306 IEF; 65 NEPHGE) were added to the master images during the last 10 months. These include: (i) very basic and acidic proteins that do not focus...

  11. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

    Full Text Available Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS v.2 database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~ 38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores

  12. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology

    International Nuclear Information System (INIS)

    Shen Yang; Bax, Ad

    2007-01-01

    Chemical shifts of nuclei in or attached to a protein backbone are exquisitely sensitive to their local environment. A computer program, SPARTA, is described that uses this correlation with local structure to predict protein backbone chemical shifts, given an input three-dimensional structure, by searching a newly generated database for triplets of adjacent residues that provide the best match in φ/ψ/χ 1 torsion angles and sequence similarity to the query triplet of interest. The database contains 15 N, 1 H N , 1 H α , 13 C α , 13 C β and 13 C' chemical shifts for 200 proteins for which a high resolution X-ray (≤2.4 A) structure is available. The relative importance of the weighting factors for the φ/ψ/χ 1 angles and sequence similarity was optimized empirically. The weighted, average secondary shifts of the central residues in the 20 best-matching triplets, after inclusion of nearest neighbor, ring current, and hydrogen bonding effects, are used to predict chemical shifts for the protein of known structure. Validation shows good agreement between the SPARTA-predicted and experimental shifts, with standard deviations of 2.52, 0.51, 0.27, 0.98, 1.07 and 1.08 ppm for 15 N, 1 H N , 1 H α , 13 C α , 13 C β and 13 C', respectively, including outliers

  13. Gene composer: database software for protein construct design, codon engineering, and gene synthesis.

    Science.gov (United States)

    Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

    2009-04-21

    To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease

  14. Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

    Directory of Open Access Journals (Sweden)

    Mixon Mark

    2009-04-01

    Full Text Available Abstract Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene

  15. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

    Directory of Open Access Journals (Sweden)

    Bányai László

    2008-08-01

    Full Text Available Abstract Background Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii co-occurrence of extracellular and nuclear domains; (iv violation of domain integrity; (v chimeras encoded by two or more genes located on different chromosomes. Results Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis and two protostome species (Caenorhabditis elegans and Drosophila melanogaster have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON

  16. Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database

    KAUST Repository

    Komatsu, Setsuko

    2017-05-10

    The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max ‘Enrei’). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. Biological significanceThe Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. It accounts for approximately 80% of all

  17. Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database.

    Science.gov (United States)

    Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi

    2017-06-23

    The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. It accounts for approximately 80% of all predicted proteins from

  18. Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database

    KAUST Repository

    Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi

    2017-01-01

    The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max ‘Enrei’). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. Biological significanceThe Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. It accounts for approximately 80% of all

  19. The Mitochondrial Protein Atlas: A Database of Experimentally Verified Information on the Human Mitochondrial Proteome.

    Science.gov (United States)

    Godin, Noa; Eichler, Jerry

    2017-09-01

    Given its central role in various biological systems, as well as its involvement in numerous pathologies, the mitochondrion is one of the best-studied organelles. However, although the mitochondrial genome has been extensively investigated, protein-level information remains partial, and in many cases, hypothetical. The Mitochondrial Protein Atlas (MPA; URL: lifeserv.bgu.ac.il/wb/jeichler/MPA ) is a database that provides a complete, manually curated inventory of only experimentally validated human mitochondrial proteins. The MPA presently contains 911 unique protein entries, each of which is associated with at least one experimentally validated and referenced mitochondrial localization. The MPA also contains experimentally validated and referenced information defining function, structure, involvement in pathologies, interactions with other MPA proteins, as well as the method(s) of analysis used in each instance. Connections to relevant external data sources are offered for each entry, including links to NCBI Gene, PubMed, and Protein Data Bank. The MPA offers a prototype for other information sources that allow for a distinction between what has been confirmed and what remains to be verified experimentally.

  20. The establishment of a database of Italian feeds for the Cornell Net Carbohydrate and Protein System

    Directory of Open Access Journals (Sweden)

    Enzo Tartari

    2010-01-01

    Full Text Available A field application of the Cornell Net Carbohydrate and Protein System (CNCPS in Italy has been limited because thefeed bank is based on North American feedstuffs and still few laboratories are able to analyze feeds as requested by theCNCPS. Moreover, the standardization of analytical procedures is still not homogeneous among laboratories. This workwas carried out to establish a first database for feeds commonly used in Italy, providing nutritionists and producers anaccurate and current feed composition, also indicating methods and apparatus for analytical procedures potentially availablefor routine analysis. A total of 909 samples of hays, silages and raw materials (protein feeds, cereals and by-productswere analyzed through 1999 and 2002; analysis included protein solubility and degradability, protein fractions,structural carbohydrate fractions and the calculation of neutral detergent structural carbohydrates. When possible, averagedata were compared with those included in the feed bank of CNCPS ver. 3 and with those obtained by another Italianlaboratory. The main differences were observed in chemical composition of forages and silages, whose composition largelydepends on environmental conditions and physiological stage; protein feeds, cereals and by-products showed somedifferences in crude protein, soluble protein and protein fractions even in feeds of national origin.The intent to modify the feed bank values of CNCPS for establishing an Italian data base of feeds will require a collaborativestudy of many laboratories not only for forages, hays and silages samples - whose composition is greatly dependenton environmental factors and agronomic techniques - but also for protein fractions, whose values are largely influencedby even small changes in analytical techniques.

  1. PDBj Mine: design and implementation of relational database interface for Protein Data Bank Japan.

    Science.gov (United States)

    Kinjo, Akira R; Yamashita, Reiko; Nakamura, Haruki

    2010-08-25

    This article is a tutorial for PDBj Mine, a new database and its interface for Protein Data Bank Japan (PDBj). In PDBj Mine, data are loaded from files in the PDBMLplus format (an extension of PDBML, PDB's canonical XML format, enriched with annotations), which are then served for the user of PDBj via the worldwide web (WWW). We describe the basic design of the relational database (RDB) and web interfaces of PDBj Mine. The contents of PDBMLplus files are first broken into XPath entities, and these paths and data are indexed in the way that reflects the hierarchical structure of the XML files. The data for each XPath type are saved into the corresponding relational table that is named as the XPath itself. The generation of table definitions from the PDBMLplus XML schema is fully automated. For efficient search, frequently queried terms are compiled into a brief summary table. Casual users can perform simple keyword search, and 'Advanced Search' which can specify various conditions on the entries. More experienced users can query the database using SQL statements which can be constructed in a uniform manner. Thus, PDBj Mine achieves a combination of the flexibility of XML documents and the robustness of the RDB. Database URL: http://www.pdbj.org/

  2. Kin-Driver: a database of driver mutations in protein kinases.

    Science.gov (United States)

    Simonetti, Franco L; Tornador, Cristian; Nabau-Moretó, Nuria; Molina-Vila, Miguel A; Marino-Buslje, Cristina

    2014-01-01

    Somatic mutations in protein kinases (PKs) are frequent driver events in many human tumors, while germ-line mutations are associated with hereditary diseases. Here we present Kin-driver, the first database that compiles driver mutations in PKs with experimental evidence demonstrating their functional role. Kin-driver is a manual expert-curated database that pays special attention to activating mutations (AMs) and can serve as a validation set to develop new generation tools focused on the prediction of gain-of-function driver mutations. It also offers an easy and intuitive environment to facilitate the visualization and analysis of mutations in PKs. Because all mutations are mapped onto a multiple sequence alignment, analogue positions between kinases can be identified and tentative new mutations can be proposed for studying by transferring annotation. Finally, our database can also be of use to clinical and translational laboratories, helping them to identify uncommon AMs that can correlate with response to new antitumor drugs. The website was developed using PHP and JavaScript, which are supported by all major browsers; the database was built using MySQL server. Kin-driver is available at: http://kin-driver.leloir.org.ar/ © The Author(s) 2014. Published by Oxford University Press.

  3. High Performance Protein Sequence Database Scanning on the Cell Broadband Engine

    Directory of Open Access Journals (Sweden)

    Adrianto Wirawan

    2009-01-01

    Full Text Available The enormous growth of biological sequence databases has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing rapidly as well. The recent emergence of low cost parallel multicore accelerator technologies has made it possible to reduce execution times of many bioinformatics applications. In this paper, we demonstrate how the Cell Broadband Engine can be used as a computational platform to accelerate two approaches for protein sequence database scanning: exhaustive and heuristic. We present efficient parallelization techniques for two representative algorithms: the dynamic programming based Smith–Waterman algorithm and the popular BLASTP heuristic. Their implementation on a Playstation®3 leads to significant runtime savings compared to corresponding sequential implementations.

  4. Neutron cross-sections database for amino acids and proteins analysis

    Energy Technology Data Exchange (ETDEWEB)

    Voi, Dante L.; Ferreira, Francisco de O.; Nunes, Rogerio Chaffin, E-mail: dante@ien.gov.br, E-mail: fferreira@ien.gov.br, E-mail: Chaffin@ien.gov.br [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Rocha, Helio F. da, E-mail: hrocha@gbl.com.br [Universidade Federal do Rio de Janeiro (IPPMG/UFRJ), Rio de Janeiro, RJ (Brazil). Instituto de Pediatria

    2015-07-01

    Biological materials may be studied using neutrons as an unconventional tool of analysis. Dynamics and structures data can be obtained for amino acids, protein and others cellular components by neutron cross sections determinations especially for applications in nuclear purity and conformation analysis. The instrument used for this is the crystal spectrometer of the Instituto de Engenharia Nuclear (IEN-CNEN-RJ), the only one in Latin America that uses neutrons for this type of analyzes and it is installed in one of the reactor Argonauta irradiation channels. The experimentally values obtained are compared with calculated values using literature data with a rigorous analysis of the chemical composition, conformation and molecular structure analysis of the materials. A neutron cross-section database was constructed to assist in determining molecular dynamic, structure and formulae of biological materials. The database contains neutron cross-sections values of all amino acids, chemical elements, molecular groups, auxiliary radicals, as well as values of constants and parameters necessary for the analysis. An unprecedented analytical procedure was developed using the neutron cross section parceling and grouping method for data manipulation. This database is a result of measurements obtained from twenty amino acids that were provided by different manufactories and are used in oral administration in hospital individuals for nutritional applications. It was also constructed a small data file of compounds with different molecular groups including carbon, nitrogen, sulfur and oxygen, all linked to hydrogen atoms. A review of global and national scene in the acquisition of neutron cross sections data, the formation of libraries and the application of neutrons for analyzing biological materials is presented. This database has further application in protein analysis and the neutron cross-section from the insulin was estimated. (author)

  5. Neutron cross-sections database for amino acids and proteins analysis

    International Nuclear Information System (INIS)

    Voi, Dante L.; Ferreira, Francisco de O.; Nunes, Rogerio Chaffin; Rocha, Helio F. da

    2015-01-01

    Biological materials may be studied using neutrons as an unconventional tool of analysis. Dynamics and structures data can be obtained for amino acids, protein and others cellular components by neutron cross sections determinations especially for applications in nuclear purity and conformation analysis. The instrument used for this is the crystal spectrometer of the Instituto de Engenharia Nuclear (IEN-CNEN-RJ), the only one in Latin America that uses neutrons for this type of analyzes and it is installed in one of the reactor Argonauta irradiation channels. The experimentally values obtained are compared with calculated values using literature data with a rigorous analysis of the chemical composition, conformation and molecular structure analysis of the materials. A neutron cross-section database was constructed to assist in determining molecular dynamic, structure and formulae of biological materials. The database contains neutron cross-sections values of all amino acids, chemical elements, molecular groups, auxiliary radicals, as well as values of constants and parameters necessary for the analysis. An unprecedented analytical procedure was developed using the neutron cross section parceling and grouping method for data manipulation. This database is a result of measurements obtained from twenty amino acids that were provided by different manufactories and are used in oral administration in hospital individuals for nutritional applications. It was also constructed a small data file of compounds with different molecular groups including carbon, nitrogen, sulfur and oxygen, all linked to hydrogen atoms. A review of global and national scene in the acquisition of neutron cross sections data, the formation of libraries and the application of neutrons for analyzing biological materials is presented. This database has further application in protein analysis and the neutron cross-section from the insulin was estimated. (author)

  6. Exploring the Ligand-Protein Networks in Traditional Chinese Medicine: Current Databases, Methods, and Applications

    Directory of Open Access Journals (Sweden)

    Mingzhu Zhao

    2013-01-01

    Full Text Available The traditional Chinese medicine (TCM, which has thousands of years of clinical application among China and other Asian countries, is the pioneer of the “multicomponent-multitarget” and network pharmacology. Although there is no doubt of the efficacy, it is difficult to elucidate convincing underlying mechanism of TCM due to its complex composition and unclear pharmacology. The use of ligand-protein networks has been gaining significant value in the history of drug discovery while its application in TCM is still in its early stage. This paper firstly surveys TCM databases for virtual screening that have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined based on the amount of network information available, both on sides of ligand bioactivity and the protein structures. Furthermore, applications of successful in silico target identification attempts are discussed in detail along with experiments in exploring the ligand-protein networks of TCM. Finally, it will be concluded that the prospective application of ligand-protein networks can be used not only to predict protein targets of a small molecule, but also to explore the mode of action of TCM.

  7. The drug-minded protein interaction database (DrumPID) for efficient target analysis and drug development.

    Science.gov (United States)

    Kunz, Meik; Liang, Chunguang; Nilla, Santosh; Cecil, Alexander; Dandekar, Thomas

    2016-01-01

    The drug-minded protein interaction database (DrumPID) has been designed to provide fast, tailored information on drugs and their protein networks including indications, protein targets and side-targets. Starting queries include compound, target and protein interactions and organism-specific protein families. Furthermore, drug name, chemical structures and their SMILES notation, affected proteins (potential drug targets), organisms as well as diseases can be queried including various combinations and refinement of searches. Drugs and protein interactions are analyzed in detail with reference to protein structures and catalytic domains, related compound structures as well as potential targets in other organisms. DrumPID considers drug functionality, compound similarity, target structure, interactome analysis and organismic range for a compound, useful for drug development, predicting drug side-effects and structure-activity relationships.Database URL:http://drumpid.bioapps.biozentrum.uni-wuerzburg.de. © The Author(s) 2016. Published by Oxford University Press.

  8. The evolution of a Web resource: The Galactosemia Proteins Database 2.0.

    Science.gov (United States)

    d'Acierno, Antonio; Scafuri, Bernardina; Facchiano, Angelo; Marabotti, Anna

    2018-01-01

    Galactosemia Proteins Database 2.0 is a Web-accessible resource collecting information about the structural and functional effects of the known variations associated to the three different enzymes of the Leloir pathway encoded by the genes GALT, GALE, and GALK1 and involved in the different forms of the genetic disease globally called "galactosemia." It represents an evolution of two available online resources we previously developed, with new data deriving from new structures, new analysis tools, and new interfaces and filters in order to improve the quality and quantity of information available for different categories of users. We propose this new resource both as a landmark for the entire world community of galactosemia and as a model for the development of similar tools for other proteins object of variations and involved in human diseases. © 2017 Wiley Periodicals, Inc.

  9. CLIPZ: a database and analysis environment for experimentally determined binding sites of RNA-binding proteins.

    Science.gov (United States)

    Khorshid, Mohsen; Rodak, Christoph; Zavolan, Mihaela

    2011-01-01

    The stability, localization and translation rate of mRNAs are regulated by a multitude of RNA-binding proteins (RBPs) that find their targets directly or with the help of guide RNAs. Among the experimental methods for mapping RBP binding sites, cross-linking and immunoprecipitation (CLIP) coupled with deep sequencing provides transcriptome-wide coverage as well as high resolution. However, partly due to their vast volume, the data that were so far generated in CLIP experiments have not been put in a form that enables fast and interactive exploration of binding sites. To address this need, we have developed the CLIPZ database and analysis environment. Binding site data for RBPs such as Argonaute 1-4, Insulin-like growth factor II mRNA-binding protein 1-3, TNRC6 proteins A-C, Pumilio 2, Quaking and Polypyrimidine tract binding protein can be visualized at the level of the genome and of individual transcripts. Individual users can upload their own sequence data sets while being able to limit the access to these data to specific users, and analyses of the public and private data sets can be performed interactively. CLIPZ, available at http://www.clipz.unibas.ch, aims to provide an open access repository of information for post-transcriptional regulatory elements.

  10. Dynameomics: a multi-dimensional analysis-optimized database for dynamic protein data.

    Science.gov (United States)

    Kehl, Catherine; Simms, Andrew M; Toofanny, Rudesh D; Daggett, Valerie

    2008-06-01

    The Dynameomics project is our effort to characterize the native-state dynamics and folding/unfolding pathways of representatives of all known protein folds by way of molecular dynamics simulations, as described by Beck et al. (in Protein Eng. Des. Select., the first paper in this series). The data produced by these simulations are highly multidimensional in structure and multi-terabytes in size. Both of these features present significant challenges for storage, retrieval and analysis. For optimal data modeling and flexibility, we needed a platform that supported both multidimensional indices and hierarchical relationships between related types of data and that could be integrated within our data warehouse, as described in the accompanying paper directly preceding this one. For these reasons, we have chosen On-line Analytical Processing (OLAP), a multi-dimensional analysis optimized database, as an analytical platform for these data. OLAP is a mature technology in the financial sector, but it has not been used extensively for scientific analysis. Our project is further more unusual for its focus on the multidimensional and analytical capabilities of OLAP rather than its aggregation capacities. The dimensional data model and hierarchies are very flexible. The query language is concise for complex analysis and rapid data retrieval. OLAP shows great promise for the dynamic protein analysis for bioengineering and biomedical applications. In addition, OLAP may have similar potential for other scientific and engineering applications involving large and complex datasets.

  11. Reference: 699 [Arabidopsis Phenome Database[Archive

    Lifescience Database Archive (English)

    Full Text Available d act in a nonredundant fashion to promote their splicing. With these findings, CRM domain proteins are impl...CFM2 and two previously described CRM proteins are bound simultaneously to the ndhA and ycf3-int1 introns an

  12. CISH has no non-redundant functions in glucose homeostasis or beta cell proliferation during pregnancy in mice.

    Science.gov (United States)

    Jiao, Yang; Rieck, Sebastian; Le Lay, John; Kaestner, Klaus H

    2013-11-01

    Increased beta cell proliferation during pregnancy is mediated by the Janus kinase 2/signal transducer and activator of transcription 5 (JAK2/STAT5) signalling pathway in response to increased lactogen levels. Activation of the pathway leads to transcriptional upregulation of Cish (encoding cytokine-inducible SH2 domain-containing protein), a member of the suppressor of cytokine signalling (SOCS) family of genes, forming a negative-feedback loop. Here, we examined whether conditional gene ablation of Cish in the pancreas improves beta cell proliferation and beta cell function during pregnancy in mice. We derived mice with a novel, conditional loxP allele for Cish. Pancreas-specific ablation of Cish was achieved by crossing Cish (loxP/loxP) mice with Pdx1-Cre (Early) mice. Beta cell proliferation was quantified by BrdU labelling. Glucose homeostasis was examined with glucose tolerance tests and determination of plasma insulin levels. The expression of other Socs genes and target genes of p-STAT5 related to beta cell function and beta cell proliferation was determined by quantitative PCR. There was no difference in beta cell proliferation or glucose homeostasis between the Cish mutant group and the control group. The p-STAT5 protein level was the same in Cish mutant and control mice. Socs2 gene expression was higher in Cish mutant than control mice at pregnancy day 9.5. The expression of other Socs genes was the same between control and mutant mice. Our results show that CISH has no non-redundant functions in beta cell proliferation or glucose homeostasis during pregnancy in mice. Socs2 might compensate for the loss of Cish during pregnancy.

  13. Using random forests for assistance in the curation of G-protein coupled receptor databases.

    Science.gov (United States)

    Shkurin, Aleksei; Vellido, Alfredo

    2017-08-18

    Biology is experiencing a gradual but fast transformation from a laboratory-centred science towards a data-centred one. As such, it requires robust data engineering and the use of quantitative data analysis methods as part of database curation. This paper focuses on G protein-coupled receptors, a large and heterogeneous super-family of cell membrane proteins of interest to biology in general. One of its families, Class C, is of particular interest to pharmacology and drug design. This family is quite heterogeneous on its own, and the discrimination of its several sub-families is a challenging problem. In the absence of known crystal structure, such discrimination must rely on their primary amino acid sequences. We are interested not as much in achieving maximum sub-family discrimination accuracy using quantitative methods, but in exploring sequence misclassification behavior. Specifically, we are interested in isolating those sequences showing consistent misclassification, that is, sequences that are very often misclassified and almost always to the same wrong sub-family. Random forests are used for this analysis due to their ensemble nature, which makes them naturally suited to gauge the consistency of misclassification. This consistency is here defined through the voting scheme of their base tree classifiers. Detailed consistency results for the random forest ensemble classification were obtained for all receptors and for all data transformations of their unaligned primary sequences. Shortlists of the most consistently misclassified receptors for each subfamily and transformation, as well as an overall shortlist including those cases that were consistently misclassified across transformations, were obtained. The latter should be referred to experts for further investigation as a data curation task. The automatic discrimination of the Class C sub-families of G protein-coupled receptors from their unaligned primary sequences shows clear limits. This study has

  14. Cα and Cβ Carbon-13 Chemical Shifts in Proteins From an Empirical Database

    International Nuclear Information System (INIS)

    Iwadate, Mitsuo; Asakura, Tetsuo; Williamson, Michael P.

    1999-01-01

    We have constructed an extensive database of 13C Cα and Cβ chemical shifts in proteins of solution, for proteins of which a high-resolution crystal structure exists, and for which the crystal structure has been shown to be essentially identical to the solution structure. There is no systematic effect of temperature, reference compound, or pH on reported shifts, but there appear to be differences in reported shifts arising from referencing differences of up to 4.2 ppm. The major factor affecting chemical shifts is the backbone geometry, which causes differences of ca. 4 ppm between typical α- helix and β-sheet geometries for Cα, and of ca. 2 ppm for Cβ. The side-chain dihedral angle χ1 has an effect of up to 0.5 ppm on the Cα shift, particularly for amino acids with branched side-chains at Cβ. Hydrogen bonding to main-chain atoms has an effect of up to 0.9 ppm, which depends on the main- chain conformation. The sequence of the protein and ring-current shifts from aromatic rings have an insignificant effect (except for residues following proline). There are significant differences between different amino acid types in the backbone geometry dependence; the amino acids can be grouped together into five different groups with different φ,ψ shielding surfaces. The overall fit of individual residues to a single non-residue-specific surface, incorporating the effects of hydrogen bonding and χ1 angle, is 0.96 ppm for both Cα and Cβ. The results from this study are broadly similar to those from ab initio studies, but there are some differences which could merit further attention

  15. RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.

    Science.gov (United States)

    Field, Helen I; Fenyö, David; Beavis, Ronald C

    2002-01-01

    RADARS, a rapid, automated, data archiving and retrieval software system for high-throughput proteomic mass spectral data processing and storage, is described. The majority of mass spectrometer data files are compatible with RADARS, for consistent processing. The system automatically takes unprocessed data files, identifies proteins via in silico database searching, then stores the processed data and search results in a relational database suitable for customized reporting. The system is robust, used in 24/7 operation, accessible to multiple users of an intranet through a web browser, may be monitored by Virtual Private Network, and is secure. RADARS is scalable for use on one or many computers, and is suited to multiple processor systems. It can incorporate any local database in FASTA format, and can search protein and DNA databases online. A key feature is a suite of visualisation tools (many available gratis), allowing facile manipulation of spectra, by hand annotation, reanalysis, and access to all procedures. We also described the use of Sonar MS/MS, a novel, rapid search engine requiring 40 MB RAM per process for searches against a genomic or EST database translated in all six reading frames. RADARS reduces the cost of analysis by its efficient algorithms: Sonar MS/MS can identifiy proteins without accurate knowledge of the parent ion mass and without protein tags. Statistical scoring methods provide close-to-expert accuracy and brings robust data analysis to the non-expert user.

  16. Expanded microbial genome coverage and improved protein family annotation in the COG database.

    Science.gov (United States)

    Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

    2015-01-01

    Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. The new version of the

  17. ProBiS tools (algorithm, database, and web servers) for predicting and modeling of biologically interesting proteins.

    Science.gov (United States)

    Konc, Janez; Janežič, Dušanka

    2017-09-01

    ProBiS (Protein Binding Sites) Tools consist of algorithm, database, and web servers for prediction of binding sites and protein ligands based on the detection of structurally similar binding sites in the Protein Data Bank. In this article, we review the operations that ProBiS Tools perform, provide comments on the evolution of the tools, and give some implementation details. We review some of its applications to biologically interesting proteins. ProBiS Tools are freely available at http://probis.cmm.ki.si and http://probis.nih.gov. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. AT_CHLORO, a comprehensive chloroplast proteome database with subplastidial localization and curated information on envelope proteins.

    Science.gov (United States)

    Ferro, Myriam; Brugière, Sabine; Salvi, Daniel; Seigneurin-Berny, Daphné; Court, Magali; Moyet, Lucas; Ramus, Claire; Miras, Stéphane; Mellal, Mourad; Le Gall, Sophie; Kieffer-Jaquinod, Sylvie; Bruley, Christophe; Garin, Jérôme; Joyard, Jacques; Masselon, Christophe; Rolland, Norbert

    2010-06-01

    Recent advances in the proteomics field have allowed a series of high throughput experiments to be conducted on chloroplast samples, and the data are available in several public databases. However, the accurate localization of many chloroplast proteins often remains hypothetical. This is especially true for envelope proteins. We went a step further into the knowledge of the chloroplast proteome by focusing, in the same set of experiments, on the localization of proteins in the stroma, the thylakoids, and envelope membranes. LC-MS/MS-based analyses first allowed building the AT_CHLORO database (http://www.grenoble.prabi.fr/protehome/grenoble-plant-proteomics/), a comprehensive repertoire of the 1323 proteins, identified by 10,654 unique peptide sequences, present in highly purified chloroplasts and their subfractions prepared from Arabidopsis thaliana leaves. This database also provides extensive proteomics information (peptide sequences and molecular weight, chromatographic retention times, MS/MS spectra, and spectral count) for a unique chloroplast protein accurate mass and time tag database gathering identified peptides with their respective and precise analytical coordinates, molecular weight, and retention time. We assessed the partitioning of each protein in the three chloroplast compartments by using a semiquantitative proteomics approach (spectral count). These data together with an in-depth investigation of the literature were compiled to provide accurate subplastidial localization of previously known and newly identified proteins. A unique knowledge base containing extensive information on the proteins identified in envelope fractions was thus obtained, allowing new insights into this membrane system to be revealed. Altogether, the data we obtained provide unexpected information about plastidial or subplastidial localization of some proteins that were not suspected to be associated to this membrane system. The spectral counting-based strategy was further

  19. PROCARB: A Database of Known and Modelled Carbohydrate-Binding Protein Structures with Sequence-Based Prediction Tools

    Directory of Open Access Journals (Sweden)

    Adeel Malik

    2010-01-01

    Full Text Available Understanding of the three-dimensional structures of proteins that interact with carbohydrates covalently (glycoproteins as well as noncovalently (protein-carbohydrate complexes is essential to many biological processes and plays a significant role in normal and disease-associated functions. It is important to have a central repository of knowledge available about these protein-carbohydrate complexes as well as preprocessed data of predicted structures. This can be significantly enhanced by tools de novo which can predict carbohydrate-binding sites for proteins in the absence of structure of experimentally known binding site. PROCARB is an open-access database comprising three independently working components, namely, (i Core PROCARB module, consisting of three-dimensional structures of protein-carbohydrate complexes taken from Protein Data Bank (PDB, (ii Homology Models module, consisting of manually developed three-dimensional models of N-linked and O-linked glycoproteins of unknown three-dimensional structure, and (iii CBS-Pred prediction module, consisting of web servers to predict carbohydrate-binding sites using single sequence or server-generated PSSM. Several precomputed structural and functional properties of complexes are also included in the database for quick analysis. In particular, information about function, secondary structure, solvent accessibility, hydrogen bonds and literature reference, and so forth, is included. In addition, each protein in the database is mapped to Uniprot, Pfam, PDB, and so forth.

  20. AllergenOnline: A peer-reviewed, curated allergen database to assess novel food proteins for potential cross-reactivity.

    Science.gov (United States)

    Goodman, Richard E; Ebisawa, Motohiro; Ferreira, Fatima; Sampson, Hugh A; van Ree, Ronald; Vieths, Stefan; Baumert, Joseph L; Bohle, Barbara; Lalithambika, Sreedevi; Wise, John; Taylor, Steve L

    2016-05-01

    Increasingly regulators are demanding evaluation of potential allergenicity of foods prior to marketing. Primary risks are the transfer of allergens or potentially cross-reactive proteins into new foods. AllergenOnline was developed in 2005 as a peer-reviewed bioinformatics platform to evaluate risks of new dietary proteins in genetically modified organisms (GMO) and novel foods. The process used to identify suspected allergens and evaluate the evidence of allergenicity was refined between 2010 and 2015. Candidate proteins are identified from the NCBI database using keyword searches, the WHO/IUIS nomenclature database and peer reviewed publications. Criteria to classify proteins as allergens are described. Characteristics of the protein, the source and human subjects, test methods and results are evaluated by our expert panel and archived. Food, inhalant, salivary, venom, and contact allergens are included. Users access allergen sequences through links to the NCBI database and relevant references are listed online. Version 16 includes 1956 sequences from 778 taxonomic-protein groups that are accepted with evidence of allergic serum IgE-binding and/or biological activity. AllergenOnline provides a useful peer-reviewed tool for identifying the primary potential risks of allergy for GMOs and novel foods based on criteria described by the Codex Alimentarius Commission (2003). © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. ProDis-ContSHC: Learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval

    KAUST Repository

    Wang, Jim Jing-Yan; Gao, Xin; Wang, Quanquan; Li, Yongping

    2012-01-01

    Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity

  2. O-GLYCBASE version 2.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Rapacki, Kristoffer

    1997-01-01

    O-GLYCBASE is an updated database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the literature, and from the SWISS-PROT database. Entries include information about species, sequence, glycosylation sites and glycan type. O-GLYCBASE is...... patterns for the GalNAc, mannose and GlcNAc transferases are shown. The O-GLYCBASE database is available through WWW or by anonymous FTP....

  3. O-GLYCBASE: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Nielsen, Jens O.

    1996-01-01

    O-GLYCBASE is a comprehensive database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the SWISS-PROT and PIR databases as well as directly from recently published reports. Nineteen percent of the entries extracted from the databases n...... of mucin type O-glycosylation sites in mammalian glycoproteins exclusively from the primary sequence is made available by E-mail or WWW. The O-GLYCBASE database is also available electronically through our WWW server or by anonymous FTP....

  4. Construction and analysis of a plant non-specific lipid transfer protein database (nsLTPDB

    Directory of Open Access Journals (Sweden)

    Wang Nai-Jyuan

    2012-01-01

    Full Text Available Abstract Background Plant non-specific lipid transfer proteins (nsLTPs are small and basic proteins. Recently, nsLTPs have been reported involved in many physiological functions such as mediating phospholipid transfer, participating in plant defence activity against bacterial and fungal pathogens, and enhancing cell wall extension in tobacco. However, the lipid transfer mechanism of nsLTPs is still unclear, and comprehensive information of nsLTPs is difficult to obtain. Methods In this study, we identified 595 nsLTPs from 121 different species and constructed an nsLTPs database -- nsLTPDB -- which comprises the sequence information, structures, relevant literatures, and biological data of all plant nsLTPs http://nsltpdb.life.nthu.edu.tw/. Results Meanwhile, bioinformatics and statistics methods were implemented to develop a classification method for nsLTPs based on the patterns of the eight highly-conserved cysteine residues, and to suggest strict Prosite-styled patterns for Type I and Type II nsLTPs. The pattern of Type I is C X2 V X5-7 C [V, L, I] × Y [L, A, V] X8-13 CC × G X12 D × [Q, K, R] X2 CXC X16-21 P X2 C X13-15C, and that of Type II is C X4 L X2 C X9-11 P [S, T] X2 CC X5 Q X2-4 C[L, F]C X2 [A, L, I] × [D, N] P X10-12 [K, R] X4-5 C X3-4 P X0-2 C. Moreover, we referred the Prosite-styled patterns to the experimental mutagenesis data that previously established by our group, and found that the residues with higher conservation played an important role in the structural stability or lipid binding ability of nsLTPs. Conclusions Taken together, this research has suggested potential residues that might be essential to modulate the structural and functional properties of plant nsLTPs. Finally, we proposed some biologically important sites of the nsLTPs, which are described by using a new Prosite-styled pattern that we defined.

  5. Construction and analysis of a plant non-specific lipid transfer protein database (nsLTPDB).

    Science.gov (United States)

    Wang, Nai-Jyuan; Lee, Chi-Ching; Cheng, Chao-Sheng; Lo, Wei-Cheng; Yang, Ya-Fen; Chen, Ming-Nan; Lyu, Ping-Chiang

    2012-01-01

    Plant non-specific lipid transfer proteins (nsLTPs) are small and basic proteins. Recently, nsLTPs have been reported involved in many physiological functions such as mediating phospholipid transfer, participating in plant defence activity against bacterial and fungal pathogens, and enhancing cell wall extension in tobacco. However, the lipid transfer mechanism of nsLTPs is still unclear, and comprehensive information of nsLTPs is difficult to obtain. In this study, we identified 595 nsLTPs from 121 different species and constructed an nsLTPs database--nsLTPDB--which comprises the sequence information, structures, relevant literatures, and biological data of all plant nsLTPs http://nsltpdb.life.nthu.edu.tw/. Meanwhile, bioinformatics and statistics methods were implemented to develop a classification method for nsLTPs based on the patterns of the eight highly-conserved cysteine residues, and to suggest strict Prosite-styled patterns for Type I and Type II nsLTPs. The pattern of Type I is C X2 V X5-7 C [V, L, I] × Y [L, A, V] X8-13 CC × G X12 D × [Q, K, R] X2 CXC X16-21 P X2 C X13-15C, and that of Type II is C X4 L X2 C X9-11 P [S, T] X2 CC X5 Q X2-4 C[L, F]C X2 [A, L, I] × [D, N] P X10-12 [K, R] X4-5 C X3-4 P X0-2 C. Moreover, we referred the Prosite-styled patterns to the experimental mutagenesis data that previously established by our group, and found that the residues with higher conservation played an important role in the structural stability or lipid binding ability of nsLTPs. Taken together, this research has suggested potential residues that might be essential to modulate the structural and functional properties of plant nsLTPs. Finally, we proposed some biologically important sites of the nsLTPs, which are described by using a new Prosite-styled pattern that we defined.

  6. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.

    2013-07-12

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about experimentally and computationally detected human PPIs as well as their corresponding annotation data. However, these databases contain many false positive interactions, are partial and only a few of them incorporate data from various sources. To overcome these limitations, we have developed HINT-KB (http://biotools.ceid.upatras.gr/hint-kb/), a knowledge base that integrates data from various sources, provides a user-friendly interface for their retrieval, cal-culatesasetoffeaturesofinterest and computesaconfidence score for every candidate protein interaction. This confidence score is essential for filtering the false positive interactions which are present in existing databases, predicting new protein interactions and measuring the frequency of each true protein interaction. For this reason, a novel machine learning hybrid methodology, called (Evolutionary Kalman Mathematical Modelling—EvoKalMaModel), was used to achieve an accurate and interpretable scoring methodology. The experimental results indicated that the proposed scoring scheme outperforms existing computational methods for the prediction of PPIs.

  7. The human keratinocyte two-dimensional protein database (update 1994): towards an integrated approach to the study of cell proliferation, differentiation and skin diseases

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Olsen, E

    1994-01-01

    The master two-dimensional (2-D) gel database of human keratinocytes currently lists 3087 cellular proteins (2168 isoelectric focusing, IEF; and 919 none-quilibrium pH gradient electrophoresis, NEPHGE), many of which correspond to posttranslational modifications, 890 polypeptides have been...... in the database. We also report a database of proteins recovered from the medium of noncultured, unfractionated keratinocytes. This database lists 398 polypeptides (309 IEF; 89 NEPHGE) of which 76 have been identified. The aim of the comprehensive databases is to gather, through a systematic study...

  8. O-GLYCBASE version 3.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Nilsson, Jette

    1998-01-01

    O-GLYCBASE is a revised database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the literature, and from the sequence databases. Entries include informations about species, sequence, glycosylation sites and glycan type and is fully cr...

  9. Mono- and Digalactosyldiacylglycerol Lipids Function Nonredundantly to Regulate Systemic Acquired Resistance in Plants

    Directory of Open Access Journals (Sweden)

    Qing-ming Gao

    2014-12-01

    Full Text Available Summary: The plant galactolipids monogalactosyldiacylglycerol (MGDG and digalactosyldiacylglycerol (DGDG have been linked to the anti-inflammatory and cancer benefits of a green leafy vegetable diet in humans due to their ability to regulate the levels of free radicals like nitric oxide (NO. Here, we show that DGDG contributes to plant NO as well as salicylic acid biosynthesis and is required for the induction of systemic acquired resistance (SAR. In contrast, MGDG regulates the biosynthesis of the SAR signals azelaic acid (AzA and glycerol-3-phosphate (G3P that function downstream of NO. Interestingly, DGDG is also required for AzA-induced SAR, but MGDG is not. Notably, transgenic expression of a bacterial glucosyltransferase is unable to restore SAR in dgd1 plants even though it does rescue their morphological and fatty acid phenotypes. These results suggest that MGDG and DGDG are required at distinct steps and function exclusively in their individual roles during the induction of SAR. : The galactolipids monogalactosyldiacylglycerol (MGDG and digalactosyldiacylglycerol (DGDG constitute ∼80% of total membrane lipids in plants. Gao et al. now show that these galactolipids function nonredundantly to regulate systemic acquired resistance (SAR. Furthermore, they show that the terminal galactose on the α-galactose-β-galactose head group of DGDG is critical for SAR.

  10. Sibling rivalry: related bacterial small RNAs and their redundant and non-redundant roles.

    Science.gov (United States)

    Caswell, Clayton C; Oglesby-Sherrouse, Amanda G; Murphy, Erin R

    2014-01-01

    Small RNA molecules (sRNAs) are now recognized as key regulators controlling bacterial gene expression, as sRNAs provide a quick and efficient means of positively or negatively altering the expression of specific genes. To date, numerous sRNAs have been identified and characterized in a myriad of bacterial species, but more recently, a theme in bacterial sRNAs has emerged: the presence of more than one highly related sRNAs produced by a given bacterium, here termed sibling sRNAs. Sibling sRNAs are those that are highly similar at the nucleotide level, and while it might be expected that sibling sRNAs exert identical regulatory functions on the expression of target genes based on their high degree of relatedness, emerging evidence is demonstrating that this is not always the case. Indeed, there are several examples of bacterial sibling sRNAs with non-redundant regulatory functions, but there are also instances of apparent regulatory redundancy between sibling sRNAs. This review provides a comprehensive overview of the current knowledge of bacterial sibling sRNAs, and also discusses important questions about the significance and evolutionary implications of this emerging class of regulators.

  11. Sibling rivalry: Related bacterial small RNAs and their redundant and non-redundant roles

    Directory of Open Access Journals (Sweden)

    Clayton eCaswell

    2014-10-01

    Full Text Available Small RNA molecules (sRNAs are now recognized as key regulators controlling bacterial gene expression, as sRNAs provide a quick and efficient means of positively or negatively altering the expression of specific genes. To date, numerous sRNAs have been identified and characterized in a myriad of bacterial species, but more recently, a theme in bacterial sRNAs has emerged: the presence of more than one highly related sRNAs produced by a given bacterium, here termed sibling sRNAs. Sibling sRNAs are those that are highly similar at the nucleotide level, and while it might be expected that sibling sRNAs exert identical regulatory functions on the expression of target genes based on their high degree of relatedness, emerging evidence is demonstrating that this is not always the case. Indeed, there are several examples of bacterial sibling sRNAs with non-redundant regulatory functions, but there are also instances of apparent regulatory redundancy between sibling sRNAs. This review provides a comprehensive overview of the current knowledge of bacterial sibling sRNAs, and also discusses important questions about the significance and evolutionary implications of this emerging class of regulators.

  12. A database and tool, IM Browser, for exploring and integrating emerging gene and protein interaction data for Drosophila

    Directory of Open Access Journals (Sweden)

    Parrish Jodi R

    2006-04-01

    Full Text Available Abstract Background Biological processes are mediated by networks of interacting genes and proteins. Efforts to map and understand these networks are resulting in the proliferation of interaction data derived from both experimental and computational techniques for a number of organisms. The volume of this data combined with the variety of specific forms it can take has created a need for comprehensive databases that include all of the available data sets, and for exploration tools to facilitate data integration and analysis. One powerful paradigm for the navigation and analysis of interaction data is an interaction graph or map that represents proteins or genes as nodes linked by interactions. Several programs have been developed for graphical representation and analysis of interaction data, yet there remains a need for alternative programs that can provide casual users with rapid easy access to many existing and emerging data sets. Description Here we describe a comprehensive database of Drosophila gene and protein interactions collected from a variety of sources, including low and high throughput screens, genetic interactions, and computational predictions. We also present a program for exploring multiple interaction data sets and for combining data from different sources. The program, referred to as the Interaction Map (IM Browser, is a web-based application for searching and visualizing interaction data stored in a relational database system. Use of the application requires no downloads and minimal user configuration or training, thereby enabling rapid initial access to interaction data. IM Browser was designed to readily accommodate and integrate new types of interaction data as it becomes available. Moreover, all information associated with interaction measurements or predictions and the genes or proteins involved are accessible to the user. This allows combined searches and analyses based on either common or technique-specific attributes

  13. Sting_RDB: a relational database of structural parameters for protein analysis with support for data warehousing and data mining.

    Science.gov (United States)

    Oliveira, S R M; Almeida, G V; Souza, K R R; Rodrigues, D N; Kuser-Falcão, P R; Yamagishi, M E B; Santos, E H; Vieira, F D; Jardine, J G; Neshich, G

    2007-10-05

    An effective strategy for managing protein databases is to provide mechanisms to transform raw data into consistent, accurate and reliable information. Such mechanisms will greatly reduce operational inefficiencies and improve one's ability to better handle scientific objectives and interpret the research results. To achieve this challenging goal for the STING project, we introduce Sting_RDB, a relational database of structural parameters for protein analysis with support for data warehousing and data mining. In this article, we highlight the main features of Sting_RDB and show how a user can explore it for efficient and biologically relevant queries. Considering its importance for molecular biologists, effort has been made to advance Sting_RDB toward data quality assessment. To the best of our knowledge, Sting_RDB is one of the most comprehensive data repositories for protein analysis, now also capable of providing its users with a data quality indicator. This paper differs from our previous study in many aspects. First, we introduce Sting_RDB, a relational database with mechanisms for efficient and relevant queries using SQL. Sting_rdb evolved from the earlier, text (flat file)-based database, in which data consistency and integrity was not guaranteed. Second, we provide support for data warehousing and mining. Third, the data quality indicator was introduced. Finally and probably most importantly, complex queries that could not be posed on a text-based database, are now easily implemented. Further details are accessible at the Sting_RDB demo web page: http://www.cbi.cnptia.embrapa.br/StingRDB.

  14. ProOpDB: Prokaryotic Operon DataBase.

    Science.gov (United States)

    Taboada, Blanca; Ciria, Ricardo; Martinez-Guerrero, Cristian E; Merino, Enrique

    2012-01-01

    The Prokaryotic Operon DataBase (ProOpDB, http://operons.ibt.unam.mx/OperonPredictor) constitutes one of the most precise and complete repositories of operon predictions now available. Using our novel and highly accurate operon identification algorithm, we have predicted the operon structures of more than 1200 prokaryotic genomes. ProOpDB offers diverse alternatives by which a set of operon predictions can be retrieved including: (i) organism name, (ii) metabolic pathways, as defined by the KEGG database, (iii) gene orthology, as defined by the COG database, (iv) conserved protein domains, as defined by the Pfam database, (v) reference gene and (vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient method to select the most representative organisms based on a precompiled phylogenetic distances matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes.

  15. Verification of Single-Peptide Protein Identifications by the Application of Complementary Database Search Algorithms

    National Research Council Canada - National Science Library

    Rohrbough, James G; Breci, Linda; Merchant, Nirav; Miller, Susan; Haynes, Paul A

    2005-01-01

    .... One such technique, known as the Multi-Dimensional Protein Identification Technique, or MudPIT, involves the use of computer search algorithms that automate the process of identifying proteins...

  16. Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Proteins in similarity relationship with the cluster Data detail Data name Pro...teins in similarity relationship with the cluster DOI 10.18908/lsdba.nbdc00464-003 Description of data conte...s Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive ...

  17. DMPD: Post-transcriptional regulation of proinflammatory proteins. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 15075353 Post-transcriptional regulation of proinflammatory proteins. Anderson P, P...l) (.csml) Show Post-transcriptional regulation of proinflammatory proteins. PubmedID 15075353 Title Post-tr...anscriptional regulation of proinflammatory proteins. Authors Anderson P, Phillip

  18. DMPD: LPS-binding proteins and receptors. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 9665271 LPS-binding proteins and receptors. Fenton MJ, Golenbock DT. J Leukoc Biol.... 1998 Jul;64(1):25-32. (.png) (.svg) (.html) (.csml) Show LPS-binding proteins and receptors. PubmedID 9665271 Title LPS-binding prot...eins and receptors. Authors Fenton MJ, Golenbock DT. Publication J Leukoc Biol. 199

  19. AN IMAGE-PLANE ALGORITHM FOR JWST'S NON-REDUNDANT APERTURE MASK DATA

    Energy Technology Data Exchange (ETDEWEB)

    Greenbaum, Alexandra Z. [Johns Hopkins University Department of Physics and Astronomy 3400 North Charles, Baltimore, MD 21218 (United States); Pueyo, Laurent; Sivaramakrishnan, Anand [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Lacour, Sylvestre [LESIA, CNRS/UMR-8109, Observatoire de Paris, UPMC, Université Paris Diderot 5 place Jules Janssen, 92195 Meudon (France)

    2015-01-10

    The high angular resolution technique of non-redundant masking (NRM) or aperture masking interferometry (AMI) has yielded images of faint protoplanetary companions of nearby stars from the ground. AMI on James Webb Space Telescope (JWST)'s Near Infrared Imager and Slitless Spectrograph (NIRISS) has a lower thermal background than ground-based facilities and does not suffer from atmospheric instability. NIRISS AMI images are likely to have 90%-95% Strehl ratio between 2.77 and 4.8 μm. In this paper we quantify factors that limit the raw point source contrast of JWST NRM. We develop an analytic model of the NRM point spread function which includes different optical path delays (pistons) between mask holes and fit the model parameters with image plane data. It enables a straightforward way to exclude bad pixels, is suited to limited fields of view, and can incorporate effects such as intra-pixel sensitivity variations. We simulate various sources of noise to estimate their effect on the standard deviation of closure phase, σ{sub CP} (a proxy for binary point source contrast). If σ{sub CP} < 10{sup –4} radians—a contrast ratio of 10 mag—young accreting gas giant planets (e.g., in the nearby Taurus star-forming region) could be imaged with JWST NIRISS. We show the feasibility of using NIRISS' NRM with the sub-Nyquist sampled F277W, which would enable some exoplanet chemistry characterization. In the presence of small piston errors, the dominant sources of closure phase error (depending on pixel sampling, and filter bandwidth) are flat field errors and unmodeled variations in intra-pixel sensitivity. The in-flight stability of NIRISS will determine how well these errors can be calibrated by observing a point source. Our results help develop efficient observing strategies for space-based NRM.

  20. A MEMORY EFFICIENT HARDWARE BASED PATTERN MATCHING AND PROTEIN ALIGNMENT SCHEMES FOR HIGHLY COMPLEX DATABASES

    OpenAIRE

    Bennet, M.Anto; Sankaranarayanan, S.; Deepika, M.; Nanthini, N.; Bhuvaneshwari, S.; Priyanka, M.

    2017-01-01

    Protein sequence alignment to find correlation between different species, or genetic mutations etc. is the most computational intensive task when performing protein comparison. To speed-up the alignment, Systolic Arrays (SAs) have been used. In order to avoid the internal-loop problem which reduces the performance, pipeline interleaving strategy has been presented. This strategy is applied to an SA for Smith Waterman (SW) algorithm which is an alignment algorithm to locally align two proteins...

  1. ContaMiner and ContaBase: a webserver and database for early identification of unwantedly crystallized protein contaminants

    KAUST Repository

    Hungler, Arnaud; Momin, Afaque Ahmad Imtiyaz; Diederichs, Kay; Arold, Stefan T.

    2016-01-01

    Solving the phase problem in protein X-ray crystallography relies heavily on the identity of the crystallized protein, especially when molecular replacement (MR) methods are used. Yet, it is not uncommon that a contaminant crystallizes instead of the protein of interest. Such contaminants may be proteins from the expression host organism, protein fusion tags or proteins added during the purification steps. Many contaminants co-purify easily, crystallize and give good diffraction data. Identification of contaminant crystals may take time, since the presence of the contaminant is unexpected and its identity unknown. A webserver (ContaMiner) and a contaminant database (ContaBase) have been established, to allow fast MR-based screening of crystallographic data against currently 62 known contaminants. The web-based ContaMiner (available at http://strube.cbrc.kaust.edu.sa/contaminer/) currently produces results in 5 min to 4 h. The program is also available in a github repository and can be installed locally. ContaMiner enables screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for 'crystallization and preliminary X-ray analysis' publications. Thus, in addition to potentially saving X-ray crystallographers much time and effort, ContaMiner might considerably lower the risk of publishing erroneous data. A web server, titled ContaMiner, has been established, which allows fast molecular-replacement-based screening of crystallographic data against a database (ContaBase) of currently 62 potential contaminants. ContaMiner enables systematic screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for 'crystallization and preliminary X-ray analysis' publications. © Arnaud Hungler et al. 2016.

  2. ContaMiner and ContaBase: a webserver and database for early identification of unwantedly crystallized protein contaminants

    KAUST Repository

    Hungler, Arnaud

    2016-11-02

    Solving the phase problem in protein X-ray crystallography relies heavily on the identity of the crystallized protein, especially when molecular replacement (MR) methods are used. Yet, it is not uncommon that a contaminant crystallizes instead of the protein of interest. Such contaminants may be proteins from the expression host organism, protein fusion tags or proteins added during the purification steps. Many contaminants co-purify easily, crystallize and give good diffraction data. Identification of contaminant crystals may take time, since the presence of the contaminant is unexpected and its identity unknown. A webserver (ContaMiner) and a contaminant database (ContaBase) have been established, to allow fast MR-based screening of crystallographic data against currently 62 known contaminants. The web-based ContaMiner (available at http://strube.cbrc.kaust.edu.sa/contaminer/) currently produces results in 5 min to 4 h. The program is also available in a github repository and can be installed locally. ContaMiner enables screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for \\'crystallization and preliminary X-ray analysis\\' publications. Thus, in addition to potentially saving X-ray crystallographers much time and effort, ContaMiner might considerably lower the risk of publishing erroneous data. A web server, titled ContaMiner, has been established, which allows fast molecular-replacement-based screening of crystallographic data against a database (ContaBase) of currently 62 potential contaminants. ContaMiner enables systematic screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for \\'crystallization and preliminary X-ray analysis\\' publications. © Arnaud Hungler et al. 2016.

  3. ContaMiner and ContaBase: a webserver and database for early identification of unwantedly crystallized protein contaminants

    Science.gov (United States)

    Hungler, Arnaud; Momin, Afaque; Diederichs, Kay; Arold, Stefan, T.

    2016-01-01

    Solving the phase problem in protein X-ray crystallography relies heavily on the identity of the crystallized protein, especially when molecular replacement (MR) methods are used. Yet, it is not uncommon that a contaminant crystallizes instead of the protein of interest. Such contaminants may be proteins from the expression host organism, protein fusion tags or proteins added during the purification steps. Many contaminants co-purify easily, crystallize and give good diffraction data. Identification of contaminant crystals may take time, since the presence of the contaminant is unexpected and its identity unknown. A webserver (ContaMiner) and a contaminant database (ContaBase) have been established, to allow fast MR-based screening of crystallographic data against currently 62 known contaminants. The web-based ContaMiner (available at http://strube.cbrc.kaust.edu.sa/contaminer/) currently produces results in 5 min to 4 h. The program is also available in a github repository and can be installed locally. ContaMiner enables screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for ‘crystallization and preliminary X-ray analysis’ publications. Thus, in addition to potentially saving X-ray crystallographers much time and effort, ContaMiner might considerably lower the risk of publishing erroneous data. PMID:27980519

  4. Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins

    DEFF Research Database (Denmark)

    Diella, F.; Cameron, S.; Gemund, C.

    2004-01-01

    Background: Post-translational phosphorylation is one of the most common protein modifications. Phosphoserine, threonine and tyrosine residues play critical roles in the regulation of many cellular processes. The fast growing number of research reports on protein phosphorylation points to a gener...

  5. Heart research advances using database search engines, Human Protein Atlas and the Sydney Heart Bank.

    Science.gov (United States)

    Li, Amy; Estigoy, Colleen; Raftery, Mark; Cameron, Darryl; Odeberg, Jacob; Pontén, Fredrik; Lal, Sean; Dos Remedios, Cristobal G

    2013-10-01

    This Methodological Review is intended as a guide for research students who may have just discovered a human "novel" cardiac protein, but it may also help hard-pressed reviewers of journal submissions on a "novel" protein reported in an animal model of human heart failure. Whether you are an expert or not, you may know little or nothing about this particular protein of interest. In this review we provide a strategic guide on how to proceed. We ask: How do you discover what has been published (even in an abstract or research report) about this protein? Everyone knows how to undertake literature searches using PubMed and Medline but these are usually encyclopaedic, often producing long lists of papers, most of which are either irrelevant or only vaguely relevant to your query. Relatively few will be aware of more advanced search engines such as Google Scholar and even fewer will know about Quertle. Next, we provide a strategy for discovering if your "novel" protein is expressed in the normal, healthy human heart, and if it is, we show you how to investigate its subcellular location. This can usually be achieved by visiting the website "Human Protein Atlas" without doing a single experiment. Finally, we provide a pathway to discovering if your protein of interest changes its expression level with heart failure/disease or with ageing. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.

  6. TcoF-DB: dragon database for human transcription co-factors and transcription factor interacting proteins

    KAUST Repository

    Schaefer, Ulf

    2010-10-21

    The initiation and regulation of transcription in eukaryotes is complex and involves a large number of transcription factors (TFs), which are known to bind to the regulatory regions of eukaryotic DNA. Apart from TF-DNA binding, protein-protein interaction involving TFs is an essential component of the machinery facilitating transcriptional regulation. Proteins that interact with TFs in the context of transcription regulation but do not bind to the DNA themselves, we consider transcription co-factors (TcoFs). The influence of TcoFs on transcriptional regulation and initiation, although indirect, has been shown to be significant with the functionality of TFs strongly influenced by the presence of TcoFs. While the role of TFs and their interaction with regulatory DNA regions has been well-studied, the association between TFs and TcoFs has so far been given less attention. Here, we present a resource that is comprised of a collection of human TFs and the TcoFs with which they interact. Other proteins that have a proven interaction with a TF, but are not considered TcoFs are also included. Our database contains 157 high-confidence TcoFs and additionally 379 hypothetical TcoFs. These have been identified and classified according to the type of available evidence for their involvement in transcriptional regulation and their presence in the cell nucleus. We have divided TcoFs into four groups, one of which contains high-confidence TcoFs and three others contain TcoFs which are hypothetical to different extents. We have developed the Dragon Database for Human Transcription Co-Factors and Transcription Factor Interacting Proteins (TcoF-DB). A web-based interface for this resource can be freely accessed at http://cbrc.kaust.edu.sa/tcof/ and http://apps.sanbi.ac.za/tcof/. © The Author(s) 2010.

  7. TcoF-DB: dragon database for human transcription co-factors and transcription factor interacting proteins

    KAUST Repository

    Schaefer, Ulf; Schmeier, Sebastian; Bajic, Vladimir B.

    2010-01-01

    The initiation and regulation of transcription in eukaryotes is complex and involves a large number of transcription factors (TFs), which are known to bind to the regulatory regions of eukaryotic DNA. Apart from TF-DNA binding, protein-protein interaction involving TFs is an essential component of the machinery facilitating transcriptional regulation. Proteins that interact with TFs in the context of transcription regulation but do not bind to the DNA themselves, we consider transcription co-factors (TcoFs). The influence of TcoFs on transcriptional regulation and initiation, although indirect, has been shown to be significant with the functionality of TFs strongly influenced by the presence of TcoFs. While the role of TFs and their interaction with regulatory DNA regions has been well-studied, the association between TFs and TcoFs has so far been given less attention. Here, we present a resource that is comprised of a collection of human TFs and the TcoFs with which they interact. Other proteins that have a proven interaction with a TF, but are not considered TcoFs are also included. Our database contains 157 high-confidence TcoFs and additionally 379 hypothetical TcoFs. These have been identified and classified according to the type of available evidence for their involvement in transcriptional regulation and their presence in the cell nucleus. We have divided TcoFs into four groups, one of which contains high-confidence TcoFs and three others contain TcoFs which are hypothetical to different extents. We have developed the Dragon Database for Human Transcription Co-Factors and Transcription Factor Interacting Proteins (TcoF-DB). A web-based interface for this resource can be freely accessed at http://cbrc.kaust.edu.sa/tcof/ and http://apps.sanbi.ac.za/tcof/. © The Author(s) 2010.

  8. Amino acid sequences of predicted proteins and their annotation for 95 organism species. - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Amino acid sequences of predicted proteins and their annotation for 95 organis...m species. Data detail Data name Amino acid sequences of predicted proteins and their annotation for 95 orga...nism species. DOI 10.18908/lsdba.nbdc00464-001 Description of data contents Amino acid sequences of predicted proteins...Database Description Download License Update History of This Database Site Policy | Contact Us Amino acid sequences of predicted prot...eins and their annotation for 95 organism species. - Gclust Server | LSDB Archive ...

  9. CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database.

    Science.gov (United States)

    Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C

    2010-12-01

    The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.

  10. The effect of using an inappropriate protein database for proteomic data analysis.

    Directory of Open Access Journals (Sweden)

    Giselle M Knudsen

    Full Text Available A recent study by Bromenshenk et al., published in PLoS One (2010, used proteomic analysis to identify peptides purportedly of Iridovirus and Nosema origin; however the validity of this finding is controversial. We show here through re-analysis of a subset of this data that many of the spectra identified by Bromenshenk et al. as deriving from Iridovirus and Nosema proteins are actually products from Apis mellifera honey bee proteins. We find no reliable evidence that proteins from Iridovirus and Nosema are present in the samples that were re-analyzed. This article is also intended as a learning exercise for illustrating some of the potential pitfalls of analysis of mass spectrometry proteomic data and to encourage authors to observe MS/MS data reporting guidelines that would facilitate recognition of analysis problems during the review process.

  11. Identification of ace inhibitory cryptides in Tilapia protein hydrolysate by UPLC-MS/MS coupled to database analysis.

    Science.gov (United States)

    Yesmine, Ben Henda; Antoine, Bonnet; da Silva Ortência Leocádia, Nunes Gonzalez; Rogério, Boscolo Wilson; Ingrid, Arnaudin; Nicolas, Bridiau; Thierry, Maugard; Jean-Marie, Piot; Frédéric, Sannier; Stéphanie, Bordenave-Juchereau

    2017-05-01

    An ultra-performance liquid chromatography-quadrupole-time of flight mass spectrometry method was developed and applied to identify short angiotensin-I-converting enzyme (ACE) inhibitory cryptides in Tilapia (Oreochromis Niloticus) protein hydrolyzate. A database was created with previously identified ACE-inhibitory di- and tripeptides and the lowest molecular weight fraction of Tilapia hydrolysate was analysed for coincidences. Only VW and VY were identified. Further analysis of collected fractions conducted to the identification of 51 different peptides in major fractions. 19 peptides selected were synthesised and tested for their ACE inhibitory potential. TL, TI, IK, LR, LD, IQ, DI, AILE, ALLE, ALIE and AIIE were identified as new ACE inhibitors. The findings from this study point UPLC-MS/MS combined with the creation of a database as an efficient technique to identify specific short peptides within a complex hydrolysate, in addition with de novo sequencing. This efficient characterisation of bioactive factors like cryptides in protein hydrolysates will extend their use as functional foods. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Automated builder and database of protein/membrane complexes for molecular dynamics simulations.

    Directory of Open Access Journals (Sweden)

    Sunhwan Jo

    2007-09-01

    Full Text Available Molecular dynamics simulations of membrane proteins have provided deeper insights into their functions and interactions with surrounding environments at the atomic level. However, compared to solvation of globular proteins, building a realistic protein/membrane complex is still challenging and requires considerable experience with simulation software. Membrane Builder in the CHARMM-GUI website (http://www.charmm-gui.org helps users to build such a complex system using a web browser with a graphical user interface. Through a generalized and automated building process including system size determination as well as generation of lipid bilayer, pore water, bulk water, and ions, a realistic membrane system with virtually any kinds and shapes of membrane proteins can be generated in 5 minutes to 2 hours depending on the system size. Default values that were elaborated and tested extensively are given in each step to provide reasonable options and starting points for both non-expert and expert users. The efficacy of Membrane Builder is illustrated by its applications to 12 transmembrane and 3 interfacial membrane proteins, whose fully equilibrated systems with three different types of lipid molecules (DMPC, DPPC, and POPC and two types of system shapes (rectangular and hexagonal are freely available on the CHARMM-GUI website. One of the most significant advantages of using the web environment is that, if a problem is found, users can go back and re-generate the whole system again before quitting the browser. Therefore, Membrane Builder provides the intuitive and easy way to build and simulate the biologically important membrane system.

  13. Exploring the composition of protein-ligand binding sites on a large scale.

    Directory of Open Access Journals (Sweden)

    Nickolay A Khazanov

    Full Text Available The residue composition of a ligand binding site determines the interactions available for diffusion-mediated ligand binding, and understanding general composition of these sites is of great importance if we are to gain insight into the functional diversity of the proteome. Many structure-based drug design methods utilize such heuristic information for improving prediction or characterization of ligand-binding sites in proteins of unknown function. The Binding MOAD database if one of the largest curated sets of protein-ligand complexes, and provides a source of diverse, high-quality data for establishing general trends of residue composition from currently available protein structures. We present an analysis of 3,295 non-redundant proteins with 9,114 non-redundant binding sites to identify residues over-represented in binding regions versus the rest of the protein surface. The Binding MOAD database delineates biologically-relevant "valid" ligands from "invalid" small-molecule ligands bound to the protein. Invalids are present in the crystallization medium and serve no known biological function. Contacts are found to differ between these classes of ligands, indicating that residue composition of biologically relevant binding sites is distinct not only from the rest of the protein surface, but also from surface regions capable of opportunistic binding of non-functional small molecules. To confirm these trends, we perform a rigorous analysis of the variation of residue propensity with respect to the size of the dataset and the content bias inherent in structure sets obtained from a large protein structure database. The optimal size of the dataset for establishing general trends of residue propensities, as well as strategies for assessing the significance of such trends, are suggested for future studies of binding-site composition.

  14. A non-redundant role for Drosophila Mkk4 and hemipterous/Mkk7 in TAK1-mediated activation of JNK.

    Directory of Open Access Journals (Sweden)

    Peter Geuking

    Full Text Available BACKGROUND: The JNK pathway is a mitogen-activated protein (MAP kinase pathway involved in the regulation of numerous physiological processes during development and in response to environmental stress. JNK activity is controlled by two MAPK kinases (MAPKK, Mkk4 and Mkk7. Mkk7 plays a prominent role upon Tumor Necrosis Factor (TNF stimulation. Eiger, the unique TNF-superfamily ligand in Drosophila, potently activates JNK signaling through the activation of the MAPKKK Tak1. METHODOLOGY/PRINCIPAL FINDINGS: In a dominant suppressor screen for new components of the Eiger/JNK-pathway in Drosophila, we have identified an allelic series of the Mkk4 gene. Our genetic and biochemical results demonstrate that Mkk4 is dispensable for normal development and host resistance to systemic bacterial infection but plays a non-redundant role as a MAPKK acting in parallel to Hemipterous/Mkk7 in dTAK1-mediated JNK activation upon Eiger and Imd pathway activation. CONCLUSIONS/SIGNIFICANCE: In contrast to mammals, it seems that in Drosophila both MAPKKs, Hep/Mkk7 and Mkk4, are required to induce JNK upon TNF or pro-inflammatory stimulation.

  15. ProFITS of maize: a database of protein families involved in the transduction of signalling in the maize genome

    Directory of Open Access Journals (Sweden)

    Zhang Zhenhai

    2010-10-01

    Full Text Available Abstract Background Maize (Zea mays ssp. mays L. is an important model for plant basic and applied research. In 2009, the B73 maize genome sequencing made a great step forward, using clone by clone strategy; however, functional annotation and gene classification of the maize genome are still limited. Thus, a well-annotated datasets and informative database will be important for further research discoveries. Signal transduction is a fundamental biological process in living cells, and many protein families participate in this process in sensing, amplifying and responding to various extracellular or internal stimuli. Therefore, it is a good starting point to integrate information on the maize functional genes involved in signal transduction. Results Here we introduce a comprehensive database 'ProFITS' (Protein Families Involved in the Transduction of Signalling, which endeavours to identify and classify protein kinases/phosphatases, transcription factors and ubiquitin-proteasome-system related genes in the B73 maize genome. Users can explore gene models, corresponding transcripts and FLcDNAs using the three abovementioned protein hierarchical categories, and visualize them using an AJAX-based genome browser (JBrowse or Generic Genome Browser (GBrowse. Functional annotations such as GO annotation, protein signatures, protein best-hits in the Arabidopsis and rice genome are provided. In addition, pre-calculated transcription factor binding sites of each gene are generated and mutant information is incorporated into ProFITS. In short, ProFITS provides a user-friendly web interface for studies in signal transduction process in maize. Conclusion ProFITS, which utilizes both the B73 maize genome and full length cDNA (FLcDNA datasets, provides users a comprehensive platform of maize annotation with specific focus on the categorization of families involved in the signal transduction process. ProFITS is designed as a user-friendly web interface and it is

  16. Database Description - PSCDB | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available abase Description General information of database Database name PSCDB Alternative n...rial Science and Technology (AIST) Takayuki Amemiya E-mail: Database classification Structure Databases - Protein structure Database...554-D558. External Links: Original website information Database maintenance site Graduate School of Informat...available URL of Web services - Need for user registration Not available About This Database Database Descri...ption Download License Update History of This Database Site Policy | Contact Us Database Description - PSCDB | LSDB Archive ...

  17. JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles

    Science.gov (United States)

    Portales-Casamar, Elodie; Thongjuea, Supat; Kwon, Andrew T.; Arenillas, David; Zhao, Xiaobei; Valen, Eivind; Yusuf, Dimas; Lenhard, Boris; Wasserman, Wyeth W.; Sandelin, Albin

    2010-01-01

    JASPAR (http://jaspar.genereg.net) is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors (TFs) and other proteins interacting with DNA in a sequence-specific manner. Its fourth major release is the largest expansion of the core database to date: the database now holds 457 non-redundant, curated profiles. The new entries include the first batch of profiles derived from ChIP-seq and ChIP-chip whole-genome binding experiments, and 177 yeast TF binding profiles. The introduction of a yeast division brings the convenience of JASPAR to an active research community. As binding models are refined by newer data, the JASPAR database now uses versioning of matrices: in this release, 12% of the older models were updated to improved versions. Classification of TF families has been improved by adopting a new DNA-binding domain nomenclature. A curated catalog of mammalian TFs is provided, extending the use of the JASPAR profiles to additional TFs belonging to the same structural family. The changes in the database set the system ready for more rapid acquisition of new high-throughput data sources. Additionally, three new special collections provide matrix profile data produced by recent alternative high-throughput approaches. PMID:19906716

  18. Cluster based on sequence comparison of homologous proteins of 95 organism species - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Cluster based on sequence comparison of homologous proteins of 95 organism spe...cies Data detail Data name Cluster based on sequence comparison of homologous proteins of 95 organism specie...istory of This Database Site Policy | Contact Us Cluster based on sequence compariso

  19. Oligomeric protein structure networks: insights into protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Brinda KV

    2005-12-01

    Full Text Available Abstract Background Protein-protein association is essential for a variety of cellular processes and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes and the edges are based on the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues with special emphasis to protein interfaces. Results A variety of interactions such as hydrogen bond, salt bridges, aromatic and hydrophobic interactions, which occur at the interfaces are identified in a consolidated manner as amino acid clusters at the interface, from this study. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated based on the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hotspots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb. A few predictions of interface hot

  20. Conformationally selective multidimensional chemical shift ranges in proteins from a PACSY database purged using intrinsic quality criteria

    International Nuclear Information System (INIS)

    Fritzsching, Keith J.; Hong, Mei; Schmidt-Rohr, Klaus

    2016-01-01

    We have determined refined multidimensional chemical shift ranges for intra-residue correlations ( 13 C– 13 C, 15 N– 13 C, etc.) in proteins, which can be used to gain type-assignment and/or secondary-structure information from experimental NMR spectra. The chemical-shift ranges are the result of a statistical analysis of the PACSY database of >3000 proteins with 3D structures (1,200,207 13 C chemical shifts and >3 million chemical shifts in total); these data were originally derived from the Biological Magnetic Resonance Data Bank. Using relatively simple non-parametric statistics to find peak maxima in the distributions of helix, sheet, coil and turn chemical shifts, and without the use of limited “hand-picked” data sets, we show that ∼94 % of the 13 C NMR data and almost all 15 N data are quite accurately referenced and assigned, with smaller standard deviations (0.2 and 0.8 ppm, respectively) than recognized previously. On the other hand, approximately 6 % of the 13 C chemical shift data in the PACSY database are shown to be clearly misreferenced, mostly by ca. −2.4 ppm. The removal of the misreferenced data and other outliers by this purging by intrinsic quality criteria (PIQC) allows for reliable identification of secondary maxima in the two-dimensional chemical-shift distributions already pre-separated by secondary structure. We demonstrate that some of these correspond to specific regions in the Ramachandran plot, including left-handed helix dihedral angles, reflect unusual hydrogen bonding, or are due to the influence of a following proline residue. With appropriate smoothing, significantly more tightly defined chemical shift ranges are obtained for each amino acid type in the different secondary structures. These chemical shift ranges, which may be defined at any statistical threshold, can be used for amino-acid type assignment and secondary-structure analysis of chemical shifts from intra-residue cross peaks by inspection or by using a

  1. Conformationally selective multidimensional chemical shift ranges in proteins from a PACSY database purged using intrinsic quality criteria

    Energy Technology Data Exchange (ETDEWEB)

    Fritzsching, Keith J., E-mail: kfritzsc@brandeis.edu [Brandeis University, Department of Chemistry (United States); Hong, Mei [Massachusetts Institute of Technology, Department of Chemistry (United States); Schmidt-Rohr, Klaus, E-mail: srohr@brandeis.edu [Brandeis University, Department of Chemistry (United States)

    2016-02-15

    We have determined refined multidimensional chemical shift ranges for intra-residue correlations ({sup 13}C–{sup 13}C, {sup 15}N–{sup 13}C, etc.) in proteins, which can be used to gain type-assignment and/or secondary-structure information from experimental NMR spectra. The chemical-shift ranges are the result of a statistical analysis of the PACSY database of >3000 proteins with 3D structures (1,200,207 {sup 13}C chemical shifts and >3 million chemical shifts in total); these data were originally derived from the Biological Magnetic Resonance Data Bank. Using relatively simple non-parametric statistics to find peak maxima in the distributions of helix, sheet, coil and turn chemical shifts, and without the use of limited “hand-picked” data sets, we show that ∼94 % of the {sup 13}C NMR data and almost all {sup 15}N data are quite accurately referenced and assigned, with smaller standard deviations (0.2 and 0.8 ppm, respectively) than recognized previously. On the other hand, approximately 6 % of the {sup 13}C chemical shift data in the PACSY database are shown to be clearly misreferenced, mostly by ca. −2.4 ppm. The removal of the misreferenced data and other outliers by this purging by intrinsic quality criteria (PIQC) allows for reliable identification of secondary maxima in the two-dimensional chemical-shift distributions already pre-separated by secondary structure. We demonstrate that some of these correspond to specific regions in the Ramachandran plot, including left-handed helix dihedral angles, reflect unusual hydrogen bonding, or are due to the influence of a following proline residue. With appropriate smoothing, significantly more tightly defined chemical shift ranges are obtained for each amino acid type in the different secondary structures. These chemical shift ranges, which may be defined at any statistical threshold, can be used for amino-acid type assignment and secondary-structure analysis of chemical shifts from intra

  2. Detailed tail proteomic analysis of axolotl (Ambystoma mexicanum) using an mRNA-seq reference database.

    Science.gov (United States)

    Demircan, Turan; Keskin, Ilknur; Dumlu, Seda Nilgün; Aytürk, Nilüfer; Avşaroğlu, Mahmut Erhan; Akgün, Emel; Öztürk, Gürkan; Baykal, Ahmet Tarık

    2017-01-01

    Salamander axolotl has been emerging as an important model for stem cell research due to its powerful regenerative capacity. Several advantages, such as the high capability of advanced tissue, organ, and appendages regeneration, promote axolotl as an ideal model system to extend our current understanding on the mechanisms of regeneration. Acknowledging the common molecular pathways between amphibians and mammals, there is a great potential to translate the messages from axolotl research to mammalian studies. However, the utilization of axolotl is hindered due to the lack of reference databases of genomic, transcriptomic, and proteomic data. Here, we introduce the proteome analysis of the axolotl tail section searched against an mRNA-seq database. We translated axolotl mRNA sequences to protein sequences and annotated these to process the LC-MS/MS data and identified 1001 nonredundant proteins. Functional classification of identified proteins was performed by gene ontology searches. The presence of some of the identified proteins was validated by in situ antibody labeling. Furthermore, we have analyzed the proteome expressional changes postamputation at three time points to evaluate the underlying mechanisms of the regeneration process. Taken together, this work expands the proteomics data of axolotl to contribute to its establishment as a fully utilized model. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Purification, identification and preliminary crystallographic studies of Pru du amandin, an allergenic protein from Prunus dulcis

    Energy Technology Data Exchange (ETDEWEB)

    Gaur, Vineet; Sethi, Dhruv K.; Salunke, Dinakar M., E-mail: dinakar@nii.res.in [National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi 110 067 (India)

    2008-01-01

    The purification, identification, crystallization and preliminary crystallographic studies of an allergy-related protein, Pru du amandin, from P. dulcis nuts are reported. Food allergies appear to be one of the foremost causes of hypersensitivity reactions. Nut allergies account for most food allergies and are often permanent. The 360 kDa hexameric protein Pru du amandin, a known allergen, was purified from almonds (Prunus dulcis) by ammonium sulfate fractionation and ion-exchange chromatography. The protein was identified by a BLAST homology search against the nonredundant sequence database. Pru du amandin belongs to the 11S legumin family of seed storage proteins characterized by the presence of a cupin motif. Crystals were obtained by the hanging-drop vapour-diffusion method. The crystals belong to space group P4{sub 1} (or P4{sub 3}), with unit-cell parameters a = b = 150.7, c = 164.9 Å.

  4. Purification, identification and preliminary crystallographic studies of Pru du amandin, an allergenic protein from Prunus dulcis

    International Nuclear Information System (INIS)

    Gaur, Vineet; Sethi, Dhruv K.; Salunke, Dinakar M.

    2007-01-01

    The purification, identification, crystallization and preliminary crystallographic studies of an allergy-related protein, Pru du amandin, from P. dulcis nuts are reported. Food allergies appear to be one of the foremost causes of hypersensitivity reactions. Nut allergies account for most food allergies and are often permanent. The 360 kDa hexameric protein Pru du amandin, a known allergen, was purified from almonds (Prunus dulcis) by ammonium sulfate fractionation and ion-exchange chromatography. The protein was identified by a BLAST homology search against the nonredundant sequence database. Pru du amandin belongs to the 11S legumin family of seed storage proteins characterized by the presence of a cupin motif. Crystals were obtained by the hanging-drop vapour-diffusion method. The crystals belong to space group P4 1 (or P4 3 ), with unit-cell parameters a = b = 150.7, c = 164.9 Å

  5. BioMagResBank databases DOCR and FRED containing converted and filtered sets of experimental NMR restraints and coordinates from over 500 protein PDB structures

    Energy Technology Data Exchange (ETDEWEB)

    Doreleijers, Jurgen F. [University of Wisconsin-Madison, BioMagResBank, Department of Biochemistry (United States); Nederveen, Aart J. [Utrecht University, Bijvoet Center for Biomolecular Research (Netherlands); Vranken, Wim [European Bioinformatics Institute, Macromolecular Structure Database group (United Kingdom); Lin Jundong [University of Wisconsin-Madison, BioMagResBank, Department of Biochemistry (United States); Bonvin, Alexandre M.J.J.; Kaptein, Robert [Utrecht University, Bijvoet Center for Biomolecular Research (Netherlands); Markley, John L.; Ulrich, Eldon L. [University of Wisconsin-Madison, BioMagResBank, Department of Biochemistry (United States)], E-mail: elu@bmrb.wisc.edu

    2005-05-15

    We present two new databases of NMR-derived distance and dihedral angle restraints: the Database Of Converted Restraints (DOCR) and the Filtered Restraints Database (FRED). These databases currently correspond to 545 proteins with NMR structures deposited in the Protein Databank (PDB). The criteria for inclusion were that these should be unique, monomeric proteins with author-provided experimental NMR data and coordinates available from the PDB capable of being parsed and prepared in a consistent manner. The Wattos program was used to parse the files, and the CcpNmr FormatConverter program was used to prepare them semi-automatically. New modules, including a new implementation of Aqua in the BioMagResBank (BMRB) software Wattos were used to analyze the sets of distance restraints (DRs) for inconsistencies, redundancies, NOE completeness, classification and violations with respect to the original coordinates. Restraints that could not be associated with a known nomenclature were flagged. The coordinates of hydrogen atoms were recalculated from the positions of heavy atoms to allow for a full restraint analysis. The DOCR database contains restraint and coordinate data that is made consistent with each other and with IUPAC conventions. The FRED database is based on the DOCR data but is filtered for use by test calculation protocols and longitudinal analyses and validations. These two databases are available from websites of the BMRB and the Macromolecular Structure Database (MSD) in various formats: NMR-STAR, CCPN XML, and in formats suitable for direct use in the software packages CNS and CYANA.

  6. BioMagResBank databases DOCR and FRED containing converted and filtered sets of experimental NMR restraints and coordinates from over 500 protein PDB structures

    International Nuclear Information System (INIS)

    Doreleijers, Jurgen F.; Nederveen, Aart J.; Vranken, Wim; Lin Jundong; Bonvin, Alexandre M.J.J.; Kaptein, Robert; Markley, John L.; Ulrich, Eldon L.

    2005-01-01

    We present two new databases of NMR-derived distance and dihedral angle restraints: the Database Of Converted Restraints (DOCR) and the Filtered Restraints Database (FRED). These databases currently correspond to 545 proteins with NMR structures deposited in the Protein Databank (PDB). The criteria for inclusion were that these should be unique, monomeric proteins with author-provided experimental NMR data and coordinates available from the PDB capable of being parsed and prepared in a consistent manner. The Wattos program was used to parse the files, and the CcpNmr FormatConverter program was used to prepare them semi-automatically. New modules, including a new implementation of Aqua in the BioMagResBank (BMRB) software Wattos were used to analyze the sets of distance restraints (DRs) for inconsistencies, redundancies, NOE completeness, classification and violations with respect to the original coordinates. Restraints that could not be associated with a known nomenclature were flagged. The coordinates of hydrogen atoms were recalculated from the positions of heavy atoms to allow for a full restraint analysis. The DOCR database contains restraint and coordinate data that is made consistent with each other and with IUPAC conventions. The FRED database is based on the DOCR data but is filtered for use by test calculation protocols and longitudinal analyses and validations. These two databases are available from websites of the BMRB and the Macromolecular Structure Database (MSD) in various formats: NMR-STAR, CCPN XML, and in formats suitable for direct use in the software packages CNS and CYANA

  7. GPCR-SSFE: A comprehensive database of G-protein-coupled receptor template predictions and homology models

    Directory of Open Access Journals (Sweden)

    Kreuchwig Annika

    2011-05-01

    Full Text Available Abstract Background G protein-coupled receptors (GPCRs transduce a wide variety of extracellular signals to within the cell and therefore have a key role in regulating cell activity and physiological function. GPCR malfunction is responsible for a wide range of diseases including cancer, diabetes and hyperthyroidism and a large proportion of drugs on the market target these receptors. The three dimensional structure of GPCRs is important for elucidating the molecular mechanisms underlying these diseases and for performing structure-based drug design. Although structural data are restricted to only a handful of GPCRs, homology models can be used as a proxy for those receptors not having crystal structures. However, many researchers working on GPCRs are not experienced homology modellers and are therefore unable to benefit from the information that can be gleaned from such three-dimensional models. Here, we present a comprehensive database called the GPCR-SSFE, which provides initial homology models of the transmembrane helices for a large variety of family A GPCRs. Description Extending on our previous theoretical work, we have developed an automated pipeline for GPCR homology modelling and applied it to a large set of family A GPCR sequences. Our pipeline is a fragment-based approach that exploits available family A crystal structures. The GPCR-SSFE database stores the template predictions, sequence alignments, identified sequence and structure motifs and homology models for 5025 family A GPCRs. Users are able to browse the GPCR dataset according to their pharmacological classification or search for results using a UniProt entry name. It is also possible for a user to submit a GPCR sequence that is not contained in the database for analysis and homology model building. The models can be viewed using a Jmol applet and are also available for download along with the alignments. Conclusions The data provided by GPCR-SSFE are useful for investigating

  8. GPCR-SSFE: a comprehensive database of G-protein-coupled receptor template predictions and homology models.

    Science.gov (United States)

    Worth, Catherine L; Kreuchwig, Annika; Kleinau, Gunnar; Krause, Gerd

    2011-05-23

    G protein-coupled receptors (GPCRs) transduce a wide variety of extracellular signals to within the cell and therefore have a key role in regulating cell activity and physiological function. GPCR malfunction is responsible for a wide range of diseases including cancer, diabetes and hyperthyroidism and a large proportion of drugs on the market target these receptors. The three dimensional structure of GPCRs is important for elucidating the molecular mechanisms underlying these diseases and for performing structure-based drug design. Although structural data are restricted to only a handful of GPCRs, homology models can be used as a proxy for those receptors not having crystal structures. However, many researchers working on GPCRs are not experienced homology modellers and are therefore unable to benefit from the information that can be gleaned from such three-dimensional models. Here, we present a comprehensive database called the GPCR-SSFE, which provides initial homology models of the transmembrane helices for a large variety of family A GPCRs. Extending on our previous theoretical work, we have developed an automated pipeline for GPCR homology modelling and applied it to a large set of family A GPCR sequences. Our pipeline is a fragment-based approach that exploits available family A crystal structures. The GPCR-SSFE database stores the template predictions, sequence alignments, identified sequence and structure motifs and homology models for 5025 family A GPCRs. Users are able to browse the GPCR dataset according to their pharmacological classification or search for results using a UniProt entry name. It is also possible for a user to submit a GPCR sequence that is not contained in the database for analysis and homology model building. The models can be viewed using a Jmol applet and are also available for download along with the alignments. The data provided by GPCR-SSFE are useful for investigating general and detailed sequence-structure-function relationships

  9. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

    Science.gov (United States)

    Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

    2002-12-19

    Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.

  10. Purification, identification and preliminary crystallographic studies of Pru du amandin, an allergenic protein from Prunus dulcis.

    Science.gov (United States)

    Gaur, Vineet; Sethi, Dhruv K; Salunke, Dinakar M

    2008-01-01

    Food allergies appear to be one of the foremost causes of hypersensitivity reactions. Nut allergies account for most food allergies and are often permanent. The 360 kDa hexameric protein Pru du amandin, a known allergen, was purified from almonds (Prunus dulcis) by ammonium sulfate fractionation and ion-exchange chromatography. The protein was identified by a BLAST homology search against the nonredundant sequence database. Pru du amandin belongs to the 11S legumin family of seed storage proteins characterized by the presence of a cupin motif. Crystals were obtained by the hanging-drop vapour-diffusion method. The crystals belong to space group P4(1) (or P4(3)), with unit-cell parameters a = b = 150.7, c = 164.9 A.

  11. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System.

    Science.gov (United States)

    Liu, Yu; Hong, Yang; Lin, Chun-Yuan; Hung, Che-Lun

    2015-01-01

    The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS) to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.

  12. A curated gluten protein sequence database to support development of proteomics methods for determination of gluten in gluten-free foods.

    Science.gov (United States)

    Bromilow, Sophie; Gethings, Lee A; Buckley, Mike; Bromley, Mike; Shewry, Peter R; Langridge, James I; Clare Mills, E N

    2017-06-23

    The unique physiochemical properties of wheat gluten enable a diverse range of food products to be manufactured. However, gluten triggers coeliac disease, a condition which is treated using a gluten-free diet. Analytical methods are required to confirm if foods are gluten-free, but current immunoassay-based methods can unreliable and proteomic methods offer an alternative but require comprehensive and well annotated sequence databases which are lacking for gluten. A manually a curated database (GluPro V1.0) of gluten proteins, comprising 630 discrete unique full length protein sequences has been compiled. It is representative of the different types of gliadin and glutenin components found in gluten. An in silico comparison of their coeliac toxicity was undertaken by analysing the distribution of coeliac toxic motifs. This demonstrated that whilst the α-gliadin proteins contained more toxic motifs, these were distributed across all gluten protein sub-types. Comparison of annotations observed using a discovery proteomics dataset acquired using ion mobility MS/MS showed that more reliable identifications were obtained using the GluPro V1.0 database compared to the complete reviewed Viridiplantae database. This highlights the value of a curated sequence database specifically designed to support the proteomic workflows and the development of methods to detect and quantify gluten. We have constructed the first manually curated open-source wheat gluten protein sequence database (GluPro V1.0) in a FASTA format to support the application of proteomic methods for gluten protein detection and quantification. We have also analysed the manually verified sequences to give the first comprehensive overview of the distribution of sequences able to elicit a reaction in coeliac disease, the prevalent form of gluten intolerance. Provision of this database will improve the reliability of gluten protein identification by proteomic analysis, and aid the development of targeted mass

  13. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  14. Database Description - TMFunction | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available sidue (or mutant) in a protein. The experimental data are collected from the literature both by searching th...the sequence database, UniProt, structural database, PDB, and literature database

  15. Generation of a predicted protein database from EST data and application to iTRAQ analyses in grape (Vitis vinifera cv. Cabernet Sauvignon) berries at ripening initiation

    Science.gov (United States)

    Lücker, Joost; Laszczak, Mario; Smith, Derek; Lund, Steven T

    2009-01-01

    Background iTRAQ is a proteomics technique that uses isobaric tags for relative and absolute quantitation of tryptic peptides. In proteomics experiments, the detection and high confidence annotation of proteins and the significance of corresponding expression differences can depend on the quality and the species specificity of the tryptic peptide map database used for analysis of the data. For species for which finished genome sequence data are not available, identification of proteins relies on similarity to proteins from other species using comprehensive peptide map databases such as the MSDB. Results We were interested in characterizing ripening initiation ('veraison') in grape berries at the protein level in order to better define the molecular control of this important process for grape growers and wine makers. We developed a bioinformatic pipeline for processing EST data in order to produce a predicted tryptic peptide database specifically targeted to the wine grape cultivar, Vitis vinifera cv. Cabernet Sauvignon, and lacking truncated N- and C-terminal fragments. By searching iTRAQ MS/MS data generated from berry exocarp and mesocarp samples at ripening initiation, we determined that implementation of the custom database afforded a large improvement in high confidence peptide annotation in comparison to the MSDB. We used iTRAQ MS/MS in conjunction with custom peptide db searches to quantitatively characterize several important pathway components for berry ripening previously described at the transcriptional level and confirmed expression patterns for these at the protein level. Conclusion We determined that a predicted peptide database for MS/MS applications can be derived from EST data using advanced clustering and trimming approaches and successfully implemented for quantitative proteome profiling. Quantitative shotgun proteome profiling holds great promise for characterizing biological processes such as fruit ripening initiation and may be further

  16. Generation of a predicted protein database from EST data and application to iTRAQ analyses in grape (Vitis vinifera cv. Cabernet Sauvignon berries at ripening initiation

    Directory of Open Access Journals (Sweden)

    Smith Derek

    2009-01-01

    Full Text Available Abstract Background iTRAQ is a proteomics technique that uses isobaric tags for relative and absolute quantitation of tryptic peptides. In proteomics experiments, the detection and high confidence annotation of proteins and the significance of corresponding expression differences can depend on the quality and the species specificity of the tryptic peptide map database used for analysis of the data. For species for which finished genome sequence data are not available, identification of proteins relies on similarity to proteins from other species using comprehensive peptide map databases such as the MSDB. Results We were interested in characterizing ripening initiation ('veraison' in grape berries at the protein level in order to better define the molecular control of this important process for grape growers and wine makers. We developed a bioinformatic pipeline for processing EST data in order to produce a predicted tryptic peptide database specifically targeted to the wine grape cultivar, Vitis vinifera cv. Cabernet Sauvignon, and lacking truncated N- and C-terminal fragments. By searching iTRAQ MS/MS data generated from berry exocarp and mesocarp samples at ripening initiation, we determined that implementation of the custom database afforded a large improvement in high confidence peptide annotation in comparison to the MSDB. We used iTRAQ MS/MS in conjunction with custom peptide db searches to quantitatively characterize several important pathway components for berry ripening previously described at the transcriptional level and confirmed expression patterns for these at the protein level. Conclusion We determined that a predicted peptide database for MS/MS applications can be derived from EST data using advanced clustering and trimming approaches and successfully implemented for quantitative proteome profiling. Quantitative shotgun proteome profiling holds great promise for characterizing biological processes such as fruit ripening

  17. Database Description - ConfC | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available abase Description General information of database Database name ConfC Alternative name Database...amotsu Noguchi Tel: 042-495-8736 E-mail: Database classification Structure Database...s - Protein structure Structure Databases - Small molecules Structure Databases - Nucleic acid structure Database... services - Need for user registration - About This Database Database Description Download License Update History of This Database... Site Policy | Contact Us Database Description - ConfC | LSDB Archive ...

  18. Database Description - SAHG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name SAHG Alternative nam...h: Contact address Chie Motono Tel : +81-3-3599-8067 E-mail : Database classification Structure Databases - ...e databases - Protein properties Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description... Links: Original website information Database maintenance site The Molecular Profiling Research Center for D...stration Not available About This Database Database Description Download License Update History of This Database Site Policy | Contact Us Database Description - SAHG | LSDB Archive ...

  19. Database Description - RPSD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name RPSD Alternative nam...e Rice Protein Structure Database DOI 10.18908/lsdba.nbdc00749-000 Creator Creator Name: Toshimasa Yamazaki ... Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences Toshimasa Yamazaki E-mail : Databas...e classification Structure Databases - Protein structure Organism Taxonomy Name: Or...or name(s): Journal: External Links: Original website information Database maintenance site National Institu

  20. The Use of the Time Average Visibility for Analyzing HERA-19 Commissioning Data: Effects of Non-Redundancy

    Science.gov (United States)

    Benefo, Roshan; Gallardo, Samavarti; Aguirre, James; La Plante, Paul; HERA Collaboration

    2018-01-01

    The Hydrogen Epoch of Reionization Array (HERA) is a radio telescope situated in South Africa designed to observe the universe from redshifts 13 through 6, in order to detect the emission of the 21 cm line from the hydrogen spin-flip transition. We perform 21 cm cosmology due to its relation with reionization; by detecting this emission line, we can identify the timing of reionization, and understand more about the nature of the universe during the birth of the first stars and galaxies. With that, we can understand the heating conditions of the initial universe, providing us a larger picture of the conditions that created the large-scale structure of the universe we observe today. The HERA array currently consists of 19 antennas, spaced in a hexagonal grid pattern. We consider a robust observable, the time-averaged visibility (TAV), which is in principle sensitive to variations in the beam pattern between antenna elements and is easier to measure than the beam pattern itself. We use this TAV to explore the non-redundancy of baselines in the HERA array due either to cross-coupling between antennas (probed by antenna location in the array) or non-uniformity in their manufacture. The TAV may provide a simple way of verifying improvements in antenna element redundancy.

  1. ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

    Science.gov (United States)

    Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

    2017-01-04

    The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  2. Sequence protein identification by randomized sequence database and transcriptome mass spectrometry (SPIDER-TMS): from manual to automatic application of a 'de novo sequencing' approach.

    Science.gov (United States)

    Pascale, Raffaella; Grossi, Gerarda; Cruciani, Gabriele; Mecca, Giansalvatore; Santoro, Donatello; Sarli Calace, Renzo; Falabella, Patrizia; Bianco, Giuliana

    Sequence protein identification by a randomized sequence database and transcriptome mass spectrometry software package has been developed at the University of Basilicata in Potenza (Italy) and designed to facilitate the determination of the amino acid sequence of a peptide as well as an unequivocal identification of proteins in a high-throughput manner with enormous advantages of time, economical resource and expertise. The software package is a valid tool for the automation of a de novo sequencing approach, overcoming the main limits and a versatile platform useful in the proteomic field for an unequivocal identification of proteins, starting from tandem mass spectrometry data. The strength of this software is that it is a user-friendly and non-statistical approach, so protein identification can be considered unambiguous.

  3. DMPD: Suppressor of cytokine signaling (SOCS) 2, a protein with multiple functions. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 17070092 Suppressor of cytokine signaling (SOCS) 2, a protein with multiple function...Epub 2006 Oct 27. (.png) (.svg) (.html) (.csml) Show Suppressor of cytokine signaling (SOCS) 2, a protein with multiple function...SOCS) 2, a protein with multiple functions. Authors Rico-Bautista E, Flores-Morales A, Fernandez-Perez L. Pu

  4. DMPD: G-protein-coupled receptor expression, function, and signaling in macrophages. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 17456803 G-protein-coupled receptor expression, function, and signaling in macropha...2007 Apr 24. (.png) (.svg) (.html) (.csml) Show G-protein-coupled receptor expression, function, and signali...ng in macrophages. PubmedID 17456803 Title G-protein-coupled receptor expression, function

  5. DMPD: Protein kinase C epsilon: a new target to control inflammation andimmune-mediated disorders. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 14643884 Protein kinase C epsilon: a new target to control inflammation andimmune-m...g) (.html) (.csml) Show Protein kinase C epsilon: a new target to control inflammation andimmune-mediated di...sorders. PubmedID 14643884 Title Protein kinase C epsilon: a new target to contro

  6. DMPD: The role of Toll-like receptors and Nod proteins in bacterial infection. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 15476921 The role of Toll-like receptors and Nod proteins in bacterial infection. P...of Toll-like receptors and Nod proteins in bacterial infection. PubmedID 15476921 Title The role of Toll-like receptors and Nod prote...ins in bacterial infection. Authors Philpott DJ, Girardi

  7. DMPD: Regulation of innate immunity by suppressor of cytokine signaling (SOCS)proteins. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 18406369 Regulation of innate immunity by suppressor of cytokine signaling (SOCS)proteins...svg) (.html) (.csml) Show Regulation of innate immunity by suppressor of cytokine signaling (SOCS)proteins. ...PubmedID 18406369 Title Regulation of innate immunity by suppressor of cytokine signaling (SOCS)proteins

  8. DMPD: Macrophage-stimulating protein and RON receptor tyrosine kinase: potentialregulators of macrophage inflammatory activities. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 12472665 Macrophage-stimulating protein and RON receptor tyrosine kinase: potential...:545-53. (.png) (.svg) (.html) (.csml) Show Macrophage-stimulating protein and RON receptor tyrosine kinase:...le Macrophage-stimulating protein and RON receptor tyrosine kinase: potentialregulators of macrophage inflam

  9. Exploiting protein flexibility to predict the location of allosteric sites

    Directory of Open Access Journals (Sweden)

    Panjkovich Alejandro

    2012-10-01

    Full Text Available Abstract Background Allostery is one of the most powerful and common ways of regulation of protein activity. However, for most allosteric proteins identified to date the mechanistic details of allosteric modulation are not yet well understood. Uncovering common mechanistic patterns underlying allostery would allow not only a better academic understanding of the phenomena, but it would also streamline the design of novel therapeutic solutions. This relatively unexplored therapeutic potential and the putative advantages of allosteric drugs over classical active-site inhibitors fuel the attention allosteric-drug research is receiving at present. A first step to harness the regulatory potential and versatility of allosteric sites, in the context of drug-discovery and design, would be to detect or predict their presence and location. In this article, we describe a simple computational approach, based on the effect allosteric ligands exert on protein flexibility upon binding, to predict the existence and position of allosteric sites on a given protein structure. Results By querying the literature and a recently available database of allosteric sites, we gathered 213 allosteric proteins with structural information that we further filtered into a non-redundant set of 91 proteins. We performed normal-mode analysis and observed significant changes in protein flexibility upon allosteric-ligand binding in 70% of the cases. These results agree with the current view that allosteric mechanisms are in many cases governed by changes in protein dynamics caused by ligand binding. Furthermore, we implemented an approach that achieves 65% positive predictive value in identifying allosteric sites within the set of predicted cavities of a protein (stricter parameters set, 0.22 sensitivity, by combining the current analysis on dynamics with previous results on structural conservation of allosteric sites. We also analyzed four biological examples in detail, revealing

  10. Accelerating Smith-Waterman Alignment for Protein Database Search Using Frequency Distance Filtration Scheme Based on CPU-GPU Collaborative System

    Directory of Open Access Journals (Sweden)

    Yu Liu

    2015-01-01

    Full Text Available The Smith-Waterman (SW algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.

  11. SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthrospira) platensis C1.

    Science.gov (United States)

    Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee

    2015-07-29

    Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web

  12. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    Science.gov (United States)

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p neurona: , , , , , , , , , , , , , –, –, –, –, –. Eimeria tenella: –, –, –, –, –, –, –, –, – , –, –, –, –, –, –, –, –, –, –, –. Neospora caninum: –, –, , – , –, –.] PMID:12618375

  13. Database Description - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Database Description General information of database Database name Trypanosomes Database...stitute of Genetics Research Organization of Information and Systems Yata 1111, Mishima, Shizuoka 411-8540, JAPAN E mail: Database...y Name: Trypanosoma Taxonomy ID: 5690 Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database description The... Article title: Author name(s): Journal: External Links: Original website information Database maintenance s...DB (Protein Data Bank) KEGG PATHWAY Database DrugPort Entry list Available Query search Available Web servic

  14. LoopX: A Graphical User Interface-Based Database for Comprehensive Analysis and Comparative Evaluation of Loops from Protein Structures.

    Science.gov (United States)

    Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna

    2017-10-01

    Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitate a thorough analysis and effectual search options for extracting and comparing loops for sequence and structural compatibility, we developed, LoopX a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database equipped with a graphical user interface is empowered with diverse query tools and search algorithms, with various rendering options to visualize the sequence- and structural-level information along with hydrogen bonding patterns, backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.

  15. CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

    Directory of Open Access Journals (Sweden)

    Schmidt Bertil

    2010-04-01

    Full Text Available Abstract Background Due to its high sensitivity, the Smith-Waterman algorithm is widely used for biological database searches. Unfortunately, the quadratic time complexity of this algorithm makes it highly time-consuming. The exponential growth of biological databases further deteriorates the situation. To accelerate this algorithm, many efforts have been made to develop techniques in high performance architectures, especially the recently emerging many-core architectures and their associated programming models. Findings This paper describes the latest release of the CUDASW++ software, CUDASW++ 2.0, which makes new contributions to Smith-Waterman protein database searches using compute unified device architecture (CUDA. A parallel Smith-Waterman algorithm is proposed to further optimize the performance of CUDASW++ 1.0 based on the single instruction, multiple thread (SIMT abstraction. For the first time, we have investigated a partitioned vectorized Smith-Waterman algorithm using CUDA based on the virtualized single instruction, multiple data (SIMD abstraction. The optimized SIMT and the partitioned vectorized algorithms were benchmarked, and remarkably, have similar performance characteristics. CUDASW++ 2.0 achieves performance improvement over CUDASW++ 1.0 as much as 1.74 (1.72 times using the optimized SIMT algorithm and up to 1.77 (1.66 times using the partitioned vectorized algorithm, with a performance of up to 17 (30 billion cells update per second (GCUPS on a single-GPU GeForce GTX 280 (dual-GPU GeForce GTX 295 graphics card. Conclusions CUDASW++ 2.0 is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant performance improvement over CUDASW++ 1.0 using either the optimized SIMT algorithm or the partitioned vectorized algorithm for Smith-Waterman protein database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.

  16. Mining the Human Complexome Database Identifies RBM14 as an XPO1-Associated Protein Involved in HIV-1 Rev Function

    OpenAIRE

    Budhiraja, Sona; Liu, Hongbing; Couturier, Jacob; Malovannaya, Anna; Qin, Jun; Lewis, Dorothy E.; Rice, Andrew P.

    2015-01-01

    By recruiting the host protein XPO1 (CRM1), the HIV-1 Rev protein mediates the nuclear export of incompletely spliced viral transcripts. We mined data from the recently described human nuclear complexome to identify a host protein, RBM14, which associates with XPO1 and Rev and is involved in Rev function. Using a Rev-dependent p24 reporter plasmid, we found that RBM14 depletion decreased Rev activity and Rev-mediated enhancement of the cytoplasmic levels of unspliced viral transcripts. RBM14 ...

  17. DMPD: Pellino proteins: novel players in TLR and IL-1R signalling. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 17635639 Pellino proteins: novel players in TLR and IL-1R signalling. Schauvliege R..., Janssens S, Beyaert R. J Cell Mol Med. 2007 May-Jun;11(3):453-61. (.png) (.svg) (.html) (.csml) Show Pellino proteins: novel player...s in TLR and IL-1R signalling. PubmedID 17635639 Title Pellino proteins: novel player...tml) CSML File (.csml) Open .csml file with CIOPlayer Open .csml file with CIOPlayer - ※CIO Playerのご利用上の注意 Open .csml file with CIO Open .csml file with CIO - ※CIOのご利用上の注意 ...

  18. The human keratinocyte two-dimensional gel protein database (update 1995): mapping components of signal transduction pathways

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Gromov, P

    1995-01-01

    identified (protein name, organelle components, etc.) using a procedure or a combination of procedures that include (i) comigration with known human proteins, (ii) 2-D gel immunoblotting using specific antibodies, (iii) microsequencing of Coomassie Brilliant Blue stained proteins, (iv) mass spectrometry, (v......)vaccinia virus expression of full length cDNAs, and (vi) in vitro transcription/translation of full-length cDNAs. This year, special emphasis has been given to the identification of signal transduction components by using 2-D gel immunoblotting of crude keratinocyte lysates in combination with enhanced......--through a systematic study of ekeratinocytes--qualitative and quantitative information on proteins and their genes that may allow us to identify abnormal patterns of gene expression and to pinpoint signaling pathways and components affected in various skin diseases, cancer included. Udgivelsesdato: 1995-Dec...

  19. Proteomic analysis of Pinus radiata needles: 2-DE map and protein identification by LC/MS/MS and substitution-tolerant database searching.

    Science.gov (United States)

    Valledor, Luis; Castillejo, Maria A; Lenz, Christof; Rodríguez, Roberto; Cañal, Maria J; Jorrín, Jesús

    2008-07-01

    Pinus radiata is one of the most economically important forest tree species, with a worldwide production of around 370 million m (3) of wood per year. Current selection of elite trees to be used in conservation and breeding programes requires the physiological and molecular characterization of available populations. To identify key proteins related to tree growth, productivity and responses to environmental factors, a proteomic approach is being utilized. In this paper, we present the first report of the 2-DE protein reference map of physiologically mature P. radiata needles, as a basis for subsequent differential expression proteomic studies related to growth, development, biomass production and responses to stresses. After TCA/acetone protein extraction of needle tissue, 549 +/- 21 well-resolved spots were detected in Coommassie-stained gels within the 5-8 pH and 10-100 kDa M(r) ranges. The analytical and biological variance determined for 450 spots were of 31 and 42%, respectively. After LC/MS/MS analysis of in-gel tryptic digested spots, proteins were identified by using the novel Paragon algorithm that tolerates amino acid substitution in the first-pass search. It allowed the confident identification of 115 out of the 150 protein spots subjected to MS, quite unusual high percentage for a poor sequence database, as is the case of P. radiata. Proteins were classified into 12 or 18 groups based on their corresponding cell component or biological process/pathway categories, respectively. Carbohydrate metabolism and photosynthetic enzymes predominate in the 2-DE protein profile of P. radiata needles.

  20. 3D complex: a structural classification of protein complexes.

    Directory of Open Access Journals (Sweden)

    Emmanuel D Levy

    2006-11-01

    Full Text Available Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

  1. DMPD: Structure, function and regulation of the Toll/IL-1 receptor adaptor proteins. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 17667936 Structure, function and regulation of the Toll/IL-1 receptor adaptor prote... (.svg) (.html) (.csml) Show Structure, function and regulation of the Toll/IL-1 receptor adaptor proteins. ...PubmedID 17667936 Title Structure, function and regulation of the Toll/IL-1 recep

  2. AllergenOnline: A peer-reviewed, curated allergen database to assess novel food proteins for potential cross-reactivity

    NARCIS (Netherlands)

    Goodman, Richard E.; Ebisawa, Motohiro; Ferreira, Fatima; Sampson, Hugh A.; van Ree, Ronald; Vieths, Stefan; Baumert, Joseph L.; Bohle, Barbara; Lalithambika, Sreedevi; Wise, John; Taylor, Steve L.

    2016-01-01

    Increasingly regulators are demanding evaluation of potential allergenicity of foods prior to marketing. Primary risks are the transfer of allergens or potentially cross-reactive proteins into new foods. AllergenOnline was developed in 2005 as a peer-reviewed bioinformatics platform to evaluate

  3. Development of Pharmacophore Model for Indeno[1,2-b]indoles as Human Protein Kinase CK2 Inhibitors and Database Mining

    Directory of Open Access Journals (Sweden)

    Samer Haidar

    2017-01-01

    Full Text Available Protein kinase CK2, initially designated as casein kinase 2, is an ubiquitously expressed serine/threonine kinase. This enzyme, implicated in many cellular processes, is highly expressed and active in many tumor cells. A large number of compounds has been developed as inhibitors comprising different backbones. Beside others, structures with an indeno[1,2-b]indole scaffold turned out to be potent new leads. With the aim of developing new inhibitors of human protein kinase CK2, we report here on the generation of common feature pharmacophore model to further explain the binding requirements for human CK2 inhibitors. Nine common chemical features of indeno[1,2-b]indole-type CK2 inhibitors were determined using MOE software (Chemical Computing Group, Montreal, Canada. This pharmacophore model was used for database mining with the aim to identify novel scaffolds for developing new potent and selective CK2 inhibitors. Using this strategy several structures were selected by searching inside the ZINC compound database. One of the selected compounds was bikaverin (6,11-dihydroxy-3,8-dimethoxy-1-methylbenzo[b]xanthene-7,10,12-trione, a natural compound which is produced by several kinds of fungi. This compound was tested on human recombinant CK2 and turned out to be an active inhibitor with an IC50 value of 1.24 µM.

  4. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel

    2012-08-02

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. RESULTS: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php.Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by

  5. A resource for benchmarking the usefulness of protein structure models.

    Science.gov (United States)

    Carbajo, Daniel; Tramontano, Anna

    2012-08-02

    Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php.Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.

  6. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel; Tramontano, Anna

    2012-01-01

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. RESULTS: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php.Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by

  7. ENERGY AND PROTEIN REQUIREMENTS OF GROWING PELIBUEY SHEEP UNDER TROPICAL CONDITIONS ESTIMATED FROM A LITERATURE DATABASE ANALYSES

    Directory of Open Access Journals (Sweden)

    Fernando Duarte

    2012-01-01

    Full Text Available Data from previous studies were used to estimate the metabolizable energy and protein requirements for maintenance and growth and basal metabolism energy requirement of male Pelibuey sheep under tropical conditions were estimated. In addition, empty body weight and mature weight of males and female Pelibuey sheep were also estimated. Basal metabolism energy requirements were estimated with the Cornell Net Carbohydrate and Protein System – Sheep (CNCPS-S model using the a1 factor of the maintenance equation. Mature weight was estimated to be 69 kg for males and 45 kg for females. Empty body weight was estimated to be 81% of live weight. Metabolizable energy and protein requirements for growth were 0.106 Mcal MEm/kg LW0.75 and 2.4 g MP/kg LW0.75 for males. The collected information did not allowed appropriate estimation of female requirements. The basal metabolism energy requirement was estimated to be 0.039 Mcal MEm/kg LW0.75. Energy requirements for basal metabolism were lower in Pelibuey sheep than those reported for wool breeds even though their total requirements were similar.

  8. A comprehensive two-dimensional gel protein database of noncultured unfractionated normal human epidermal keratinocytes: towards an integrated approach to the study of cell proliferation, differentiation and skin diseases

    DEFF Research Database (Denmark)

    Celis, J E; Madsen, Peder; Rasmussen, H H

    1991-01-01

    A two-dimensional (2-D) gel database of cellular proteins from noncultured, unfractionated normal human epidermal keratinocytes has been established. A total of 2651 [35S]methionine-labeled cellular proteins (1868 isoelectric focusing, 783 nonequilibrium pH gradient electrophoresis) were resolved...

  9. Comprehensive two-dimensional gel protein databases offer a global approach to the analysis of human cells: the transformed amnion cells (AMA) master database and its link to genome DNA sequence data

    DEFF Research Database (Denmark)

    Celis, J E; Gesser, B; Rasmussen, H H

    1990-01-01

    , mitochondria, Golgi, ribosomes, intermediate filaments, microfilaments and microtubules), levels in fetal human tissues, partial protein sequences (containing information on 48 human proteins microsequenced so far), cell cycle-regulated proteins, proteins sensitive to interferons alpha, beta, and gamma, heat...

  10. Relational databases

    CERN Document Server

    Bell, D A

    1986-01-01

    Relational Databases explores the major advances in relational databases and provides a balanced analysis of the state of the art in relational databases. Topics covered include capture and analysis of data placement requirements; distributed relational database systems; data dependency manipulation in database schemata; and relational database support for computer graphics and computer aided design. This book is divided into three sections and begins with an overview of the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration. The

  11. The CATH database

    Directory of Open Access Journals (Sweden)

    Knudsen Michael

    2010-02-01

    Full Text Available Abstract The CATH database provides hierarchical classification of protein domains based on their folding patterns. Domains are obtained from protein structures deposited in the Protein Data Bank and both domain identification and subsequent classification use manual as well as automated procedures. The accompanying website http://www.cathdb.info provides an easy-to-use entry to the classification, allowing for both browsing and downloading of data. Here, we give a brief review of the database, its corresponding website and some related tools.

  12. Hidden Markov model approach for identifying the modular framework of the protein backbone.

    Science.gov (United States)

    Camproux, A C; Tuffery, P; Chevrolat, J P; Boisvieux, J F; Hazout, S

    1999-12-01

    The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-alpha-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions previously considered to be a single pattern. The major contribution of the HMM is that this model implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of the protein structures. Validation of the SBBs code was performed by extracting SBB series repeated in recoding proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs or series of SBBs from the protein sequences.

  13. Biofuel Database

    Science.gov (United States)

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  14. Community Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This excel spreadsheet is the result of merging at the port level of several of the in-house fisheries databases in combination with other demographic databases such...

  15. Database Administrator

    Science.gov (United States)

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  16. Practical use of chemical shift databases for protein solid-state NMR: 2D chemical shift maps and amino-acid assignment with secondary-structure information

    International Nuclear Information System (INIS)

    Fritzsching, K. J.; Yang, Y.; Schmidt-Rohr, K.; Hong Mei

    2013-01-01

    We introduce a Python-based program that utilizes the large database of 13 C and 15 N chemical shifts in the Biological Magnetic Resonance Bank to rapidly predict the amino acid type and secondary structure from correlated chemical shifts. The program, called PACSYlite Unified Query (PLUQ), is designed to help assign peaks obtained from 2D 13 C– 13 C, 15 N– 13 C, or 3D 15 N– 13 C– 13 C magic-angle-spinning correlation spectra. We show secondary-structure specific 2D 13 C– 13 C correlation maps of all twenty amino acids, constructed from a chemical shift database of 262,209 residues. The maps reveal interesting conformation-dependent chemical shift distributions and facilitate searching of correlation peaks during amino-acid type assignment. Based on these correlations, PLUQ outputs the most likely amino acid types and the associated secondary structures from inputs of experimental chemical shifts. We test the assignment accuracy using four high-quality protein structures. Based on only the Cα and Cβ chemical shifts, the highest-ranked PLUQ assignments were 40–60 % correct in both the amino-acid type and the secondary structure. For three input chemical shifts (CO–Cα–Cβ or N–Cα–Cβ), the first-ranked assignments were correct for 60 % of the residues, while within the top three predictions, the correct assignments were found for 80 % of the residues. PLUQ and the chemical shift maps are expected to be useful at the first stage of sequential assignment, for combination with automated sequential assignment programs, and for highly disordered proteins for which secondary structure analysis is the main goal of structure determination.

  17. Practical use of chemical shift databases for protein solid-state NMR: 2D chemical shift maps and amino-acid assignment with secondary-structure information

    Energy Technology Data Exchange (ETDEWEB)

    Fritzsching, K. J.; Yang, Y.; Schmidt-Rohr, K.; Hong Mei, E-mail: mhong@iastate.edu [Iowa State University, Department of Chemistry (United States)

    2013-06-15

    We introduce a Python-based program that utilizes the large database of {sup 13}C and {sup 15}N chemical shifts in the Biological Magnetic Resonance Bank to rapidly predict the amino acid type and secondary structure from correlated chemical shifts. The program, called PACSYlite Unified Query (PLUQ), is designed to help assign peaks obtained from 2D {sup 13}C-{sup 13}C, {sup 15}N-{sup 13}C, or 3D {sup 15}N-{sup 13}C-{sup 13}C magic-angle-spinning correlation spectra. We show secondary-structure specific 2D {sup 13}C-{sup 13}C correlation maps of all twenty amino acids, constructed from a chemical shift database of 262,209 residues. The maps reveal interesting conformation-dependent chemical shift distributions and facilitate searching of correlation peaks during amino-acid type assignment. Based on these correlations, PLUQ outputs the most likely amino acid types and the associated secondary structures from inputs of experimental chemical shifts. We test the assignment accuracy using four high-quality protein structures. Based on only the C{alpha} and C{beta} chemical shifts, the highest-ranked PLUQ assignments were 40-60 % correct in both the amino-acid type and the secondary structure. For three input chemical shifts (CO-C{alpha}-C{beta} or N-C{alpha}-C{beta}), the first-ranked assignments were correct for 60 % of the residues, while within the top three predictions, the correct assignments were found for 80 % of the residues. PLUQ and the chemical shift maps are expected to be useful at the first stage of sequential assignment, for combination with automated sequential assignment programs, and for highly disordered proteins for which secondary structure analysis is the main goal of structure determination.

  18. Immunoproteomic analysis of the protein repertoire of unsporulated Eimeria tenella oocysts

    Directory of Open Access Journals (Sweden)

    Zhang Zhenchao

    2017-01-01

    Full Text Available The apicomplexan protozoans Eimeria spp. cause coccidioses, the most common intestinal diseases in chickens. Coccidiosis is associated with significant animal welfare issues and has a high economic impact on the poultry industry. Lack of a full understanding of immunogenic molecules and their precise functions involved in the Eimeria life cycles may limit development of effective vaccines and drug therapies. In this study, immunoproteomic approaches were used to define the antigenic protein repertoire from the total proteins of unsporulated Eimeria tenella oocysts. Approximately 101 protein spots were recognized in sera from chickens infected experimentally with E. tenella. Forty-six spots of unsporulated oocysts were excised from preparative gels and identified by matrix-assisted laser desorption ionization time-of-flight MS (MALDI-TOF-MS and MALDI-TOF/TOF-MS. For unsporulated oocysts, 13 known proteins of E. tenella and 17 homologous proteins to other apicomplexan or protozoan parasites were identified using the ‘Mascot’ server. The remaining proteins were searched against the E. tenella protein sequence database using the ‘Mascot in-house’ search engine (version 2.1 in automated mode, and 12 unknown proteins were identified. The amino acid sequences of the unknown proteins were searched using BLAST against non-redundant sequence databases (NCBI, and 9 homologous proteins in unsporulated oocyst were found homologous to proteins of other apicomplexan parasites. These findings may provide useful evidence for understanding parasite biology, pathogenesis, immunogenicity and immune evasion mechanisms of E. tenella.

  19. Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation.

    Science.gov (United States)

    Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D

    2018-01-04

    The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  20. CIG-DB: the database for human or mouse immunoglobulin and T cell receptor genes available for cancer studies

    Directory of Open Access Journals (Sweden)

    Furue Motoki

    2010-07-01

    Full Text Available Abstract Background Immunoglobulin (IG or antibody and the T-cell receptor (TR are pivotal proteins in the immune system of higher organisms. In cancer immunotherapy, the immune responses mediated by tumor-epitope-binding IG or TR play important roles in anticancer effects. Although there are public databases specific for immunological genes, their contents have not been associated with clinical studies. Therefore, we developed an integrated database of IG/TR data reported in cancer studies (the Cancer-related Immunological Gene Database [CIG-DB]. Description This database is designed as a platform to explore public human and murine IG/TR genes sequenced in cancer studies. A total of 38,308 annotation entries for IG/TR proteins were collected from GenBank/DDBJ/EMBL and the Protein Data Bank, and 2,740 non-redundant corresponding MEDLINE references were appended. Next, we filtered the MEDLINE texts by MeSH terms, titles, and abstracts containing keywords related to cancer. After we performed a manual check, we classified the protein entries into two groups: 611 on cancer therapy (Group I and 1,470 on hematological tumors (Group II. Thus, a total of 2,081 cancer-related IG and TR entries were tabularized. To effectively classify future entries, we developed a computational method based on text mining and canonical discriminant analysis by parsing MeSH/title/abstract words. We performed a leave-one-out cross validation for the method, which showed high accuracy rates: 94.6% for IG references and 94.7% for TR references. We also collected 920 epitope sequences bound with IG/TR. The CIG-DB is equipped with search engines for amino acid sequences and MEDLINE references, sequence analysis tools, and a 3D viewer. This database is accessible without charge or registration at http://www.scchr-cigdb.jp/, and the search results are freely downloadable. Conclusions The CIG-DB serves as a bridge between immunological gene data and cancer studies, presenting

  1. Mycobacteriophage genome database.

    Science.gov (United States)

    Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

    2011-01-01

    Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises of 6086 genes from 64 mycobacteriophages classified into 72 families based on ACLAME database. Manual curation was aided by information available from public databases which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.

  2. JAFA: a protein function annotation meta-server

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Godzik, Adam

    2006-01-01

    Annotations, or JAFA server. JAFA queries several function prediction servers with a protein sequence and assembles the returned predictions in a legible, non-redundant format. In this manner, JAFA combines the predictions of several servers to provide a comprehensive view of what are the predicted functions...

  3. Federal databases

    International Nuclear Information System (INIS)

    Welch, M.J.; Welles, B.W.

    1988-01-01

    Accident statistics on all modes of transportation are available as risk assessment analytical tools through several federal agencies. This paper reports on the examination of the accident databases by personal contact with the federal staff responsible for administration of the database programs. This activity, sponsored by the Department of Energy through Sandia National Laboratories, is an overview of the national accident data on highway, rail, air, and marine shipping. For each mode, the definition or reporting requirements of an accident are determined and the method of entering the accident data into the database is established. Availability of the database to others, ease of access, costs, and who to contact were prime questions to each of the database program managers. Additionally, how the agency uses the accident data was of major interest

  4. Comprehensive MALDI-TOF biotyping of the non-redundant Harvard Pseudomonas aeruginosa PA14 transposon insertion mutant library.

    Science.gov (United States)

    Oumeraci, Tonio; Jensen, Vanessa; Talbot, Steven R; Hofmann, Winfried; Kostrzewa, Markus; Schlegelberger, Brigitte; von Neuhoff, Nils; Häussler, Susanne

    2015-01-01

    Pseudomonas aeruginosa is a gram-negative bacterium that is ubiquitously present in the aerobic biosphere. As an antibiotic-resistant facultative pathogen, it is a major cause of hospital-acquired infections. Its rapid and accurate identification is crucial in clinical and therapeutic environments. In a large-scale MALDI-TOF mass spectrometry-based screen of the Harvard transposon insertion mutant library of P. aeruginosa strain PA14, intact-cell proteome profile spectra of 5547 PA14 transposon mutants exhibiting a plethora of different phenotypes were acquired and analyzed. Of all P. aeruginosa PA14 mutant profiles 99.7% were correctly identified as P. aeruginosa with the Biotyper software on the species level. On the strain level, 99.99% of the profiles were mapped to five different individual P. aeruginosa Biotyper database entries. A principal component analysis-based approach was used to determine the most important discriminatory mass features between these Biotyper groups. Although technical replicas were consistently categorized to specific Biotyper groups in 94.2% of the mutant profiles, biological replicas were not, indicating that the distinct proteotypes are affected by growth conditions. The PA14 mutant profile collection presented here constitutes the largest coherent P. aeruginosa MALDI-TOF spectral dataset publicly available today. Transposon insertions in thousands of different P. aeruginosa genes did not affect species identification from MALDI-TOF mass spectra, clearly demonstrating the robustness of the approach. However, the assignment of the individual spectra to sub-groups proved to be non-consistent in biological replicas, indicating that the differentiation between biotyper groups in this nosocomial pathogen is unassured.

  5. FastBLAST: homology relationships for millions of proteins.

    Directory of Open Access Journals (Sweden)

    Morgan N Price

    Full Text Available BACKGROUND: All-versus-all BLAST, which searches for homologous pairs of sequences in a database of proteins, is used to identify potential orthologs, to find new protein families, and to provide rapid access to these homology relationships. As DNA sequencing accelerates and data sets grow, all-versus-all BLAST has become computationally demanding. METHODOLOGY/PRINCIPAL FINDINGS: We present FastBLAST, a heuristic replacement for all-versus-all BLAST that relies on alignments of proteins to known families, obtained from tools such as PSI-BLAST and HMMer. FastBLAST avoids most of the work of all-versus-all BLAST by taking advantage of these alignments and by clustering similar sequences. FastBLAST runs in two stages: the first stage identifies additional families and aligns them, and the second stage quickly identifies the homologs of a query sequence, based on the alignments of the families, before generating pairwise alignments. On 6.53 million proteins from the non-redundant Genbank database ("NR", FastBLAST identifies new families 25 times faster than all-versus-all BLAST. Once the first stage is completed, FastBLAST identifies homologs for the average query in less than 5 seconds (8.6 times faster than BLAST and gives nearly identical results. For hits above 70 bits, FastBLAST identifies 98% of the top 3,250 hits per query. CONCLUSIONS/SIGNIFICANCE: FastBLAST enables research groups that do not have supercomputers to analyze large protein sequence data sets. FastBLAST is open source software and is available at http://microbesonline.org/fastblast.

  6. ValidatorDB: database of up-to-date validation results for ligands and non-standard residues from the Protein Data Bank.

    Science.gov (United States)

    Sehnal, David; Svobodová Vařeková, Radka; Pravda, Lukáš; Ionescu, Crina-Maria; Geidl, Stanislav; Horský, Vladimír; Jaiswal, Deepti; Wimmerová, Michaela; Koča, Jaroslav

    2015-01-01

    Following the discovery of serious errors in the structure of biomacromolecules, structure validation has become a key topic of research, especially for ligands and non-standard residues. ValidatorDB (freely available at http://ncbr.muni.cz/ValidatorDB) offers a new step in this direction, in the form of a database of validation results for all ligands and non-standard residues from the Protein Data Bank (all molecules with seven or more heavy atoms). Model molecules from the wwPDB Chemical Component Dictionary are used as reference during validation. ValidatorDB covers the main aspects of validation of annotation, and additionally introduces several useful validation analyses. The most significant is the classification of chirality errors, allowing the user to distinguish between serious issues and minor inconsistencies. Other such analyses are able to report, for example, completely erroneous ligands, alternate conformations or complete identity with the model molecules. All results are systematically classified into categories, and statistical evaluations are performed. In addition to detailed validation reports for each molecule, ValidatorDB provides summaries of the validation results for the entire PDB, for sets of molecules sharing the same annotation (three-letter code) or the same PDB entry, and for user-defined selections of annotations or PDB entries. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Database Replication

    CERN Document Server

    Kemme, Bettina

    2010-01-01

    Database replication is widely used for fault-tolerance, scalability and performance. The failure of one database replica does not stop the system from working as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed clients, if data copies are located close to clients. Despite its advantages, replication is not a straightforward technique to apply, and

  8. Refactoring databases evolutionary database design

    CERN Document Server

    Ambler, Scott W

    2006-01-01

    Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems. Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies. This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone databas...

  9. Integration of multiple biological features yields high confidence human protein interactome.

    Science.gov (United States)

    Karagoz, Kubra; Sevimoglu, Tuba; Arga, Kazim Yalcin

    2016-08-21

    The biological function of a protein is usually determined by its physical interaction with other proteins. Protein-protein interactions (PPIs) are identified through various experimental methods and are stored in curated databases. The noisiness of the existing PPI data is evident, and it is essential that a more reliable data is generated. Furthermore, the selection of a set of PPIs at different confidence levels might be necessary for many studies. Although different methodologies were introduced to evaluate the confidence scores for binary interactions, a highly reliable, almost complete PPI network of Homo sapiens is not proposed yet. The quality and coverage of human protein interactome need to be improved to be used in various disciplines, especially in biomedicine. In the present work, we propose an unsupervised statistical approach to assign confidence scores to PPIs of H. sapiens. To achieve this goal PPI data from six different databases were collected and a total of 295,288 non-redundant interactions between 15,950 proteins were acquired. The present scoring system included the context information that was assigned to PPIs derived from eight biological attributes. A high confidence network, which included 147,923 binary interactions between 13,213 proteins, had scores greater than the cutoff value of 0.80, for which sensitivity, specificity, and coverage were 94.5%, 80.9%, and 82.8%, respectively. We compared the present scoring method with others for evaluation. Reducing the noise inherent in experimental PPIs via our scoring scheme increased the accuracy significantly. As it was demonstrated through the assessment of process and cancer subnetworks, this study allows researchers to construct and analyze context-specific networks via valid PPI sets and one can easily achieve subnetworks around proteins of interest at a specified confidence level. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. RDD Databases

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This database was established to oversee documents issued in support of fishery research activities including experimental fishing permits (EFP), letters of...

  11. Snowstorm Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Snowstorm Database is a collection of over 500 snowstorms dating back to 1900 and updated operationally. Only storms having large areas of heavy snowfall (10-20...

  12. Dealer Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The dealer reporting databases contain the primary data reported by federally permitted seafood dealers in the northeast. Electronic reporting was implemented May 1,...

  13. National database

    DEFF Research Database (Denmark)

    Kristensen, Helen Grundtvig; Stjernø, Henrik

    1995-01-01

    Artikel om national database for sygeplejeforskning oprettet på Dansk Institut for Sundheds- og Sygeplejeforskning. Det er målet med databasen at samle viden om forsknings- og udviklingsaktiviteter inden for sygeplejen.......Artikel om national database for sygeplejeforskning oprettet på Dansk Institut for Sundheds- og Sygeplejeforskning. Det er målet med databasen at samle viden om forsknings- og udviklingsaktiviteter inden for sygeplejen....

  14. The MRC-5 human embryonal lung fibroblast two-dimensional gel cellular protein database: quantitative identification of polypeptides whose relative abundance differs between quiescent, proliferating and SV40 transformed cells

    DEFF Research Database (Denmark)

    Celis, J E; Dejgaard, K; Madsen, Peder

    1990-01-01

    interferon-induced proteins, were not detected in the master MRC-5 images. The identity of 36 of the transformation-sensitive proteins whose levels are up or down regulated by two times or more was determined and additional information can be transferred from the master transformed human epithelial amnion......, this comprehensive database will outline an integrated picture of the expression levels and properties of the thousands of protein components of organelles, pathways and cytoskeletal systems that may be directly or indirectly involved in properties associated with the transformed state. Udgivelsesdato: 1990-Dec...

  15. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles.

    Science.gov (United States)

    Mathelier, Anthony; Fornes, Oriol; Arenillas, David J; Chen, Chih-Yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W

    2016-01-04

    JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Comparative proteomic analysis of plasma membrane proteins between human osteosarcoma and normal osteoblastic cell lines

    International Nuclear Information System (INIS)

    Zhang, Zhiyu; Ma, Fang; Cai, Zhengdong; Zhang, Lijun; Hua, Yingqi; Jia, Xiaofang; Li, Jian; Hu, Shuo; Peng, Xia; Yang, Pengyuan; Sun, Mengxiong

    2010-01-01

    Osteosarcoma (OS) is the most common primary malignant tumor of bone in children and adolescents. However, the knowledge in diagnostic modalities has progressed less. To identify new biomarkers for the early diagnosis of OS as well as for potential novel therapeutic candidates, we performed a sub-cellular comparative proteomic research. An osteosarcoma cell line (MG-63) and human osteoblastic cells (hFOB1.19) were used as our comparative model. Plasma membrane (PM) was obtained by aqueous two-phase partition. Proteins were analyzed through iTRAQ-based quantitative differential LC/MS/MS. The location and function of differential proteins were analyzed through GO database. Protein-protein interaction was examined through String software. One of differentially expressed proteins was verified by immunohistochemistry. 342 non-redundant proteins were identified, 68 of which were differentially expressed with 1.5-fold difference, with 25 up-regulated and 43 down-regulated. Among those differential proteins, 69% ware plasma membrane, which are related to the biological processes of binding, cell structure, signal transduction, cell adhesion, etc., and interaction with each other. One protein--CD151 located in net nodes was verified to be over-expressed in osteosarcoma tissue by immunohistochemistry. It is the first time to use plasma membrane proteomics for studying the OS membrane proteins according to our knowledge. We generated preliminary but comprehensive data about membrane protein of osteosarcoma. Among these, CD151 was further validated in patient samples, and this small molecule membrane might be a new target for OS research. The plasma membrane proteins identified in this study may provide new insight into osteosarcoma biology and potential diagnostic and therapeutic biomarkers

  17. Experiment Databases

    Science.gov (United States)

    Vanschoren, Joaquin; Blockeel, Hendrik

    Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.

  18. Refinement of protein termini in template-based modeling using conformational space annealing.

    Science.gov (United States)

    Park, Hahnbeom; Ko, Junsu; Joo, Keehyoung; Lee, Julian; Seok, Chaok; Lee, Jooyoung

    2011-09-01

    The rapid increase in the number of experimentally determined protein structures in recent years enables us to obtain more reliable protein tertiary structure models than ever by template-based modeling. However, refinement of template-based models beyond the limit available from the best templates is still needed for understanding protein function in atomic detail. In this work, we develop a new method for protein terminus modeling that can be applied to refinement of models with unreliable terminus structures. The energy function for terminus modeling consists of both physics-based and knowledge-based potential terms with carefully optimized relative weights. Effective sampling of both the framework and terminus is performed using the conformational space annealing technique. This method has been tested on a set of termini derived from a nonredundant structure database and two sets of termini from the CASP8 targets. The performance of the terminus modeling method is significantly improved over our previous method that does not employ terminus refinement. It is also comparable or superior to the best server methods tested in CASP8. The success of the current approach suggests that similar strategy may be applied to other types of refinement problems such as loop modeling or secondary structure rearrangement. Copyright © 2011 Wiley-Liss, Inc.

  19. BioMagResBank database with sets of experimental NMR constraints corresponding to the structures of over 1400 biomolecules deposited in the Protein Data Bank

    International Nuclear Information System (INIS)

    Doreleijers, Jurgen F.; Mading, Steve; Maziuk, Dimitri; Sojourner, Kassandra; Yin Lei; Zhu Jun; Markley, John L.; Ulrich, Eldon L.

    2003-01-01

    Experimental constraints associated with NMR structures are available from the Protein Data Bank (PDB) in the form of 'Magnetic Resonance' (MR) files. These files contain multiple types of data concatenated without boundary markers and are difficult to use for further research. Reported here are the results of a project initiated to annotate, archive, and disseminate these data to the research community from a searchable resource in a uniform format. The MR files from a set of 1410 NMR structures were analyzed and their original constituent data blocks annotated as to data type using a semi-automated protocol. A new software program called Wattos was then used to parse and archive the data in a relational database. From the total number of MR file blocks annotated as constraints, it proved possible to parse 84% (3337/3975). The constraint lists that were parsed correspond to three data types (2511 distance, 788 dihedral angle, and 38 residual dipolar couplings lists) from the three most popular software packages used in NMR structure determination: XPLOR/CNS (2520 lists), DISCOVER (412 lists), and DYANA/DIANA (405 lists). These constraints were then mapped to a developmental version of the BioMagResBank (BMRB) data model. A total of 31 data types originating from 16 programs have been classified, with the NOE distance constraint being the most commonly observed. The results serve as a model for the development of standards for NMR constraint deposition in computer-readable form. The constraints are updated regularly and are available from the BMRB web site (http://www.bmrb.wisc.edu)

  20. Building a database for brain 18 kDa translocator protein imaged using [11C]PBR28 in healthy subjects.

    Science.gov (United States)

    Paul, Soumen; Gallagher, Evan; Liow, Jeih-San; Mabins, Sanche; Henry, Katharine; Zoghbi, Sami S; Gunn, Roger N; Kreisl, William C; Richards, Erica M; Zanotti-Fregonara, Paolo; Morse, Cheryl L; Hong, Jinsoo; Kowalski, Aneta; Pike, Victor W; Innis, Robert B; Fujita, Masahiro

    2018-01-01

    Translocator protein 18 kDa (TSPO) has been widely imaged as a marker of neuroinflammation using several radioligands, including [ 11 C]PBR28. In order to study the effects of age, sex, and obesity on TSPO binding and to determine whether this binding can be accurately assessed using fewer radio high-performance liquid chromatography (radio-HPLC) measurements of arterial blood samples, we created a database of 48 healthy subjects who had undergone [ 11 C]PBR28 scans (23 high-affinity binders (HABs) and 25 mixed-affinity binders (MABs), 20 F/28 M, age: 40.6 ± 16.8 years). After analysis by Logan plot using 23 metabolite-corrected arterial samples, total distribution volume ( V T ) was found to be 1.2-fold higher in HABs across all brain regions. Additionally, the polymorphism plot estimated nondisplaceable uptake ( V ND ) as 1.40 mL · cm -3 , which generated a specific-to-nondisplaceable ratio ( BP ND ) of 1.6 ± 0.6 in HABs and 1.1 ± 0.6 in MABs. V T increased significantly with age in nearly all regions and was well estimated with radio-HPLC measurements from six arterial samples. However, V T did not correlate with body mass index and was not affected by sex. These results underscore which patient characteristics should be accounted for during [ 11 C]PBR28 studies and suggest ways to perform such studies more easily and with fewer blood samples.

  1. A FREQUENCY-BASED LINGUISTIC APPROACH TO PROTEIN DECODING AND DESIGN: SIMPLE CONCEPTS, DIVERSE APPLICATIONS, AND THE SCS PACKAGE

    Directory of Open Access Journals (Sweden)

    Kenta Motomura

    2013-02-01

    Full Text Available Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions and dissimilarities (e.g., behaviors of low-rank samples between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs.

  2. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package

    Science.gov (United States)

    Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.

    2013-01-01

    Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703

  3. ProBiS-2012: web server and web services for detection of structurally similar binding sites in proteins.

    Science.gov (United States)

    Konc, Janez; Janezic, Dusanka

    2012-07-01

    The ProBiS web server is a web server for detection of structurally similar binding sites in the PDB and for local pairwise alignment of protein structures. In this article, we present a new version of the ProBiS web server that is 10 times faster than earlier versions, due to the efficient parallelization of the ProBiS algorithm, which now allows significantly faster comparison of a protein query against the PDB and reduces the calculation time for scanning the entire PDB from hours to minutes. It also features new web services, and an improved user interface. In addition, the new web server is united with the ProBiS-Database and thus provides instant access to pre-calculated protein similarity profiles for over 29 000 non-redundant protein structures. The ProBiS web server is particularly adept at detection of secondary binding sites in proteins. It is freely available at http://probis.cmm.ki.si/old-version, and the new ProBiS web server is at http://probis.cmm.ki.si.

  4. gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

    Science.gov (United States)

    Nakagawa, So; Takahashi, Mahoko Ueda

    2016-01-01

    In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species.Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.

  5. Non-redundant roles of phosphoinositide 3-kinase isoforms alpha and beta in glycoprotein VI-induced platelet signaling and thrombus formation.

    Science.gov (United States)

    Gilio, Karen; Munnix, Imke C A; Mangin, Pierre; Cosemans, Judith M E M; Feijge, Marion A H; van der Meijden, Paola E J; Olieslagers, Servé; Chrzanowska-Wodnicka, Magdalena B; Lillian, Rivka; Schoenwaelder, Simone; Koyasu, Shigeo; Sage, Stewart O; Jackson, Shaun P; Heemskerk, Johan W M

    2009-12-04

    Platelets are activated by adhesion to vascular collagen via the immunoglobulin receptor, glycoprotein VI (GPVI). This causes potent signaling toward activation of phospholipase Cgamma2, which bears similarity to the signaling pathway evoked by T- and B-cell receptors. Phosphoinositide 3-kinase (PI3K) plays an important role in collagen-induced platelet activation, because this activity modulates the autocrine effects of secreted ADP. Here, we identified the PI3K isoforms directly downstream of GPVI in human and mouse platelets and determined their role in GPVI-dependent thrombus formation. The targeting of platelet PI3Kalpha or -beta strongly and selectively suppressed GPVI-induced Ca(2+) mobilization and inositol 1,4,5-triphosphate production, thus demonstrating enhancement of phospholipase Cgamma2 by PI3Kalpha/beta. That PI3Kalpha and -beta have a non-redundant function in GPVI-induced platelet activation and thrombus formation was concluded from measurements of: (i) serine phosphorylation of Akt, (ii) dense granule secretion, (iii) intracellular Ca(2+) increases and surface expression of phosphatidylserine under flow, and (iv) thrombus formation, under conditions where PI3Kalpha/beta was blocked or p85alpha was deficient. In contrast, GPVI-induced platelet activation was insensitive to inhibition or deficiency of PI3Kdelta or -gamma. Furthermore, PI3Kalpha/beta, but not PI3Kgamma, contributed to GPVI-induced Rap1b activation and, surprisingly, also to Rap1b-independent platelet activation via GPVI. Together, these findings demonstrate that both PI3Kalpha and -beta isoforms are required for full GPVI-dependent platelet Ca(2+) signaling and thrombus formation, partly independently of Rap1b. This provides a new mechanistic explanation for the anti-thrombotic effect of PI3K inhibition and makes PI3Kalpha an interesting new target for anti-platelet therapy.

  6. Mapping and identification of HeLa cell proteins separated by immobilized pH-gradient two-dimensional gel electrophoresis and construction of a two-dimensional polyacrylamide gel electrophoresis database

    DEFF Research Database (Denmark)

    Shaw, AC; Rossel Larsen, M; Roepstorff, P

    1999-01-01

    The HeLa cell line, a human adenocarcinoma, is used in many research fields, since it can be infected with a wide range of viruses and intracellular bacteria. Therefore, the mapping of HeLa cell proteins is useful for the investigation of parasite host cell interactions. Because of the recent imp...... these and future data accessible for interlaboratory comparison, we constructed a 2-D PAGE database on the World Wide Web....... the mapping of [35S]methionine/cysteine-labeled HeLa cell proteins with the 2-D PAGE (IPG)-system, using matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS) and N-terminal sequencing for protein identification. To date 21 proteins have been identified and mapped. In order to make...

  7. Database Description - eSOL | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name eSOL Alternative nam...eator Affiliation: The Research and Development of Biological Databases Project, National Institute of Genet...nology 4259 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501 Japan Email: Tel.: +81-45-924-5785 Database... classification Protein sequence databases - Protein properties Organism Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database...i U S A. 2009 Mar 17;106(11):4201-6. External Links: Original website information Database maintenance site

  8. 2P2I HUNTER: a tool for filtering orthosteric protein-protein interaction modulators via a dedicated support vector machine.

    Science.gov (United States)

    Hamon, Véronique; Bourgeas, Raphael; Ducrot, Pierre; Theret, Isabelle; Xuereb, Laura; Basse, Marie Jeanne; Brunel, Jean Michel; Combes, Sebastien; Morelli, Xavier; Roche, Philippe

    2014-01-06

    Over the last 10 years, protein-protein interactions (PPIs) have shown increasing potential as new therapeutic targets. As a consequence, PPIs are today the most screened target class in high-throughput screening (HTS). The development of broad chemical libraries dedicated to these particular targets is essential; however, the chemical space associated with this 'high-hanging fruit' is still under debate. Here, we analyse the properties of 40 non-redundant small molecules present in the 2P2I database (http://2p2idb.cnrs-mrs.fr/) to define a general profile of orthosteric inhibitors and propose an original protocol to filter general screening libraries using a support vector machine (SVM) with 11 standard Dragon molecular descriptors. The filtering protocol has been validated using external datasets from PubChem BioAssay and results from in-house screening campaigns. This external blind validation demonstrated the ability of the SVM model to reduce the size of the filtered chemical library by eliminating up to 96% of the compounds as well as enhancing the proportion of active compounds by up to a factor of 8. We believe that the resulting chemical space identified in this paper will provide the scientific community with a concrete support to search for PPI inhibitors during HTS campaigns.

  9. A resource for benchmarking the usefulness of protein structure models

    Directory of Open Access Journals (Sweden)

    Carbajo Daniel

    2012-08-01

    Full Text Available Abstract Background Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. Results This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. Conclusions The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s: Platform independent. Programming language: Perl-BioPerl (program; mySQL, Perl DBI and DBD modules (database; php, JavaScript, Jmol scripting (web server. Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet and PSAIA. License: Free. Any

  10. The STRING database in 2017

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Morris, John H; Cook, Helen

    2017-01-01

    A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organi......A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number...... of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known...... pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer...

  11. Stackfile Database

    Science.gov (United States)

    deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

    2013-01-01

    This software provides storage retrieval and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data [e.g., Gravity Recovery And Climate Experiment -- GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

  12. Systematization of the protein sequence diversity in enzymes related to secondary metabolic pathways in plants, in the context of big data biology inspired by the KNApSAcK motorcycle database.

    Science.gov (United States)

    Ikeda, Shun; Abe, Takashi; Nakamura, Yukiko; Kibinge, Nelson; Hirai Morita, Aki; Nakatani, Atsushi; Ono, Naoaki; Ikemura, Toshimichi; Nakamura, Kensuke; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

    2013-05-01

    Biology is increasingly becoming a data-intensive science with the recent progress of the omics fields, e.g. genomics, transcriptomics, proteomics and metabolomics. The species-metabolite relationship database, KNApSAcK Core, has been widely utilized and cited in metabolomics research, and chronological analysis of that research work has helped to reveal recent trends in metabolomics research. To meet the needs of these trends, the KNApSAcK database has been extended by incorporating a secondary metabolic pathway database called Motorcycle DB. We examined the enzyme sequence diversity related to secondary metabolism by means of batch-learning self-organizing maps (BL-SOMs). Initially, we constructed a map by using a big data matrix consisting of the frequencies of all possible dipeptides in the protein sequence segments of plants and bacteria. The enzyme sequence diversity of the secondary metabolic pathways was examined by identifying clusters of segments associated with certain enzyme groups in the resulting map. The extent of diversity of 15 secondary metabolic enzyme groups is discussed. Data-intensive approaches such as BL-SOM applied to big data matrices are needed for systematizing protein sequences. Handling big data has become an inevitable part of biology.

  13. DMPD: Manipulation of mitogen-activated protein kinase/nuclear factor-kappaB-signalingcascades during intracellular Toxoplasma gondii infection. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 15361242 Manipulation of mitogen-activated protein kinase/nuclear factor-kappaB-sig...mmunol Rev. 2004 Oct;201:191-205. (.png) (.svg) (.html) (.csml) Show Manipulation of mitogen-activated prote... gondii infection. PubmedID 15361242 Title Manipulation of mitogen-activated protein kinase/nuclear factor-k

  14. IntPath--an integrated pathway gene relationship database for model organisms and important pathogens.

    Science.gov (United States)

    Zhou, Hufeng; Jin, Jingjing; Zhang, Haojun; Yi, Bo; Wozniak, Michal; Wong, Limsoon

    2012-01-01

    Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomprehensive data from different databases. In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. Sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation are involved to get rid of errors and noises from source data (e.g., the gene ID errors in WikiPathways and

  15. Olfactory Receptor Database: a sensory chemoreceptor resource

    OpenAIRE

    Skoufos, Emmanouil; Marenco, Luis; Nadkarni, Prakash M.; Miller, Perry L.; Shepherd, Gordon M.

    2000-01-01

    The Olfactory Receptor Database (ORDB) is a WWW-accessible database that has been expanded from an olfactory receptor resource to a chemoreceptor resource. It stores data on six classes of G-protein-coupled sensory chemoreceptors: (i) olfactory receptor-like proteins, (ii) vomeronasal receptors, (iii) insect olfactory receptors, (iv) worm chemoreceptors, (v) taste papilla receptors and (vi) fungal pheromone receptors. A complementary database of the ligands of these receptors (OdorDB) has bee...

  16. Extending Database Integration Technology

    National Research Council Canada - National Science Library

    Buneman, Peter

    1999-01-01

    Formal approaches to the semantics of databases and database languages can have immediate and practical consequences in extending database integration technologies to include a vastly greater range...

  17. DMPD: Function of lipopolysaccharide (LPS)-binding protein (LBP) and CD14, thereceptor for LPS/LBP complexes: a short review. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available eptor for LPS/LBP complexes: a short review. Schumann RR. Res Immunol. 1992 Jan;143(1):11-5. (.png) (.svg) (...ride (LPS)-binding protein (LBP) and CD14, thereceptor for LPS/LBP complexes: a short review. Authors Schuma.../LBP complexes: a short review. PubmedID 1373512 Title Function of lipopolysaccha....html) (.csml) Show Function of lipopolysaccharide (LPS)-binding protein (LBP) and CD14, thereceptor for LPS

  18. ECMDB: The E. coli Metabolome Database

    OpenAIRE

    Guo, An Chi; Jewison, Timothy; Wilson, Michael; Liu, Yifeng; Knox, Craig; Djoumbou, Yannick; Lo, Patrick; Mandal, Rupasri; Krishnamurthy, Ram; Wishart, David S.

    2012-01-01

    The Escherichia coli Metabolome Database (ECMDB, http://www.ecmdb.ca) is a comprehensively annotated metabolomic database containing detailed information about the metabolome of E. coli (K-12). Modelled closely on the Human and Yeast Metabolome Databases, the ECMDB contains >2600 metabolites with links to ?1500 different genes and proteins, including enzymes and transporters. The information in the ECMDB has been collected from dozens of textbooks, journal articles and electronic databases. E...

  19. Proteomic analysis of cerebrospinal fluid from children with central nervous system tumors identifies candidate proteins relating to tumor metastatic spread.

    Science.gov (United States)

    Spreafico, Filippo; Bongarzone, Italia; Pizzamiglio, Sara; Magni, Ruben; Taverna, Elena; De Bortoli, Maida; Ciniselli, Chiara M; Barzanò, Elena; Biassoni, Veronica; Luchini, Alessandra; Liotta, Lance A; Zhou, Weidong; Signore, Michele; Verderio, Paolo; Massimino, Maura

    2017-07-11

    Central nervous system (CNS) tumors are the most common solid tumors in childhood. Since the sensitivity of combined cerebrospinal fluid (CSF) cytology and radiological neuroimaging in detecting meningeal metastases remains relatively low, we sought to characterize the CSF proteome of patients with CSF tumors to identify biomarkers predictive of metastatic spread. CSF samples from 27 children with brain tumors and 13 controls (extra-CNS non-Hodgkin lymphoma) were processed using core-shell hydrogel nanoparticles, and analyzed with reverse-phase liquid chromatography/electrospray tandem mass spectrometry (LC-MS/MS). Candidate proteins were identified with Fisher's exact test and/or a univariate logistic regression model. Reverse phase protein array (RPPA), Western blot (WB), and ELISA were used in the training set and in an independent set of CFS samples (60 cases, 14 controls) to validate our discovery findings. Among the 558 non-redundant proteins identified by LC-MS/MS, 147 were missing from the CSF database at http://www.biosino.org. Fourteen of the 26 final top-candidate proteins were chosen for validation with WB, RPPA and ELISA methods. Six proteins (type 1 collagen, insulin-like growth factor binding protein 4, procollagen C-endopeptidase enhancer 1, glial cell-line derived neurotrophic factor receptor α2, inter-alpha-trypsin inhibitor heavy chain 4, neural proliferation and differentiation control protein-1) revealed the ability to discriminate metastatic cases from controls. Combining a unique dataset of CSFs from pediatric CNS tumors with a novel enabling nanotechnology led us to identify CSF proteins potentially related to metastatic status.

  20. Building alternate protein structures using the elastic network model.

    Science.gov (United States)

    Yang, Qingyi; Sharp, Kim A

    2009-02-15

    We describe a method for efficiently generating ensembles of alternate, all-atom protein structures that (a) differ significantly from the starting structure, (b) have good stereochemistry (bonded geometry), and (c) have good steric properties (absence of atomic overlap). The method uses reconstruction from a series of backbone framework structures that are obtained from a modified elastic network model (ENM) by perturbation along low-frequency normal modes. To ensure good quality backbone frameworks, the single force parameter ENM is modified by introducing two more force parameters to characterize the interaction between the consecutive carbon alphas and those within the same secondary structure domain. The relative stiffness of the three parameters is parameterized to reproduce B-factors, while maintaining good bonded geometry. After parameterization, violations of experimental Calpha-Calpha distances and Calpha-Calpha-Calpha pseudo angles along the backbone are reduced to less than 1%. Simultaneously, the average B-factor correlation coefficient improves to R = 0.77. Two applications illustrate the potential of the approach. (1) 102,051 protein backbones spanning a conformational space of 15 A root mean square deviation were generated from 148 nonredundant proteins in the PDB database, and all-atom models with minimal bonded and nonbonded violations were produced from this ensemble of backbone structures using the SCWRL side chain building program. (2) Improved backbone templates for homology modeling. Fifteen query sequences were each modeled on two targets. For each of the 30 target frameworks, dozens of improved templates could be produced In all cases, improved full atom homology models resulted, of which 50% could be identified blind using the D-Fire statistical potential. (c) 2008 Wiley-Liss, Inc.

  1. Determination of fat, moisture, and protein in meat and meat products by using the FOSS FoodScan Near-Infrared Spectrophotometer with FOSS Artificial Neural Network Calibration Model and Associated Database: collaborative study.

    Science.gov (United States)

    Anderson, Shirley

    2007-01-01

    A collaborative study was conducted to evaluate the repeatability and reproducibility of the FOSS FoodScan near-infrared spectrophotometer with artificial neural network calibration model and database for the determination of fat, moisture, and protein in meat and meat products. Representative samples were homogenized by grinding according to AOAC Official Method 983.18. Approximately 180 g ground sample was placed in a 140 mm round sample dish, and the dish was placed in the FoodScan. The operator ID was entered, the meat product profile within the software was selected, and the scanning process was initiated by pressing the "start" button. Results were displayed for percent (g/100 g) fat, moisture, and protein. Ten blind duplicate samples were sent to 15 collaborators in the United States. The within-laboratory (repeatability) relative standard deviation (RSD(r)) ranged from 0.22 to 2.67% for fat, 0.23 to 0.92% for moisture, and 0.35 to 2.13% for protein. The between-laboratories (reproducibility) relative standard deviation (RSD(R)) ranged from 0.52 to 6.89% for fat, 0.39 to 1.55% for moisture, and 0.54 to 5.23% for protein. The method is recommended for Official First Action.

  2. Database development and management

    CERN Document Server

    Chao, Lee

    2006-01-01

    Introduction to Database Systems Functions of a DatabaseDatabase Management SystemDatabase ComponentsDatabase Development ProcessConceptual Design and Data Modeling Introduction to Database Design Process Understanding Business ProcessEntity-Relationship Data Model Representing Business Process with Entity-RelationshipModelTable Structure and NormalizationIntroduction to TablesTable NormalizationTransforming Data Models to Relational Databases .DBMS Selection Transforming Data Models to Relational DatabasesEnforcing ConstraintsCreating Database for Business ProcessPhysical Design and Database

  3. DisoMCS: Accurately Predicting Protein Intrinsically Disordered Regions Using a Multi-Class Conservative Score Approach.

    Directory of Open Access Journals (Sweden)

    Zhiheng Wang

    Full Text Available The precise prediction of protein intrinsically disordered regions, which play a crucial role in biological procedures, is a necessary prerequisite to further the understanding of the principles and mechanisms of protein function. Here, we propose a novel predictor, DisoMCS, which is a more accurate predictor of protein intrinsically disordered regions. The DisoMCS bases on an original multi-class conservative score (MCS obtained by sequence-order/disorder alignment. Initially, near-disorder regions are defined on fragments located at both the terminus of an ordered region connecting a disordered region. Then the multi-class conservative score is generated by sequence alignment against a known structure database and represented as order, near-disorder and disorder conservative scores. The MCS of each amino acid has three elements: order, near-disorder and disorder profiles. Finally, the MCS is exploited as features to identify disordered regions in sequences. DisoMCS utilizes a non-redundant data set as the training set, MCS and predicted secondary structure as features, and a conditional random field as the classification algorithm. In predicted near-disorder regions a residue is determined as an order or a disorder according to the optimized decision threshold. DisoMCS was evaluated by cross-validation, large-scale prediction, independent tests and CASP (Critical Assessment of Techniques for Protein Structure Prediction tests. All results confirmed that DisoMCS was very competitive in terms of accuracy of prediction when compared with well-established publicly available disordered region predictors. It also indicated our approach was more accurate when a query has higher homologous with the knowledge database.The DisoMCS is available at http://cal.tongji.edu.cn/disorder/.

  4. ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins

    Directory of Open Access Journals (Sweden)

    Raghava Gajendra PS

    2008-11-01

    Full Text Available Abstract Background The expansion of raw protein sequence databases in the post genomic era and availability of fresh annotated sequences for major localizations particularly motivated us to introduce a new improved version of our previously forged eukaryotic subcellular localizations prediction method namely "ESLpred". Since, subcellular localization of a protein offers essential clues about its functioning, hence, availability of localization predictor would definitely aid and expedite the protein deciphering studies. However, robustness of a predictor is highly dependent on the superiority of dataset and extracted protein attributes; hence, it becomes imperative to improve the performance of presently available method using latest dataset and crucial input features. Results Here, we describe augmentation in the prediction performance obtained for our most popular ESLpred method using new crucial features as an input to Support Vector Machine (SVM. In addition, recently available, highly non-redundant dataset encompassing three kingdoms specific protein sequence sets; 1198 fungi sequences, 2597 from animal and 491 plant sequences were also included in the present study. First, using the evolutionary information in the form of profile composition along with whole and N-terminal sequence composition as an input feature vector of 440 dimensions, overall accuracies of 72.7, 75.8 and 74.5% were achieved respectively after five-fold cross-validation. Further, enhancement in performance was observed when similarity search based results were coupled with whole and N-terminal sequence composition along with profile composition by yielding overall accuracies of 75.9, 80.8, 76.6% respectively; best accuracies reported till date on the same datasets. Conclusion These results provide confidence about the reliability and accurate prediction of SVM modules generated in the present study using sequence and profile compositions along with similarity search

  5. Evaluating the Immunogenicity of Protein Drugs by Applying In Vitro MHC Binding Data and the Immune Epitope Database and Analysis Resource

    Directory of Open Access Journals (Sweden)

    Sinu Paul

    2013-01-01

    Full Text Available The immune system has evolved to become highly specialized in recognizing and responding to pathogens and foreign molecules. Specifically, the function of HLA class II is to ensure that a sufficient sample of peptides derived from foreign molecules is presented to T cells. This leads to an important concern in human drug development as the possible immunogenicity of biopharmaceuticals, especially those intended for chronic administration, can lead to reduced efficacy and an undesired safety profile for biological therapeutics. As part of this review, we will highlight the molecular basis of antigen presentation as a key step in the induction of T cell responses, emphasizing the events associated with peptide binding to polymorphic and polygenic HLA class II molecules. We will further review methodologies that predict HLA class II binding peptides and candidate epitopes. We will focus on tools provided by the Immune Epitope Database and Analysis Resource, discussing the basic features of different prediction methods, the objective evaluation of prediction quality, and general guidelines for practical use of these tools. Finally the use, advantages, and limitations of the methodology will be demonstrated in a review of two previous studies investigating the immunogenicity of erythropoietin and timothy grass pollen.

  6. Mathematics for Databases

    NARCIS (Netherlands)

    ir. Sander van Laar

    2007-01-01

    A formal description of a database consists of the description of the relations (tables) of the database together with the constraints that must hold on the database. Furthermore the contents of a database can be retrieved using queries. These constraints and queries for databases can very well be

  7. Databases and their application

    NARCIS (Netherlands)

    Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.

    2013-01-01

    During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The

  8. DOT Online Database

    Science.gov (United States)

    Page Home Table of Contents Contents Search Database Search Login Login Databases Advisory Circulars accessed by clicking below: Full-Text WebSearch Databases Database Records Date Advisory Circulars 2092 5 data collection and distribution policies. Document Database Website provided by MicroSearch

  9. Identification of group specific motifs in Beta-lactamase family of proteins

    Directory of Open Access Journals (Sweden)

    Saxena Akansha

    2009-12-01

    Full Text Available Abstract Background Beta-lactamases are one of the most serious threats to public health. In order to combat this threat we need to study the molecular and functional diversity of these enzymes and identify signatures specific to these enzymes. These signatures will enable us to develop inhibitors and diagnostic probes specific to lactamases. The existing classification of beta-lactamases was developed nearly 30 years ago when few lactamases were available. DLact database contain more than 2000 beta-lactamase, which can be used to study the molecular diversity and to identify signatures specific to this family. Methods A set of 2020 beta-lactamase proteins available in the DLact database http://59.160.102.202/DLact were classified using graph-based clustering of Best Bi-Directional Hits. Non-redundant (> 90 percent identical protein sequences from each group were aligned using T-Coffee and annotated using information available in literature. Motifs specific to each group were predicted using PRATT program. Results The graph-based classification of beta-lactamase proteins resulted in the formation of six groups (Four major groups containing 191, 726, 774 and 73 proteins while two minor groups containing 50 and 8 proteins. Based on the information available in literature, we found that each of the four major groups correspond to the four classes proposed by Ambler. The two minor groups were novel and do not contain molecular signatures of beta-lactamase proteins reported in literature. The group-specific motifs showed high sensitivity (> 70% and very high specificity (> 90%. The motifs from three groups (corresponding to class A, C and D had a high level of conservation at DNA as well as protein level whereas the motifs from the fourth group (corresponding to class B showed conservation at only protein level. Conclusion The graph-based classification of beta-lactamase proteins corresponds with the classification proposed by Ambler, thus there is

  10. Generation and analysis of a large-scale expressed sequence Tag database from a full-length enriched cDNA library of developing leaves of Gossypium hirsutum L.

    Directory of Open Access Journals (Sweden)

    Min Lin

    Full Text Available BACKGROUND: Cotton (Gossypium hirsutum L. is one of the world's most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. METHODOLOGY/PRINCIPAL FINDINGS: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representative 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR, which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes express express preferentially in senescent leaves. CONCLUSIONS/SIGNIFICANCE: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence

  11. Dietary Supplement Ingredient Database

    Science.gov (United States)

    ... and US Department of Agriculture Dietary Supplement Ingredient Database Toggle navigation Menu Home About DSID Mission Current ... values can be saved to build a small database or add to an existing database for national, ...

  12. Energy Consumption Database

    Science.gov (United States)

    Consumption Database The California Energy Commission has created this on-line database for informal reporting ) classifications. The database also provides easy downloading of energy consumption data into Microsoft Excel (XLSX

  13. YMDB: the Yeast Metabolome Database

    Science.gov (United States)

    Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.

    2012-01-01

    The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cervesiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855

  14. The NCBI BioSystems database.

    Science.gov (United States)

    Geer, Lewis Y; Marchler-Bauer, Aron; Geer, Renata C; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H

    2010-01-01

    The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI's Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets.

  15. Collecting Taxes Database

    Data.gov (United States)

    US Agency for International Development — The Collecting Taxes Database contains performance and structural indicators about national tax systems. The database contains quantitative revenue performance...

  16. USAID Anticorruption Projects Database

    Data.gov (United States)

    US Agency for International Development — The Anticorruption Projects Database (Database) includes information about USAID projects with anticorruption interventions implemented worldwide between 2007 and...

  17. NoSQL databases

    OpenAIRE

    Mrozek, Jakub

    2012-01-01

    This thesis deals with database systems referred to as NoSQL databases. In the second chapter, I explain basic terms and the theory of database systems. A short explanation is dedicated to database systems based on the relational data model and the SQL standardized query language. Chapter Three explains the concept and history of the NoSQL databases, and also presents database models, major features and the use of NoSQL databases in comparison with traditional database systems. In the fourth ...

  18. dBBQs: dataBase of Bacterial Quality scores.

    Science.gov (United States)

    Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

    2017-12-28

    It is well-known that genome sequencing technologies are becoming significantly cheaper and faster. As a result of this, the exponential growth in sequencing data in public databases allows us to explore ever growing large collections of genome sequences. However, it is less known that the majority of available sequenced genome sequences in public databases are not complete, drafts of varying qualities. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated by four different measurements: assembly quality, number of rRNA and tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features which the result can be used for further analysis. In addition, the search results are shown in interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases find that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web-interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome, which can potentially be used for further analysis. dBBQs will be updated regularly and is freely use for non-commercial purpose.

  19. PrimateLit Database

    Science.gov (United States)

    Primate Info Net Related Databases NCRR PrimateLit: A bibliographic database for primatology Top of any problems with this service. We welcome your feedback. The PrimateLit database is no longer being Resources, National Institutes of Health. The database is a collaborative project of the Wisconsin Primate

  20. KALIMER database development

    Energy Technology Data Exchange (ETDEWEB)

    Jeong, Kwan Seong; Lee, Yong Bum; Jeong, Hae Yong; Ha, Kwi Seok

    2003-03-01

    KALIMER database is an advanced database to utilize the integration management for liquid metal reactor design technology development using Web applications. KALIMER design database is composed of results database, Inter-Office Communication (IOC), 3D CAD database, and reserved documents database. Results database is a research results database during all phase for liquid metal reactor design technology development of mid-term and long-term nuclear R and D. IOC is a linkage control system inter sub project to share and integrate the research results for KALIMER. 3D CAD database is a schematic overview for KALIMER design structure. And reserved documents database is developed to manage several documents and reports since project accomplishment.

  1. KALIMER database development

    International Nuclear Information System (INIS)

    Jeong, Kwan Seong; Lee, Yong Bum; Jeong, Hae Yong; Ha, Kwi Seok

    2003-03-01

    KALIMER database is an advanced database to utilize the integration management for liquid metal reactor design technology development using Web applications. KALIMER design database is composed of results database, Inter-Office Communication (IOC), 3D CAD database, and reserved documents database. Results database is a research results database during all phase for liquid metal reactor design technology development of mid-term and long-term nuclear R and D. IOC is a linkage control system inter sub project to share and integrate the research results for KALIMER. 3D CAD database is a schematic overview for KALIMER design structure. And reserved documents database is developed to manage several documents and reports since project accomplishment

  2. A novel fractal approach for predicting G-protein-coupled receptors and their subfamilies with support vector machines.

    Science.gov (United States)

    Nie, Guoping; Li, Yong; Wang, Feichi; Wang, Siwen; Hu, Xuehai

    2015-01-01

    G-protein-coupled receptors (GPCRs) are seven membrane-spanning proteins and regulate many important physiological processes, such as vision, neurotransmission, immune response and so on. GPCRs-related pathways are the targets of a large number of marketed drugs. Therefore, the design of a reliable computational model for predicting GPCRs from amino acid sequence has long been a significant biomedical problem. Chaos game representation (CGR) reveals the fractal patterns hidden in protein sequences, and then fractal dimension (FD) is an important feature of these highly irregular geometries with concise mathematical expression. Here, in order to extract important features from GPCR protein sequences, CGR algorithm, fractal dimension and amino acid composition (AAC) are employed to formulate the numerical features of protein samples. Four groups of features are considered, and each group is evaluated by support vector machine (SVM) and 10-fold cross-validation test. To test the performance of the present method, a new non-redundant dataset was built based on latest GPCRDB database. Comparing the results of numerical experiments, the group of combined features with AAC and FD gets the best result, the accuracy is 99.22% and Matthew's correlation coefficient (MCC) is 0.9845 for identifying GPCRs from non-GPCRs. Moreover, if it is classified as a GPCR, it will be further put into the second level, which will classify a GPCR into one of the five main subfamilies. At this level, the group of combined features with AAC and FD also gets best accuracy 85.73%. Finally, the proposed predictor is also compared with existing methods and shows better performances.

  3. Logical database design principles

    CERN Document Server

    Garmany, John; Clark, Terry

    2005-01-01

    INTRODUCTION TO LOGICAL DATABASE DESIGNUnderstanding a Database Database Architectures Relational Databases Creating the Database System Development Life Cycle (SDLC)Systems Planning: Assessment and Feasibility System Analysis: RequirementsSystem Analysis: Requirements Checklist Models Tracking and Schedules Design Modeling Functional Decomposition DiagramData Flow Diagrams Data Dictionary Logical Structures and Decision Trees System Design: LogicalSYSTEM DESIGN AND IMPLEMENTATION The ER ApproachEntities and Entity Types Attribute Domains AttributesSet-Valued AttributesWeak Entities Constraint

  4. An Interoperable Cartographic Database

    OpenAIRE

    Slobodanka Ključanin; Zdravko Galić

    2007-01-01

    The concept of producing a prototype of interoperable cartographic database is explored in this paper, including the possibilities of integration of different geospatial data into the database management system and their visualization on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relation database, spatial analysis, definition and visualization of the database content in the form of a map on t...

  5. Software listing: CHEMTOX database

    International Nuclear Information System (INIS)

    Moskowitz, P.D.

    1993-01-01

    Initially launched in 1983, the CHEMTOX Database was among the first microcomputer databases containing hazardous chemical information. The database is used in many industries and government agencies in more than 17 countries. Updated quarterly, the CHEMTOX Database provides detailed environmental and safety information on 7500-plus hazardous substances covered by dozens of regulatory and advisory sources. This brief listing describes the method of accessing data and provides ordering information for those wishing to obtain the CHEMTOX Database

  6. Database Search Engines: Paradigms, Challenges and Solutions.

    Science.gov (United States)

    Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

  7. 2P2IHUNTER: a tool for filtering orthosteric protein–protein interaction modulators via a dedicated support vector machine

    Science.gov (United States)

    Hamon, Véronique; Bourgeas, Raphael; Ducrot, Pierre; Theret, Isabelle; Xuereb, Laura; Basse, Marie Jeanne; Brunel, Jean Michel; Combes, Sebastien; Morelli, Xavier; Roche, Philippe

    2014-01-01

    Over the last 10 years, protein–protein interactions (PPIs) have shown increasing potential as new therapeutic targets. As a consequence, PPIs are today the most screened target class in high-throughput screening (HTS). The development of broad chemical libraries dedicated to these particular targets is essential; however, the chemical space associated with this ‘high-hanging fruit’ is still under debate. Here, we analyse the properties of 40 non-redundant small molecules present in the 2P2I database (http://2p2idb.cnrs-mrs.fr/) to define a general profile of orthosteric inhibitors and propose an original protocol to filter general screening libraries using a support vector machine (SVM) with 11 standard Dragon molecular descriptors. The filtering protocol has been validated using external datasets from PubChem BioAssay and results from in-house screening campaigns. This external blind validation demonstrated the ability of the SVM model to reduce the size of the filtered chemical library by eliminating up to 96% of the compounds as well as enhancing the proportion of active compounds by up to a factor of 8. We believe that the resulting chemical space identified in this paper will provide the scientific community with a concrete support to search for PPI inhibitors during HTS campaigns. PMID:24196694

  8. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins.

    Science.gov (United States)

    Ruiz-Blanco, Yasser B; Paz, Waldo; Green, James; Marrero-Ponce, Yovani

    2015-05-16

    The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution. We here introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principle component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information. ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language and is freely available at: http://bioinf.sce.carleton.ca/ProtDCal/ . ProtDCal introduces local and group-based encoding which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the features-based characterization of polypeptide systems. This

  9. Knockout of the PKN Family of Rho Effector Kinases Reveals a Non-redundant Role for PKN2 in Developmental Mesoderm Expansion

    Directory of Open Access Journals (Sweden)

    Ivan Quétier

    2016-01-01

    Full Text Available In animals, the protein kinase C (PKC family has expanded into diversely regulated subgroups, including the Rho family-responsive PKN kinases. Here, we describe knockouts of all three mouse PKN isoforms and reveal that PKN2 loss results in lethality at embryonic day 10 (E10, with associated cardiovascular and morphogenetic defects. The cardiovascular phenotype was not recapitulated by conditional deletion of PKN2 in endothelial cells or the developing heart. In contrast, inducible systemic deletion of PKN2 after E7 provoked collapse of the embryonic mesoderm. Furthermore, mouse embryonic fibroblasts, which arise from the embryonic mesoderm, depend on PKN2 for proliferation and motility. These cellular defects are reflected in vivo as dependence on PKN2 for mesoderm proliferation and neural crest migration. We conclude that failure of the mesoderm to expand in the absence of PKN2 compromises cardiovascular integrity and development, resulting in lethality.

  10. Directory of IAEA databases

    International Nuclear Information System (INIS)

    1991-11-01

    The first edition of the Directory of IAEA Databases is intended to describe the computerized information sources available to IAEA staff members. It contains a listing of all databases produced at the IAEA, together with information on their availability

  11. Native Health Research Database

    Science.gov (United States)

    ... Indian Health Board) Welcome to the Native Health Database. Please enter your search terms. Basic Search Advanced ... To learn more about searching the Native Health Database, click here. Tutorial Video The NHD has made ...

  12. Cell Centred Database (CCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Cell Centered Database (CCDB) is a web accessible database for high resolution 2D, 3D and 4D data from light and electron microscopy, including correlated imaging.

  13. E3 Staff Database

    Data.gov (United States)

    US Agency for International Development — E3 Staff database is maintained by E3 PDMS (Professional Development & Management Services) office. The database is Mysql. It is manually updated by E3 staff as...

  14. The ABC (Analysing Biomolecular Contacts-database

    Directory of Open Access Journals (Sweden)

    Walter Peter

    2007-03-01

    Full Text Available As protein-protein interactions are one of the basic mechanisms in most cellular processes, it is desirable to understand the molecular details of protein-protein contacts and ultimately be able to predict which proteins interact. Interface areas on a protein surface that are involved in protein interactions exhibit certain characteristics. Therefore, several attempts were made to distinguish protein interactions from each other and to categorize them. One way of classification are the groups of transient and permanent interactions. Previously two of the authors analysed several properties for transient complexes such as the amino acid and secondary structure element composition and pairing preferences. Certainly, interfaces can be characterized by many more possible attributes and this is a subject of intense ongoing research. Although several freely available online databases exist that illuminate various aspects of protein-protein interactions, we decided to construct a new database collecting all desired interface features allowing for facile selection of subsets of complexes. As database-server we applied MySQL and the program logic was written in JAVA. Furthermore several class extensions and tools such as JMOL were included to visualize the interfaces and JfreeChart for the representation of diagrams and statistics. The contact data is automatically generated from standard PDB files by a tcl/tk-script running through the molecular visualization package VMD. Currently the database contains 536 interfaces extracted from 479 PDB files and it can be queried by various types of parameters. Here, we describe the database design and demonstrate its usefulness with a number of selected features.

  15. NIRS database of the original research database

    International Nuclear Information System (INIS)

    Morita, Kyoko

    1991-01-01

    Recently, library staffs arranged and compiled the original research papers that have been written by researchers for 33 years since National Institute of Radiological Sciences (NIRS) established. This papers describes how the internal database of original research papers has been created. This is a small sample of hand-made database. This has been cumulating by staffs who have any knowledge about computer machine or computer programming. (author)

  16. Scopus database: a review.

    Science.gov (United States)

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but complements each other. If a library can only afford one, choice must be based in institutional needs.

  17. Aviation Safety Issues Database

    Science.gov (United States)

    Morello, Samuel A.; Ricks, Wendell R.

    2009-01-01

    The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis to making the database publically accessible and writing this report.

  18. Database Description - TMBETA-GENOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ENOME is a database for transmembrane β-barrel proteins in complete genomes. For each genome, calculations with machine learning algo...rithms and statistical methods have been perfumed and th

  19. Automated Oracle database testing

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    Ensuring database stability and steady performance in the modern world of agile computing is a major challenge. Various changes happening at any level of the computing infrastructure: OS parameters & packages, kernel versions, database parameters & patches, or even schema changes, all can potentially harm production services. This presentation shows how an automatic and regular testing of Oracle databases can be achieved in such agile environment.

  20. Inleiding database-systemen

    NARCIS (Netherlands)

    Pels, H.J.; Lans, van der R.F.; Pels, H.J.; Meersman, R.A.

    1993-01-01

    Dit artikel introduceert de voornaamste begrippen die een rol spelen rond databases en het geeft een overzicht van de doelstellingen, de functies en de componenten van database-systemen. Hoewel de functie van een database intuitief vrij duidelijk is, is het toch een in technologisch opzicht complex

  1. Database Description - RMOS | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name RMOS Alternative nam...arch Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Microarray Data and other Gene Expression Database...s Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The Ric...19&lang=en Whole data download - Referenced database Rice Expression Database (RED) Rice full-length cDNA Database... (KOME) Rice Genome Integrated Map Database (INE) Rice Mutant Panel Database (Tos17) Rice Genome Annotation Database

  2. An Interoperable Cartographic Database

    Directory of Open Access Journals (Sweden)

    Slobodanka Ključanin

    2007-05-01

    Full Text Available The concept of producing a prototype of interoperable cartographic database is explored in this paper, including the possibilities of integration of different geospatial data into the database management system and their visualization on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relation database, spatial analysis, definition and visualization of the database content in the form of a map on the Internet. 

  3. Keyword Search in Databases

    CERN Document Server

    Yu, Jeffrey Xu; Chang, Lijun

    2009-01-01

    It has become highly desirable to provide users with flexible ways to query/search information over databases as simple as keyword search like Google search. This book surveys the recent developments on keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. Such structural information to be returned can be either trees or subgraphs representing how the objects, that contain the required keywords, are interconnected in a relational database or in an XML database. The structural keyword search is completely different from

  4. Nuclear power economic database

    International Nuclear Information System (INIS)

    Ding Xiaoming; Li Lin; Zhao Shiping

    1996-01-01

    Nuclear power economic database (NPEDB), based on ORACLE V6.0, consists of three parts, i.e., economic data base of nuclear power station, economic data base of nuclear fuel cycle and economic database of nuclear power planning and nuclear environment. Economic database of nuclear power station includes data of general economics, technique, capital cost and benefit, etc. Economic database of nuclear fuel cycle includes data of technique and nuclear fuel price. Economic database of nuclear power planning and nuclear environment includes data of energy history, forecast, energy balance, electric power and energy facilities

  5. Database Description - RPD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ase Description General information of database Database name RPD Alternative name Rice Proteome Database...titute of Crop Science, National Agriculture and Food Research Organization Setsuko Komatsu E-mail: Database... classification Proteomics Resources Plant databases - Rice Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database... description Rice Proteome Database contains information on protei...and entered in the Rice Proteome Database. The database is searchable by keyword,

  6. Database Description - JSNP | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name JSNP Alternative nam...n Science and Technology Agency Creator Affiliation: Contact address E-mail : Database...sapiens Taxonomy ID: 9606 Database description A database of about 197,000 polymorphisms in Japanese populat...1):605-610 External Links: Original website information Database maintenance site Institute of Medical Scien...er registration Not available About This Database Database Description Download License Update History of This Database

  7. Database Description - ASTRA | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available abase Description General information of database Database name ASTRA Alternative n...tics Journal Search: Contact address Database classification Nucleotide Sequence Databases - Gene structure,...3702 Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The database represents classified p...(10):1211-6. External Links: Original website information Database maintenance site National Institute of Ad... for user registration Not available About This Database Database Description Dow

  8. Database Description - RED | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ase Description General information of database Database name RED Alternative name Rice Expression Database...enome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Database classifi...cation Microarray, Gene Expression Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database descripti... Article title: Rice Expression Database: the gateway to rice functional genomics...nt Science (2002) Dec 7 (12):563-564 External Links: Original website information Database maintenance site

  9. Database Description - PLACE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available abase Description General information of database Database name PLACE Alternative name A Database...Kannondai, Tsukuba, Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Databas...e classification Plant databases Organism Taxonomy Name: Tracheophyta Taxonomy ID: 58023 Database...99, Vol.27, No.1 :297-300 External Links: Original website information Database maintenance site National In...- Need for user registration Not available About This Database Database Descripti

  10. Database Description - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Arabidopsis Phenome Database Database Description General information of database Database n... BioResource Center Hiroshi Masuya Database classification Plant databases - Arabidopsis thaliana Organism T...axonomy Name: Arabidopsis thaliana Taxonomy ID: 3702 Database description The Arabidopsis thaliana phenome i...heir effective application. We developed the new Arabidopsis Phenome Database integrating two novel database...seful materials for their experimental research. The other, the “Database of Curated Plant Phenome” focusing

  11. Database activities at Brookhaven National Laboratory

    International Nuclear Information System (INIS)

    Trahern, C.G.

    1995-01-01

    Brookhaven National Laboratory is a multi-disciplinary lab in the DOE system of research laboratories. Database activities are correspondingly diverse within the restrictions imposed by the dominant relational database paradigm. The authors discuss related activities and tools used in RHIC and in the other major projects at BNL. The others are the Protein Data Bank being maintained by the Chemistry department, and a Geographical Information System (GIS)--a Superfund sponsored environmental monitoring project under development in the Office of Environmental Restoration

  12. Construction of a medicinal leech transcriptome database and its application to the identification of leech homologs of neural and innate immune genes

    Directory of Open Access Journals (Sweden)

    Wincker Patrick

    2010-06-01

    Full Text Available Abstract Background The medicinal leech, Hirudo medicinalis, is an important model system for the study of nervous system structure, function, development, regeneration and repair. It is also a unique species in being presently approved for use in medical procedures, such as clearing of pooled blood following certain surgical procedures. It is a current, and potentially also future, source of medically useful molecular factors, such as anticoagulants and antibacterial peptides, which may have evolved as a result of its parasitizing large mammals, including humans. Despite the broad focus of research on this system, little has been done at the genomic or transcriptomic levels and there is a paucity of openly available sequence data. To begin to address this problem, we constructed whole embryo and adult central nervous system (CNS EST libraries and created a clustered sequence database of the Hirudo transcriptome that is available to the scientific community. Results A total of ~133,000 EST clones from two directionally-cloned cDNA libraries, one constructed from mRNA derived from whole embryos at several developmental stages and the other from adult CNS cords, were sequenced in one or both directions by three different groups: Genoscope (French National Sequencing Center, the University of Iowa Sequencing Facility and the DOE Joint Genome Institute. These were assembled using the phrap software package into 31,232 unique contigs and singletons, with an average length of 827 nt. The assembled transcripts were then translated in all six frames and compared to proteins in NCBI's non-redundant (NR and to the Gene Ontology (GO protein sequence databases, resulting in 15,565 matches to 11,236 proteins in NR and 13,935 matches to 8,073 proteins in GO. Searching the database for transcripts of genes homologous to those thought to be involved in the innate immune responses of vertebrates and other invertebrates yielded a set of nearly one hundred

  13. Construction of a medicinal leech transcriptome database and its application to the identification of leech homologs of neural and innate immune genes.

    Science.gov (United States)

    Macagno, Eduardo R; Gaasterland, Terry; Edsall, Lee; Bafna, Vineet; Soares, Marcelo B; Scheetz, Todd; Casavant, Thomas; Da Silva, Corinne; Wincker, Patrick; Tasiemski, Aurélie; Salzet, Michel

    2010-06-25

    The medicinal leech, Hirudo medicinalis, is an important model system for the study of nervous system structure, function, development, regeneration and repair. It is also a unique species in being presently approved for use in medical procedures, such as clearing of pooled blood following certain surgical procedures. It is a current, and potentially also future, source of medically useful molecular factors, such as anticoagulants and antibacterial peptides, which may have evolved as a result of its parasitizing large mammals, including humans. Despite the broad focus of research on this system, little has been done at the genomic or transcriptomic levels and there is a paucity of openly available sequence data. To begin to address this problem, we constructed whole embryo and adult central nervous system (CNS) EST libraries and created a clustered sequence database of the Hirudo transcriptome that is available to the scientific community. A total of approximately 133,000 EST clones from two directionally-cloned cDNA libraries, one constructed from mRNA derived from whole embryos at several developmental stages and the other from adult CNS cords, were sequenced in one or both directions by three different groups: Genoscope (French National Sequencing Center), the University of Iowa Sequencing Facility and the DOE Joint Genome Institute. These were assembled using the phrap software package into 31,232 unique contigs and singletons, with an average length of 827 nt. The assembled transcripts were then translated in all six frames and compared to proteins in NCBI's non-redundant (NR) and to the Gene Ontology (GO) protein sequence databases, resulting in 15,565 matches to 11,236 proteins in NR and 13,935 matches to 8,073 proteins in GO. Searching the database for transcripts of genes homologous to those thought to be involved in the innate immune responses of vertebrates and other invertebrates yielded a set of nearly one hundred evolutionarily conserved sequences

  14. Hazard Analysis Database Report

    CERN Document Server

    Grams, W H

    2000-01-01

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U S . Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3 ,4 , and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from t...

  15. Database Optimizing Services

    Directory of Open Access Journals (Sweden)

    Adrian GHENCEA

    2010-12-01

    Full Text Available Almost every organization has at its centre a database. The database provides support for conducting different activities, whether it is production, sales and marketing or internal operations. Every day, a database is accessed for help in strategic decisions. The satisfaction therefore of such needs is entailed with a high quality security and availability. Those needs can be realised using a DBMS (Database Management System which is, in fact, software for a database. Technically speaking, it is software which uses a standard method of cataloguing, recovery, and running different data queries. DBMS manages the input data, organizes it, and provides ways of modifying or extracting the data by its users or other programs. Managing the database is an operation that requires periodical updates, optimizing and monitoring.

  16. National Database of Geriatrics

    DEFF Research Database (Denmark)

    Kannegaard, Pia Nimann; Vinding, Kirsten L; Hare-Bruun, Helle

    2016-01-01

    AIM OF DATABASE: The aim of the National Database of Geriatrics is to monitor the quality of interdisciplinary diagnostics and treatment of patients admitted to a geriatric hospital unit. STUDY POPULATION: The database population consists of patients who were admitted to a geriatric hospital unit....... Geriatric patients cannot be defined by specific diagnoses. A geriatric patient is typically a frail multimorbid elderly patient with decreasing functional ability and social challenges. The database includes 14-15,000 admissions per year, and the database completeness has been stable at 90% during the past......, percentage of discharges with a rehabilitation plan, and the part of cases where an interdisciplinary conference has taken place. Data are recorded by doctors, nurses, and therapists in a database and linked to the Danish National Patient Register. DESCRIPTIVE DATA: Descriptive patient-related data include...

  17. PAMDB: a comprehensive Pseudomonas aeruginosa metabolome database.

    Science.gov (United States)

    Huang, Weiliang; Brewer, Luke K; Jones, Jace W; Nguyen, Angela T; Marcu, Ana; Wishart, David S; Oglesby-Sherrouse, Amanda G; Kane, Maureen A; Wilks, Angela

    2018-01-04

    The Pseudomonas aeruginosaMetabolome Database (PAMDB, http://pseudomonas.umaryland.edu) is a searchable, richly annotated metabolite database specific to P. aeruginosa. P. aeruginosa is a soil organism and significant opportunistic pathogen that adapts to its environment through a versatile energy metabolism network. Furthermore, P. aeruginosa is a model organism for the study of biofilm formation, quorum sensing, and bioremediation processes, each of which are dependent on unique pathways and metabolites. The PAMDB is modelled on the Escherichia coli (ECMDB), yeast (YMDB) and human (HMDB) metabolome databases and contains >4370 metabolites and 938 pathways with links to over 1260 genes and proteins. The database information was compiled from electronic databases, journal articles and mass spectrometry (MS) metabolomic data obtained in our laboratories. For each metabolite entered, we provide detailed compound descriptions, names and synonyms, structural and physiochemical information, nuclear magnetic resonance (NMR) and MS spectra, enzymes and pathway information, as well as gene and protein sequences. The database allows extensive searching via chemical names, structure and molecular weight, together with gene, protein and pathway relationships. The PAMBD and its future iterations will provide a valuable resource to biologists, natural product chemists and clinicians in identifying active compounds, potential biomarkers and clinical diagnostics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Recognition of HIV-1 peptides by host CTL is related to HIV-1 similarity to human proteins.

    Directory of Open Access Journals (Sweden)

    Morgane Rolland

    Full Text Available BACKGROUND: While human immunodeficiency virus type 1 (HIV-1-specific cytotoxic T lymphocytes preferentially target specific regions of the viral proteome, HIV-1 features that contribute to immune recognition are not well understood. One hypothesis is that similarities between HIV and human proteins influence the host immune response, i.e., resemblance between viral and host peptides could preclude reactivity against certain HIV epitopes. METHODOLOGY/PRINCIPAL FINDINGS: We analyzed the extent of similarity between HIV-1 and the human proteome. Proteins from the HIV-1 B consensus sequence from 2001 were dissected into overlapping k-mers, which were then probed against a non-redundant database of the human proteome in order to identify segments of high similarity. We tested the relationship between HIV-1 similarity to host encoded peptides and immune recognition in HIV-infected individuals, and found that HIV immunogenicity could be partially modulated by the sequence similarity to the host proteome. ELISpot responses to peptides spanning the entire viral proteome evaluated in 314 individuals showed a trend indicating an inverse relationship between the similarity to the host proteome and the frequency of recognition. In addition, analysis of responses by a group of 30 HIV-infected individuals against 944 overlapping peptides representing a broad range of individual HIV-1B Nef variants, affirmed that the degree of similarity to the host was significantly lower for peptides with reactive epitopes than for those that were not recognized. CONCLUSIONS/SIGNIFICANCE: Our results suggest that antigenic motifs that are scarcely represented in human proteins might represent more immunogenic CTL targets not selected against in the host. This observation could provide guidance in the design of more effective HIV immunogens, as sequences devoid of host-like features might afford superior immune reactivity.

  19. Tradeoffs in distributed databases

    OpenAIRE

    Juntunen, R. (Risto)

    2016-01-01

    Abstract In a distributed database data is spread throughout the network into separated nodes with different DBMS systems (Date, 2000). According to CAP-theorem three database properties — consistency, availability and partition tolerance cannot be achieved simultaneously in distributed database systems. Two of these properties can be achieved but not all three at the same time (Brewer, 2000). Since this theorem there has b...

  20. Specialist Bibliographic Databases

    OpenAIRE

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A.; Trukhachev, Vladimir I.; Kostyukova, Elena I.; Gerasimov, Alexey N.; Kitas, George D.

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and d...

  1. Supply Chain Initiatives Database

    Energy Technology Data Exchange (ETDEWEB)

    None

    2012-11-01

    The Supply Chain Initiatives Database (SCID) presents innovative approaches to engaging industrial suppliers in efforts to save energy, increase productivity and improve environmental performance. This comprehensive and freely-accessible database was developed by the Institute for Industrial Productivity (IIP). IIP acknowledges Ecofys for their valuable contributions. The database contains case studies searchable according to the types of activities buyers are undertaking to motivate suppliers, target sector, organization leading the initiative, and program or partnership linkages.

  2. Intermodal Passenger Connectivity Database -

    Data.gov (United States)

    Department of Transportation — The Intermodal Passenger Connectivity Database (IPCD) is a nationwide data table of passenger transportation terminals, with data on the availability of connections...

  3. Residency Allocation Database

    Data.gov (United States)

    Department of Veterans Affairs — The Residency Allocation Database is used to determine allocation of funds for residency programs offered by Veterans Affairs Medical Centers (VAMCs). Information...

  4. Smart Location Database - Service

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Smart Location Database (SLD) summarizes over 80 demographic, built environment, transit service, and destination accessibility attributes for every census block...

  5. Database principles programming performance

    CERN Document Server

    O'Neil, Patrick

    2014-01-01

    Database: Principles Programming Performance provides an introduction to the fundamental principles of database systems. This book focuses on database programming and the relationships between principles, programming, and performance.Organized into 10 chapters, this book begins with an overview of database design principles and presents a comprehensive introduction to the concepts used by a DBA. This text then provides grounding in many abstract concepts of the relational model. Other chapters introduce SQL, describing its capabilities and covering the statements and functions of the programmi

  6. Veterans Administration Databases

    Science.gov (United States)

    The Veterans Administration Information Resource Center provides database and informatics experts, customer service, expert advice, information products, and web technology to VA researchers and others.

  7. IVR EFP Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This database contains trip-level reports submitted by vessels participating in Exempted Fishery projects with IVR reporting requirements.

  8. Towards Sensor Database Systems

    DEFF Research Database (Denmark)

    Bonnet, Philippe; Gehrke, Johannes; Seshadri, Praveen

    2001-01-01

    . These systems lack flexibility because data is extracted in a predefined way; also, they do not scale to a large number of devices because large volumes of raw data are transferred regardless of the queries that are submitted. In our new concept of sensor database system, queries dictate which data is extracted...... from the sensors. In this paper, we define the concept of sensor databases mixing stored data represented as relations and sensor data represented as time series. Each long-running query formulated over a sensor database defines a persistent view, which is maintained during a given time interval. We...... also describe the design and implementation of the COUGAR sensor database system....

  9. Database Publication Practices

    DEFF Research Database (Denmark)

    Bernstein, P.A.; DeWitt, D.; Heuer, A.

    2005-01-01

    There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems.......There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems....

  10. Smart Location Database - Download

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Smart Location Database (SLD) summarizes over 80 demographic, built environment, transit service, and destination accessibility attributes for every census block...

  11. Database Description - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ase Description General information of database Database name RMG Alternative name ...raki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Database... classification Nucleotide Sequence Databases Organism Taxonomy Name: Oryza sativa Japonica Group Taxonomy ID: 39947 Database...rnal: Mol Genet Genomics (2002) 268: 434–445 External Links: Original website information Database...available URL of Web services - Need for user registration Not available About This Database Database Descri

  12. Database Description - KOME | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name KOME Alternative nam... Sciences Plant Genome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice ...Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description Information about approximately ...Hayashizaki Y, Kikuchi S. Journal: PLoS One. 2007 Nov 28; 2(11):e1235. External Links: Original website information Database...OS) Rice mutant panel database (Tos17) A Database of Plant Cis-acting Regulatory

  13. Update History of This Database - Arabidopsis Phenome Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Arabidopsis Phenome Database Update History of This Database Date Update contents 2017/02/27 Arabidopsis Phenome Data...base English archive site is opened. - Arabidopsis Phenome Database (http://jphenom...e.info/?page_id=95) is opened. About This Database Database Description Download License Update History of This Database... Site Policy | Contact Us Update History of This Database - Arabidopsis Phenome Database | LSDB Archive ...

  14. Update History of This Database - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us SKIP Stemcell Database Update History of This Database Date Update contents 2017/03/13 SKIP Stemcell Database... English archive site is opened. 2013/03/29 SKIP Stemcell Database ( https://www.skip.med.k...eio.ac.jp/SKIPSearch/top?lang=en ) is opened. About This Database Database Description Download License Update History of This Databa...se Site Policy | Contact Us Update History of This Database - SKIP Stemcell Database | LSDB Archive ...

  15. Download - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Download First of all, please read the license of this database. Data ...1.4 KB) Simple search and download Downlaod via FTP FTP server is sometimes jammed. If it is, access [here]. About This Database Data...base Description Download License Update History of This Database Site Policy | Contact Us Download - Trypanosomes Database | LSDB Archive ...

  16. Database design and database administration for a kindergarten

    OpenAIRE

    Vítek, Daniel

    2009-01-01

    The bachelor thesis deals with creation of database design for a standard kindergarten, installation of the designed database into the database system Oracle Database 10g Express Edition and demonstration of the administration tasks in this database system. The verification of the database was proved by a developed access application.

  17. Directory of IAEA databases

    International Nuclear Information System (INIS)

    1992-12-01

    This second edition of the Directory of IAEA Databases has been prepared within the Division of Scientific and Technical Information (NESI). Its main objective is to describe the computerized information sources available to staff members. This directory contains all databases produced at the IAEA, including databases stored on the mainframe, LAN's and PC's. All IAEA Division Directors have been requested to register the existence of their databases with NESI. For the second edition database owners were requested to review the existing entries for their databases and answer four additional questions. The four additional questions concerned the type of database (e.g. Bibliographic, Text, Statistical etc.), the category of database (e.g. Administrative, Nuclear Data etc.), the available documentation and the type of media used for distribution. In the individual entries on the following pages the answers to the first two questions (type and category) is always listed, but the answers to the second two questions (documentation and media) is only listed when information has been made available

  18. HIV Structural Database

    Science.gov (United States)

    SRD 102 HIV Structural Database (Web, free access)   The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.

  19. Balkan Vegetation Database

    NARCIS (Netherlands)

    Vassilev, Kiril; Pedashenko, Hristo; Alexandrova, Alexandra; Tashev, Alexandar; Ganeva, Anna; Gavrilova, Anna; Gradevska, Asya; Assenov, Assen; Vitkova, Antonina; Grigorov, Borislav; Gussev, Chavdar; Filipova, Eva; Aneva, Ina; Knollová, Ilona; Nikolov, Ivaylo; Georgiev, Georgi; Gogushev, Georgi; Tinchev, Georgi; Pachedjieva, Kalina; Koev, Koycho; Lyubenova, Mariyana; Dimitrov, Marius; Apostolova-Stoyanova, Nadezhda; Velev, Nikolay; Zhelev, Petar; Glogov, Plamen; Natcheva, Rayna; Tzonev, Rossen; Boch, Steffen; Hennekens, Stephan M.; Georgiev, Stoyan; Stoyanov, Stoyan; Karakiev, Todor; Kalníková, Veronika; Shivarov, Veselin; Russakova, Veska; Vulchev, Vladimir

    2016-01-01

    The Balkan Vegetation Database (BVD; GIVD ID: EU-00-019; http://www.givd.info/ID/EU-00- 019) is a regional database that consists of phytosociological relevés from different vegetation types from six countries on the Balkan Peninsula (Albania, Bosnia and Herzegovina, Bulgaria, Kosovo, Montenegro

  20. World Database of Happiness

    NARCIS (Netherlands)

    R. Veenhoven (Ruut)

    1995-01-01

    textabstractABSTRACT The World Database of Happiness is an ongoing register of research on subjective appreciation of life. Its purpose is to make the wealth of scattered findings accessible, and to create a basis for further meta-analytic studies. The database involves four sections:
    1.

  1. Dictionary as Database.

    Science.gov (United States)

    Painter, Derrick

    1996-01-01

    Discussion of dictionaries as databases focuses on the digitizing of The Oxford English dictionary (OED) and the use of Standard Generalized Mark-Up Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

  2. Fire test database

    International Nuclear Information System (INIS)

    Lee, J.A.

    1989-01-01

    This paper describes a project recently completed for EPRI by Impell. The purpose of the project was to develop a reference database of fire tests performed on non-typical fire rated assemblies. The database is designed for use by utility fire protection engineers to locate test reports for power plant fire rated assemblies. As utilities prepare to respond to Information Notice 88-04, the database will identify utilities, vendors or manufacturers who have specific fire test data. The database contains fire test report summaries for 729 tested configurations. For each summary, a contact is identified from whom a copy of the complete fire test report can be obtained. Five types of configurations are included: doors, dampers, seals, wraps and walls. The database is computerized. One version for IBM; one for Mac. Each database is accessed through user-friendly software which allows adding, deleting, browsing, etc. through the database. There are five major database files. One each for the five types of tested configurations. The contents of each provides significant information regarding the test method and the physical attributes of the tested configuration. 3 figs

  3. Children's Culture Database (CCD)

    DEFF Research Database (Denmark)

    Wanting, Birgit

    a Dialogue inspired database with documentation, network (individual and institutional profiles) and current news , paper presented at the research seminar: Electronic access to fiction, Copenhagen, November 11-13, 1996......a Dialogue inspired database with documentation, network (individual and institutional profiles) and current news , paper presented at the research seminar: Electronic access to fiction, Copenhagen, November 11-13, 1996...

  4. Atomic Spectra Database (ASD)

    Science.gov (United States)

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  5. Consumer Product Category Database

    Science.gov (United States)

    The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use information is compiled from multiple sources while product information is gathered from publicly available Material Safety Data Sheets (MSDS). EPA researchers are evaluating the possibility of expanding the database with additional product and use information.

  6. Database in Artificial Intelligence.

    Science.gov (United States)

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  7. NoSQL database scaling

    OpenAIRE

    Žardin, Norbert

    2017-01-01

    NoSQL database scaling is a decision, where system resources or financial expenses are traded for database performance or other benefits. By scaling a database, database performance and resource usage might increase or decrease, such changes might have a negative impact on an application that uses the database. In this work it is analyzed how database scaling affect database resource usage and performance. As a results, calculations are acquired, using which database scaling types and differe...

  8. Building nonredundant adaptive wavelets by update lifting

    NARCIS (Netherlands)

    H.J.A.M. Heijmans (Henk); B. Pesquet-Popescu; G. Piella (Gema)

    2002-01-01

    textabstractAdaptive wavelet decompositions appear useful in various applications in image and video processing, such as image analysis, compression, feature extraction, denoising and deconvolution, or optic flow estimation. For such tasks it may be important that the multiresolution representations

  9. The LHCb configuration database

    CERN Document Server

    Abadie, L; Van Herwijnen, Eric; Jacobsson, R; Jost, B; Neufeld, N

    2005-01-01

    The aim of the LHCb configuration database is to store information about all the controllable devices of the detector. The experiment's control system (that uses PVSS ) will configure, start up and monitor the detector from the information in the configuration database. The database will contain devices with their properties, connectivity and hierarchy. The ability to store and rapidly retrieve huge amounts of data, and the navigability between devices are important requirements. We have collected use cases to ensure the completeness of the design. Using the entity relationship modelling technique we describe the use cases as classes with attributes and links. We designed the schema for the tables using relational diagrams. This methodology has been applied to the TFC (switches) and DAQ system. Other parts of the detector will follow later. The database has been implemented using Oracle to benefit from central CERN database support. The project also foresees the creation of tools to populate, maintain, and co...

  10. Database Description - SKIP Stemcell Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us SKIP Stemcell Database Database Description General information of database Database name SKIP Stemcell Database...rsity Journal Search: Contact address http://www.skip.med.keio.ac.jp/en/contact/ Database classification Human Genes and Diseases Dat...abase classification Stemcell Article Organism Taxonomy Name: Homo sapiens Taxonomy ID: 9606 Database...ks: Original website information Database maintenance site Center for Medical Genetics, School of medicine, ...lable Web services Not available URL of Web services - Need for user registration Not available About This Database Database

  11. BSDB: the Biomolecule Stretching Database

    Science.gov (United States)

    Cieplak, Marek; Sikora, Mateusz; Sulkowska, Joanna I.; Witkowski, Bartlomiej

    2011-03-01

    Despite more than a decade of experiments on single biomolecule manipulation, mechanical properties of only several scores of proteins have been measured. A characteristic scale of the force of resistance to stretching, Fmax , has been found to range between ~ 10 and 480 pN. The Biomolecule Stretching Data Base (BSDB) described here provides information about expected values of Fmax for, currently, 17 134 proteins. The values and other characteristics of the unfolding proces, including the nature of identified mechanical clamps, are available at www://info.ifpan.edu.pl/BSDB/. They have been obtained through simulations within a structure-based model which correlates satisfactorily with the available experimental data on stretching. BSDB also lists experimental data and results of the existing all-atom simulations. The database offers a Protein-Data-Bank-wide guide to mechano-stability of proteins. Its description is provided by a forthcoming Nucleic Acids Research paper. Supported by EC FUNMOL project FP7-NMP-2007-SMALL-1, and European Regional Development Fund: Innovative Economy (POIG.01.01.02-00-008/08).

  12. Hazard Analysis Database Report

    Energy Technology Data Exchange (ETDEWEB)

    GAULT, G.W.

    1999-10-13

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for US Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for the Tank Waste Remediation System (TWRS) Final Safety Analysis Report (FSAR). The FSAR is part of the approved TWRS Authorization Basis (AB). This document describes, identifies, and defines the contents and structure of the TWRS FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The TWRS Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The database supports the preparation of Chapters 3,4, and 5 of the TWRS FSAR and the USQ process and consists of two major, interrelated data sets: (1) Hazard Evaluation Database--Data from the results of the hazard evaluations; and (2) Hazard Topography Database--Data from the system familiarization and hazard identification.

  13. Protein profiling in potato (Solanum tuberosum L.) leaf tissues by differential centrifugation.

    Science.gov (United States)

    Lim, Sanghyun; Chisholm, Kenneth; Coffin, Robert H; Peters, Rick D; Al-Mughrabi, Khalil I; Wang-Pruski, Gefu; Pinto, Devanand M

    2012-04-06

    Foliar diseases, such as late blight, result in serious threats to potato production. As such, potato leaf tissue becomes an important substrate to study biological processes, such as plant defense responses to infection. Nonetheless, the potato leaf proteome remains poorly characterized. Here, we report protein profiling of potato leaf tissues using a modified differential centrifugation approach to separate the leaf tissues into cell wall and cytoplasmic fractions. This method helps to increase the number of identified proteins, including targeted putative cell wall proteins. The method allowed for the identification of 1484 nonredundant potato leaf proteins, of which 364 and 447 were reproducibly identified proteins in the cell wall and cytoplasmic fractions, respectively. Reproducibly identified proteins corresponded to over 70% of proteins identified in each replicate. A diverse range of proteins was identified based on their theoretical pI values, molecular masses, functional classification, and biological processes. Such a protein extraction method is effective for the establishment of a highly qualified proteome profile.

  14. Database for propagation models

    Science.gov (United States)

    Kantak, Anil V.

    1991-07-01

    A propagation researcher or a systems engineer who intends to use the results of a propagation experiment is generally faced with various database tasks such as the selection of the computer software, the hardware, and the writing of the programs to pass the data through the models of interest. This task is repeated every time a new experiment is conducted or the same experiment is carried out at a different location generating different data. Thus the users of this data have to spend a considerable portion of their time learning how to implement the computer hardware and the software towards the desired end. This situation may be facilitated considerably if an easily accessible propagation database is created that has all the accepted (standardized) propagation phenomena models approved by the propagation research community. Also, the handling of data will become easier for the user. Such a database construction can only stimulate the growth of the propagation research it if is available to all the researchers, so that the results of the experiment conducted by one researcher can be examined independently by another, without different hardware and software being used. The database may be made flexible so that the researchers need not be confined only to the contents of the database. Another way in which the database may help the researchers is by the fact that they will not have to document the software and hardware tools used in their research since the propagation research community will know the database already. The following sections show a possible database construction, as well as properties of the database for the propagation research.

  15. Product Licenses Database Application

    CERN Document Server

    Tonkovikj, Petar

    2016-01-01

    The goal of this project is to organize and centralize the data about software tools available to CERN employees, as well as provide a system that would simplify the license management process by providing information about the available licenses and their expiry dates. The project development process is consisted of two steps: modeling the products (software tools), product licenses, legal agreements and other data related to these entities in a relational database and developing the front-end user interface so that the user can interact with the database. The result is an ASP.NET MVC web application with interactive views for displaying and managing the data in the underlying database.

  16. LandIT Database

    DEFF Research Database (Denmark)

    Iftikhar, Nadeem; Pedersen, Torben Bach

    2010-01-01

    and reporting purposes. This paper presents the LandIT database; which is result of the LandIT project, which refers to an industrial collaboration project that developed technologies for communication and data integration between farming devices and systems. The LandIT database in principal is based...... on the ISOBUS standard; however the standard is extended with additional requirements, such as gradual data aggregation and flexible exchange of farming data. This paper describes the conceptual and logical schemas of the proposed database based on a real-life farming case study....

  17. JICST Factual Database(2)

    Science.gov (United States)

    Araki, Keisuke

    The computer programme, which builds atom-bond connection tables from nomenclatures, is developed. Chemical substances with their nomenclature and varieties of trivial names or experimental code numbers are inputted. The chemical structures of the database are stereospecifically stored and are able to be searched and displayed according to stereochemistry. Source data are from laws and regulations of Japan, RTECS of US and so on. The database plays a central role within the integrated fact database service of JICST and makes interrelational retrieval possible.

  18. ABI domain-containing proteins contribute to surface protein display and cell division in Staphylococcus aureus.

    Science.gov (United States)

    Frankel, Matthew B; Wojcik, Brandon M; DeDent, Andrea C; Missiakas, Dominique M; Schneewind, Olaf

    2010-10-01

    The human pathogen Staphylococcus aureus requires cell wall anchored surface proteins to cause disease. During cell division, surface proteins with YSIRK signal peptides are secreted into the cross-wall, a layer of newly synthesized peptidoglycan between separating daughter cells. The molecular determinants for the trafficking of surface proteins are, however, still unknown. We screened mutants with non-redundant transposon insertions by fluorescence-activated cell sorting for reduced deposition of protein A (SpA) into the staphylococcal envelope. Three mutants, each of which harboured transposon insertions in genes for transmembrane proteins, displayed greatly reduced envelope abundance of SpA and surface proteins with YSIRK signal peptides. Characterization of the corresponding mutations identified three transmembrane proteins with abortive infectivity (ABI) domains, elements first described in lactococci for their role in phage exclusion. Mutations in genes for ABI domain proteins, designated spdA, spdB and spdC (surface protein display), diminish the expression of surface proteins with YSIRK signal peptides, but not of precursor proteins with conventional signal peptides. spdA, spdB and spdC mutants display an increase in the thickness of cross-walls and in the relative abundance of staphylococci with cross-walls, suggesting that spd mutations may represent a possible link between staphylococcal cell division and protein secretion. © 2010 Blackwell Publishing Ltd.

  19. Livestock Anaerobic Digester Database

    Science.gov (United States)

    The Anaerobic Digester Database provides basic information about anaerobic digesters on livestock farms in the United States, organized in Excel spreadsheets. It includes projects that are under construction, operating, or shut down.

  20. Toxicity Reference Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Toxicity Reference Database (ToxRefDB) contains approximately 30 years and $2 billion worth of animal studies. ToxRefDB allows scientists and the interested...

  1. Dissolution Methods Database

    Data.gov (United States)

    U.S. Department of Health & Human Services — For a drug product that does not have a dissolution test method in the United States Pharmacopeia (USP), the FDA Dissolution Methods Database provides information on...

  2. OTI Activity Database

    Data.gov (United States)

    US Agency for International Development — OTI's worldwide activity database is a simple and effective information system that serves as a program management, tracking, and reporting tool. In each country,...

  3. ARTI Refrigerant Database

    Energy Technology Data Exchange (ETDEWEB)

    Calm, J.M. [Calm (James M.), Great Falls, VA (United States)

    1994-05-27

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  4. Marine Jurisdictions Database

    National Research Council Canada - National Science Library

    Goldsmith, Roger

    1998-01-01

    The purpose of this project was to take the data gathered for the Maritime Claims chart and create a Maritime Jurisdictions digital database suitable for use with oceanographic mission planning objectives...

  5. Medicaid CHIP ESPC Database

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Environmental Scanning and Program Characteristic (ESPC) Database is in a Microsoft (MS) Access format and contains Medicaid and CHIP data, for the 50 states and...

  6. Records Management Database

    Data.gov (United States)

    US Agency for International Development — The Records Management Database is tool created in Microsoft Access specifically for USAID use. It contains metadata in order to access and retrieve the information...

  7. Reach Address Database (RAD)

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Reach Address Database (RAD) stores the reach address of each Water Program feature that has been linked to the underlying surface water features (streams,...

  8. Household Products Database: Pesticides

    Science.gov (United States)

    ... of Products Manufacturers Ingredients About the Database FAQ Product ... control bulbs carpenter ants caterpillars crabgrass control deer dogs dogs/cats fertilizer w/insecticide fertilizer w/weed ...

  9. Mouse Phenome Database (MPD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Mouse Phenome Database (MPD) has characterizations of hundreds of strains of laboratory mice to facilitate translational discoveries and to assist in selection...

  10. Consumer Product Category Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use...

  11. Drycleaner Database - Region 7

    Data.gov (United States)

    U.S. Environmental Protection Agency — THIS DATA ASSET NO LONGER ACTIVE: This is metadata documentation for the Region 7 Drycleaner Database (R7DryClnDB) which tracks all Region7 drycleaners who notify...

  12. National Assessment Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — The National Assessment Database stores and tracks state water quality assessment decisions, Total Maximum Daily Loads (TMDLs) and other watershed plans designed to...

  13. IVR RSA Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This database contains trip-level reports submitted by vessels participating in Research Set-Aside projects with IVR reporting requirements.

  14. Rat Genome Database (RGD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...

  15. The CAPEC Database

    DEFF Research Database (Denmark)

    Nielsen, Thomas Lund; Abildskov, Jens; Harper, Peter Mathias

    2001-01-01

    in the compound. This classification makes the CAPEC database a very useful tool, for example, in the development of new property models, since properties of chemically similar compounds are easily obtained. A program with efficient search and retrieval functions of properties has been developed.......The Computer-Aided Process Engineering Center (CAPEC) database of measured data was established with the aim to promote greater data exchange in the chemical engineering community. The target properties are pure component properties, mixture properties, and special drug solubility data....... The database divides pure component properties into primary, secondary, and functional properties. Mixture properties are categorized in terms of the number of components in the mixture and the number of phases present. The compounds in the database have been classified on the basis of the functional groups...

  16. Danish Urogynaecological Database

    DEFF Research Database (Denmark)

    Hansen, Ulla Darling; Gradel, Kim Oren; Larsen, Michael Due

    2016-01-01

    , complications if relevant, implants used if relevant, 3-6-month postoperative recording of symptoms, if any. A set of clinical quality indicators is being maintained by the steering committee for the database and is published in an annual report which also contains extensive descriptive statistics. The database......The Danish Urogynaecological Database is established in order to ensure high quality of treatment for patients undergoing urogynecological surgery. The database contains details of all women in Denmark undergoing incontinence surgery or pelvic organ prolapse surgery amounting to ~5,200 procedures...... has a completeness of over 90% of all urogynecological surgeries performed in Denmark. Some of the main variables have been validated using medical records as gold standard. The positive predictive value was above 90%. The data are used as a quality monitoring tool by the hospitals and in a number...

  17. The Danish Urogynaecological Database

    DEFF Research Database (Denmark)

    Guldberg, Rikke; Brostrøm, Søren; Hansen, Jesper Kjær

    2013-01-01

    in the DugaBase from 1 January 2009 to 31 October 2010, using medical records as a reference. RESULTS: A total of 16,509 urogynaecological procedures were registered in the DugaBase by 31 December 2010. The database completeness has increased by calendar time, from 38.2 % in 2007 to 93.2 % in 2010 for public......INTRODUCTION AND HYPOTHESIS: The Danish Urogynaecological Database (DugaBase) is a nationwide clinical database established in 2006 to monitor, ensure and improve the quality of urogynaecological surgery. We aimed to describe its establishment and completeness and to validate selected variables....... This is the first study based on data from the DugaBase. METHODS: The database completeness was calculated as a comparison between urogynaecological procedures reported to the Danish National Patient Registry and to the DugaBase. Validity was assessed for selected variables from a random sample of 200 women...

  18. Danish Pancreatic Cancer Database

    DEFF Research Database (Denmark)

    Fristrup, Claus; Detlefsen, Sönke; Palnæs Hansen, Carsten

    2016-01-01

    : Death is monitored using data from the Danish Civil Registry. This registry monitors the survival status of the Danish population, and the registration is virtually complete. All data in the database are audited by all participating institutions, with respect to baseline characteristics, key indicators......AIM OF DATABASE: The Danish Pancreatic Cancer Database aims to prospectively register the epidemiology, diagnostic workup, diagnosis, treatment, and outcome of patients with pancreatic cancer in Denmark at an institutional and national level. STUDY POPULATION: Since May 1, 2011, all patients...... with microscopically verified ductal adenocarcinoma of the pancreas have been registered in the database. As of June 30, 2014, the total number of patients registered was 2,217. All data are cross-referenced with the Danish Pathology Registry and the Danish Patient Registry to ensure the completeness of registrations...

  19. Food Habits Database (FHDBS)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The NEFSC Food Habits Database has two major sources of data. The first, and most extensive, is the standard NEFSC Bottom Trawl Surveys Program. During these...

  20. Functionally Graded Materials Database

    Science.gov (United States)

    Kisara, Katsuto; Konno, Tomomi; Niino, Masayuki

    2008-02-01

    Functionally Graded Materials Database (hereinafter referred to as FGMs Database) was open to the society via Internet in October 2002, and since then it has been managed by the Japan Aerospace Exploration Agency (JAXA). As of October 2006, the database includes 1,703 research information entries with 2,429 researchers data, 509 institution data and so on. Reading materials such as "Applicability of FGMs Technology to Space Plane" and "FGMs Application to Space Solar Power System (SSPS)" were prepared in FY 2004 and 2005, respectively. The English version of "FGMs Application to Space Solar Power System (SSPS)" is now under preparation. This present paper explains the FGMs Database, describing the research information data, the sitemap and how to use it. From the access analysis, user access results and users' interests are discussed.

  1. Tethys Acoustic Metadata Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Tethys database houses the metadata associated with the acoustic data collection efforts by the Passive Acoustic Group. These metadata include dates, locations...

  2. NLCD 2011 database

    Data.gov (United States)

    U.S. Environmental Protection Agency — National Land Cover Database 2011 (NLCD 2011) is the most recent national land cover product created by the Multi-Resolution Land Characteristics (MRLC) Consortium....

  3. Medicare Coverage Database

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Medicare Coverage Database (MCD) contains all National Coverage Determinations (NCDs) and Local Coverage Determinations (LCDs), local articles, and proposed NCD...

  4. Household Products Database

    Data.gov (United States)

    U.S. Department of Health & Human Services — This database links over 4,000 consumer brands to health effects from Material Safety Data Sheets (MSDS) provided by the manufacturers and allows scientists and...

  5. Global Volcano Locations Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — NGDC maintains a database of over 1,500 volcano locations obtained from the Smithsonian Institution Global Volcanism Program, Volcanoes of the World publication. The...

  6. 1988 Spitak Earthquake Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The 1988 Spitak Earthquake database is an extensive collection of geophysical and geological data, maps, charts, images and descriptive text pertaining to the...

  7. Uranium Location Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — A GIS compiled locational database in Microsoft Access of ~15,000 mines with uranium occurrence or production, primarily in the western United States. The metadata...

  8. INIST: databases reorientation

    International Nuclear Information System (INIS)

    Bidet, J.C.

    1995-01-01

    INIST is a CNRS (Centre National de la Recherche Scientifique) laboratory devoted to the treatment of scientific and technical informations and to the management of these informations compiled in a database. Reorientation of the database content has been proposed in 1994 to increase the transfer of research towards enterprises and services, to develop more automatized accesses to the informations, and to create a quality assurance plan. The catalog of publications comprises 5800 periodical titles (1300 for fundamental research and 4500 for applied research). A science and technology multi-thematic database will be created in 1995 for the retrieval of applied and technical informations. ''Grey literature'' (reports, thesis, proceedings..) and human and social sciences data will be added to the base by the use of informations selected in the existing GRISELI and Francis databases. Strong modifications are also planned in the thematic cover of Earth sciences and will considerably reduce the geological information content. (J.S.). 1 tab

  9. Fine Arts Database (FAD)

    Data.gov (United States)

    General Services Administration — The Fine Arts Database records information on federally owned art in the control of the GSA; this includes the location, current condition and information on artists.

  10. Kansas Cartographic Database (KCD)

    Data.gov (United States)

    Kansas Data Access and Support Center — The Kansas Cartographic Database (KCD) is an exact digital representation of selected features from the USGS 7.5 minute topographic map series. Features that are...

  11. Developments in diffraction databases

    International Nuclear Information System (INIS)

    Jenkins, R.

    1999-01-01

    Full text: There are a number of databases available to the diffraction community. Two of the more important of these are the Powder Diffraction File (PDF) maintained by the International Centre for Diffraction Data (ICDD), and the Inorganic Crystal Structure Database (ICSD) maintained by Fachsinformationzentrum (FIZ, Karlsruhe). In application, the PDF has been used as an indispensable tool in phase identification and identification of unknowns. The ICSD database has extensive and explicit reference to the structures of compounds: atomic coordinates, space group and even thermal vibration parameters. A similar database, but for organic compounds, is maintained by the Cambridge Crystallographic Data Centre. These databases are often used as independent sources of information. However, little thought has been given on how to exploit the combined properties of structural database tools. A recently completed agreement between ICDD and FIZ, plus ICDD and Cambridge, provides a first step in complementary use of the PDF and the ICSD databases. The focus of this paper (as indicated below) is to examine ways of exploiting the combined properties of both databases. In 1996, there were approximately 76,000 entries in the PDF and approximately 43,000 entries in the ICSD database. The ICSD database has now been used to calculate entries in the PDF. Thus, to derive d-spacing and peak intensity data requires the synthesis of full diffraction patterns, i.e., we use the structural data in the ICSD database and then add instrumental resolution information. The combined data from PDF and ICSD can be effectively used in many ways. For example, we can calculate PDF data for an ideally random crystal distribution and also in the absence of preferred orientation. Again, we can use systematic studies of intermediate members in solid solutions series to help produce reliable quantitative phase analyses. In some cases, we can study how solid solution properties vary with composition and

  12. Database Replication Prototype

    OpenAIRE

    Vandewall, R.

    2000-01-01

    This report describes the design of a Replication Framework that facilitates the implementation and com-parison of database replication techniques. Furthermore, it discusses the implementation of a Database Replication Prototype and compares the performance measurements of two replication techniques based on the Atomic Broadcast communication primitive: pessimistic active replication and optimistic active replication. The main contributions of this report can be split into four parts....

  13. Database on Wind Characteristics

    DEFF Research Database (Denmark)

    Højstrup, J.; Ejsing Jørgensen, Hans; Lundtang Petersen, Erik

    1999-01-01

    his report describes the work and results of the project: Database on Wind Characteristics which was sponsered partly by the European Commision within the framework of JOULE III program under contract JOR3-CT95-0061......his report describes the work and results of the project: Database on Wind Characteristics which was sponsered partly by the European Commision within the framework of JOULE III program under contract JOR3-CT95-0061...

  14. ORACLE DATABASE SECURITY

    OpenAIRE

    Cristina-Maria Titrade

    2011-01-01

    This paper presents some security issues, namely security database system level, data level security, user-level security, user management, resource management and password management. Security is a constant concern in the design and database development. Usually, there are no concerns about the existence of security, but rather how large it should be. A typically DBMS has several levels of security, in addition to those offered by the operating system or network. Typically, a DBMS has user a...

  15. Database computing in HEP

    International Nuclear Information System (INIS)

    Day, C.T.; Loken, S.; MacFarlane, J.F.; May, E.; Lifka, D.; Lusk, E.; Price, L.E.; Baden, A.

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors. I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototype based on relational and object-oriented databases of CDF data samples

  16. Update History of This Database - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Trypanosomes Database Update History of This Database Date Update contents 2014/05/07 The co...ntact information is corrected. The features and manner of utilization of the database are corrected. 2014/02/04 Trypanosomes Databas...e English archive site is opened. 2011/04/04 Trypanosomes Database ( http://www.tan...paku.org/tdb/ ) is opened. About This Database Database Description Download Lice...nse Update History of This Database Site Policy | Contact Us Update History of This Database - Trypanosomes Database | LSDB Archive ...

  17. Conserved Domain Database (CDD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — CDD is a protein annotation resource that consists of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins.

  18. Specialist Bibliographic Databases.

    Science.gov (United States)

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  19. Specialist Bibliographic Databases

    Science.gov (United States)

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  20. Database citation in full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  1. IMPPAT: A curated database of Indian Medicinal Plants, Phytochemistry And Therapeutics.

    Science.gov (United States)

    Mohanraj, Karthikeyan; Karthikeyan, Bagavathy Shanmugam; Vivek-Ananth, R P; Chand, R P Bharath; Aparna, S R; Mangalapandi, Pattulingam; Samal, Areejit

    2018-03-12

    Phytochemicals of medicinal plants encompass a diverse chemical space for drug discovery. India is rich with a flora of indigenous medicinal plants that have been used for centuries in traditional Indian medicine to treat human maladies. A comprehensive online database on the phytochemistry of Indian medicinal plants will enable computational approaches towards natural product based drug discovery. In this direction, we present, IMPPAT, a manually curated database of 1742 Indian Medicinal Plants, 9596 Phytochemicals, And 1124 Therapeutic uses spanning 27074 plant-phytochemical associations and 11514 plant-therapeutic associations. Notably, the curation effort led to a non-redundant in silico library of 9596 phytochemicals with standard chemical identifiers and structure information. Using cheminformatic approaches, we have computed the physicochemical, ADMET (absorption, distribution, metabolism, excretion, toxicity) and drug-likeliness properties of the IMPPAT phytochemicals. We show that the stereochemical complexity and shape complexity of IMPPAT phytochemicals differ from libraries of commercial compounds or diversity-oriented synthesis compounds while being similar to other libraries of natural products. Within IMPPAT, we have filtered a subset of 960 potential druggable phytochemicals, of which majority have no significant similarity to existing FDA approved drugs, and thus, rendering them as good candidates for prospective drugs. IMPPAT database is openly accessible at: https://cb.imsc.res.in/imppat .

  2. Yeast Interacting Proteins Database: YNL189W, YGL175C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ait as prey (0) YGL175C SAE2 Endonuclease that processes hairpin DNA structures w... (0) Prey ORF YGL175C Prey gene name SAE2 Prey description Endonuclease that processes hairpin DNA structures

  3. Yeast Interacting Proteins Database: YDR490C, YGR086C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available bait as prey (0) YGR086C PIL1 Primary component of eisosomes, which are large immobile cell cortex struct...ctures associated with endocytosis; null mutants show activation of Pkc1p/Ypk1p str...y (0) Prey ORF YGR086C Prey gene name PIL1 Prey description Primary component of eisosomes, which are large immobile cell cortex stru...ures associated with endocytosis; null mutants show activation of Pkc1p/Ypk1p stres

  4. Yeast Interacting Proteins Database: YGR086C, YKL142W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGR086C PIL1 Primary component of eisosomes, which are large immobile cell cortex structures... ORF YGR086C Bait gene name PIL1 Bait description Primary component of eisosomes, which are large immobile cell cortex structures

  5. Yeast Interacting Proteins Database: YPL022W, YLR135W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available repair; cleaves branched structures in a complex with Slx1p; involved in Rad1p/Rad10p-dependent removal of ... Prey gene name SLX4 Prey description Endonuclease involved in processing DNA during recombination and repair; cleaves branched struc...tures in a complex with Slx1p; involved in Rad1p/Rad10p-dependent removal of 3'-non

  6. Yeast Interacting Proteins Database: YLR447C, YDR277C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available uction pathway, required for repression of transcription by Rgt1p; interacts with Rgt1p and the Snf3p and Rgt2p glucose sensors...transduction pathway, required for repression of transcription by Rgt1p; interacts with Rgt1p and the Snf3p and Rgt2p glucose sensors

  7. Yeast Interacting Proteins Database: YNL189W, YDR318W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ponent of the COMA complex (Ctf19p, Okp1p, Mcm21p, Ame1p) that bridges kinetochore subunits that are in cont...t of the COMA complex (Ctf19p, Okp1p, Mcm21p, Ame1p) that bridges kinetochore subunits that are in contact w

  8. Yeast Interacting Proteins Database: YML042W, YML042W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available xisomes, transfers activated acetyl groups to carnitine to form acetylcarnitine which can be shuttled across membranes...etyl groups to carnitine to form acetylcarnitine which can be shuttled across membranes Rows with this prey ...ne which can be shuttled across membranes Rows with this bait as bait Rows with this bait as bait (1) Rows w...ansfers activated acetyl groups to carnitine to form acetylcarnitine which can be shuttled across membranes

  9. Yeast Interacting Proteins Database: YNL189W, YLR328W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ait as prey (0) YLR328W NMA1 Nicotinic acid mononucleotide adenylyltransferase, involved in pathways... of NAD biosynthesis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways... Nicotinic acid mononucleotide adenylyltransferase, involved in pathways of NAD biosynthesis, including the ...de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways Rows with this prey as prey Rows with th

  10. Yeast Interacting Proteins Database: YGR010W, YLR328W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available 1 Nicotinic acid mononucleotide adenylyltransferase, involved in pathways of NAD biosynthesis, including the... de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways Rows with th...ne name NMA1 Prey description Nicotinic acid mononucleotide adenylyltransferase, involved in pathways of NAD... biosynthesis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways

  11. Yeast Interacting Proteins Database: YLR328W, YGR010W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR328W NMA1 Nicotinic acid mononucleotide adenylyltransferase, involved in pathways... of NAD biosynthesis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways Row... ORF YLR328W Bait gene name NMA1 Bait description Nicotinic acid mononucleotide adenylyltransferase, involved in pathways...otinamide riboside salvage pathways Rows with this bait as bait Rows with this bait as bait (2) Rows with th

  12. Yeast Interacting Proteins Database: YLR328W, YLR328W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR328W NMA1 Nicotinic acid mononucleotide adenylyltransferase, involved in pathways... of NAD biosynthesis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways Row...ylyltransferase, involved in pathways of NAD biosynthesis, including the de novo,... NAD(+) salvage, and nicotinamide riboside salvage pathways Rows with this prey as prey (4) Rows with this p... description Nicotinic acid mononucleotide adenylyltransferase, involved in pathways

  13. Yeast Interacting Proteins Database: YML064C, YLR328W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available th this bait as prey (0) YLR328W NMA1 Nicotinic acid mononucleotide adenylyltransferase, involved in pathways... of NAD biosynthesis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways...ic acid mononucleotide adenylyltransferase, involved in pathways of NAD biosynthe...sis, including the de novo, NAD(+) salvage, and nicotinamide riboside salvage pathways Rows with this prey a

  14. Yeast Interacting Proteins Database: YJL137C, YLR258W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ait as prey (1) YLR258W GSY2 Glycogen synthase, similar to Gsy1p; expression induced by glucose limitation, ...ssion induced by glucose limitation, nitrogen starvation, heat shock, and stationary phase; activity regulat

  15. Yeast Interacting Proteins Database: YEL062W, YPL255W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available iates downregulation of TOR Complex 1 activity in response to amino acid limitation; transcription is induce...regulation of TOR Complex 1 activity in response to amino acid limitation; transcription is induced in respo

  16. Yeast Interacting Proteins Database: YFR015C, YLR258W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available yeast homolog; expression induced by glucose limitation, nitrogen starvation, environmental stress, and entr...n synthase, similar to Gsy1p; expression induced by glucose limitation, nitrogen ...; expression induced by glucose limitation, nitrogen starvation, environmental stress, and entry into statio...ogen synthase, similar to Gsy1p; expression induced by glucose limitation, nitrogen starvation, heat shock,

  17. Yeast Interacting Proteins Database: YNL189W, YHR216W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available g, expression is repressed by nutrient limitation Rows with this prey as prey (1) Rows with this prey as bai...osynthesis, expression is induced by mycophenolic acid resulting in resistance to the drug, expression is repressed by nutrient limit...ation Rows with this prey as prey Rows with this prey as

  18. Yeast Interacting Proteins Database: YLR258W, YLR258W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR258W GSY2 Glycogen synthase, similar to Gsy1p; expression induced by glucose limitation...bait as prey (3) YLR258W GSY2 Glycogen synthase, similar to Gsy1p; expression induced by glucose limitatio...pression induced by glucose limitation, nitrogen starvation, heat shock, and stationary phase; activity regu...LR258W Bait ORF YLR258W Bait gene name GSY2 Bait description Glycogen synthase, similar to Gsy1p; expression induced by glucose limit...ation, nitrogen starvation, heat shock, and stationary phase; activity regulated by

  19. Yeast Interacting Proteins Database: YDL153C, YBR041W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available Faa4p that imports and activates exogenous fatty acids Rows with this prey as prey (1) Rows with this prey ...d transporter and very long-chain fatty acyl-CoA synthetase, may form a complex with Faa1p or Faa4p that imports and activates exogen...ous fatty acids Rows with this prey as prey Rows with this prey as prey (1) Rows wi

  20. Yeast Interacting Proteins Database: YOR171C, YOR034C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ain base phosphates, which function as signaling molecules, regulates synthesis of ceramide from exogenous l...chain base phosphates, which function as signaling molecules, regulates synthesis of ceramide from exogenous

  1. Yeast Interacting Proteins Database: YPL151C, YOR036W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available t-SNARE) for vesicular intermediates traveling between the Golgi apparatus and the vacuole; controls entry o...e PEP12 Prey description Target membrane receptor (t-SNARE) for vesicular intermediates traveling between th

  2. Yeast Interacting Proteins Database: YOR036W, YBL102W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YOR036W PEP12 Target membrane receptor (t-SNARE) for vesicular intermediates travel...me PEP12 Bait description Target membrane receptor (t-SNARE) for vesicular intermediates traveling between t

  3. Yeast Interacting Proteins Database: YOL069W, YMR294W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available complex (Ndc80p-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...heckpoint activity and kinetochore clustering Rows with this bait as bait Rows with this bait as bait (3) Ro

  4. Yeast Interacting Proteins Database: YOL069W, YNL086W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available complex (Ndc80p-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...ved in chromosome segregation, spindle checkpoint activity and kinetochore clustering Rows with this bait as

  5. Yeast Interacting Proteins Database: YGR113W, YIL144W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available indle checkpoint activity, kinetochore assembly and clustering Rows with this prey as prey (2) Rows with thi...heckpoint activity, kinetochore assembly and clustering Rows with this prey as pr

  6. Yeast Interacting Proteins Database: YPL031C, YPL219W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ting the cellular response to nutrient levels and environmental conditions and progression through the cell ...e cellular response to nutrient levels and environmental conditions and progression through the cell cycle R

  7. Yeast Interacting Proteins Database: YMR125W, YPL178W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available so contains Sto1p, component of the spliceosomal commitment complex; interacts with Npl3p, possibly to packa...lso contains Sto1p, component of the spliceosomal commitment complex; interacts with Npl3p, possibly to pack

  8. Yeast Interacting Proteins Database: YER059W, YDL224C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ment for passage through Start and commitment to cell division Rows with this prey as prey (1) Rows with thi... passage through Start and commitment to cell division Rows with this prey as prey Rows with this prey as pr

  9. Yeast Interacting Proteins Database: YNR051C, YER151C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available that coregulates anterograde and retrograde transport between the endoplasmic reticulum and Golgi compartme...C UBP3 Ubiquitin-specific protease that interacts with Bre5p to co-regulate anterograde and retrograde...t coregulates anterograde and retrograde transport between the endoplasmic reticulum and Golgi compartments;...3 Prey description Ubiquitin-specific protease that interacts with Bre5p to co-regulate anterograde and retrograde

  10. Yeast Interacting Proteins Database: YDR439W, YCR086W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available inetochores during meiosis I to mediate accurate homolog segregation; required for condensin recruitment to ...and then Mam1p at kinetochores during meiosis I to mediate accurate homolog segregation; required for condensin recruitment...p, and then Mam1p at kinetochores during meiosis I to mediate accurate homolog segregation; required for condensin recruitment...with Lrs4p and then Mam1p at kinetochores during meiosis I to mediate accurate homolog segregation; required for condensin recruitmen

  11. Yeast Interacting Proteins Database: YFR015C, YFR015C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available yeast homolog; expression induced by glucose limitation, nitrogen starvation, environmental stress, and entr...ression induced by glucose limitation, nitrogen starvation, environmental stress, and entry into stationary ...tion, nitrogen starvation, environmental stress, and entry into stationary phase Rows with this bait as bait..., the more highly expressed yeast homolog; expression induced by glucose limitation, nitrogen starvation, environmental

  12. Yeast Interacting Proteins Database: YFR015C, YJL137C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available yeast homolog; expression induced by glucose limitation, nitrogen starvation, environmental stress, and entr...pression induced by glucose limitation, nitrogen starvation, environmental stress, and entry into stationary

  13. Yeast Interacting Proteins Database: YPL031C, YER059W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ting the cellular response to nutrient levels and environmental conditions and progression through the cell ...ers; involved in regulating the cellular response to nutrient levels and environmental conditions and progre

  14. Yeast Interacting Proteins Database: YPL031C, YIL050W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ting the cellular response to nutrient levels and environmental conditions and progression through the cell ... with ten cyclin partners; involved in regulating the cellular response to nutrient levels and environmental

  15. Yeast Interacting Proteins Database: YNL189W, YPL111W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ession responds to both induction by arginine and nitrogen catabolite repression; disruption enhances freeze... catabolite repression; disruption enhances freeze tolerance Rows with this prey as prey Rows with this prey...ginase, responsible for arginine degradation, expression responds to both induction by arginine and nitrogen

  16. Yeast Interacting Proteins Database: YML064C, YPL111W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available both induction by arginine and nitrogen catabolite repression; disruption enhanc...inine and nitrogen catabolite repression; disruption enhances freeze tolerance Rows with this prey as prey R

  17. Yeast Interacting Proteins Database: YLR175W, YNL124W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available n ortholog dyskerin cause the disorder dyskeratosis congenita Rows with this bait as bait (1) Rows with this...dyskerin cause the disorder dyskeratosis congenita Rows with this bait as bait Ro

  18. Yeast Interacting Proteins Database: YLR347C, YLR377C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available gy-mediated degradation depending on growth conditions; interacts with Vid30p Rows with this prey as prey (4...r autophagy-mediated degradation depending on growth conditions; interacts with V

  19. Yeast Interacting Proteins Database: YNL189W, YLR377C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available phagy-mediated degradation depending on growth conditions; interacts with Vid30p Rows with this prey as prey...ated or autophagy-mediated degradation depending on growth conditions; interacts

  20. Yeast Interacting Proteins Database: YML064C, YLR377C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available d or autophagy-mediated degradation depending on growth conditions; interacts with Vid30p Rows with this pre...er proteasome-mediated or autophagy-mediated degradation depending on growth conditions; interacts with Vid3