WorldWideScience

Sample records for visualizing protein database

  1. SCOWLP: a web-based database for detailed characterization and visualization of protein interfaces

    Directory of Open Access Journals (Sweden)

    Schroeder Michael

    2006-03-01

    Background: Currently there is a strong need for methods that help to obtain an accurate description of protein interfaces in order to understand the principles that govern molecular recognition and protein function. Many of the recent efforts to computationally identify and characterize protein networks extract protein interaction information at atomic resolution from the PDB. However, they pay little or no attention to small protein ligands and solvent, which are key components and mediators of protein interactions and fundamental for a complete description of protein interfaces. Interactome profiling requires the development of computational tools to extract and analyze protein-protein, protein-ligand and detailed solvent interaction information from the PDB in an automatic and comparative fashion. Adding this information to the existing information on protein-protein interactions will allow us to better understand protein interaction networks and protein function. Description: SCOWLP (Structural Characterization Of Water, Ligands and Proteins) is a user-friendly and publicly accessible web-based relational database for detailed characterization and visualization of the PDB protein interfaces. The SCOWLP database includes proteins, peptidic-ligands and interface water molecules as descriptors of protein interfaces. It currently contains 74,907 protein interfaces and 2,093,976 residue-residue interactions formed by 60,664 structural units (protein domains and peptidic-ligands) and their interacting solvent. The SCOWLP web-server allows detailed structural analysis and comparisons of protein interfaces at atomic level by text query of PDB codes and/or by navigating a SCOP-based tree. It includes a visualization tool to interactively display the interfaces and label interacting residues and interface solvent by atomic physicochemical properties. SCOWLP is automatically updated with every SCOP release. Conclusion: SCOWLP enriches

  2. Milk bioactive peptide database: A comprehensive database of milk protein-derived bioactive peptides and novel visualization.

    Science.gov (United States)

    Nielsen, Søren Drud; Beverly, Robert L; Qu, Yunyao; Dallas, David C

    2017-10-01

    During processing and digestion, milk proteins are disassembled into peptides with an array of biological functions, including antimicrobial, angiotensin-converting enzyme inhibition, antioxidant, opioid, and immunomodulation. These functions are summarized in numerous reviews, yet information on which peptides have which functions remains scattered across hundreds of research articles. We systematically searched the literature for all instances of bioactive peptides derived from milk proteins from any mammalian source. The data were compiled into a comprehensive database, which can be used to search for specific functions, peptides, or proteins (http://mbpdb.nws.oregonstate.edu). To review this large dataset, the bioactive peptides reported in the literature were visually mapped on the parent protein sequences, providing information on sites with highest abundance of bioactive peptides. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Visualization of multidimensional database

    Science.gov (United States)

    Lee, Chung

    2008-01-01

    The concept of multidimensional databases has been extensively researched and widely used in real-world database applications. It plays an important role in contemporary information technology, but due to the complexity of its inner structure, database design is a complicated process and users have a hard time fully understanding and using the database. An effective visualization tool for higher-dimensional information systems helps database designers and users alike. Most visualization techniques focus on displaying dimensional data using spreadsheets and charts. This may be sufficient for databases having three or fewer dimensions, but for higher dimensions, various combinations of projection operations are needed and a full grasp of the total database architecture is very difficult. This study reviews existing visualization techniques for multidimensional databases and then proposes an alternate approach to visualize a database of any dimension by adopting the tool proposed by Kiviat for software engineering processes. In this diagramming method, each dimension is represented by one branch of concentric spikes. This paper documents a C++ based visualization tool with extensive use of the OpenGL graphics library and GUI functions. Detailed examples of actual databases demonstrate the feasibility and effectiveness in visualizing multidimensional databases.
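
    The Kiviat-style layout mentioned in this abstract can be illustrated with a small geometric sketch. This is not the paper's C++/OpenGL tool; it only assumes that each dimension gets one evenly spaced spoke and that values are scaled by a per-dimension maximum. The function name `kiviat_vertices` and the sample record are hypothetical.

```python
import math

def kiviat_vertices(values, maxima, radius=1.0):
    """Return one (x, y) vertex per dimension of a Kiviat-style chart.

    Hypothetical helper: dimension i is drawn on a spoke at angle
    2*pi*i/n, and its value is scaled by that dimension's maximum.
    """
    n = len(values)
    if n == 0 or n != len(maxima):
        raise ValueError("values and maxima must be non-empty and equal length")
    points = []
    for i, (value, maximum) in enumerate(zip(values, maxima)):
        angle = 2.0 * math.pi * i / n
        r = radius * (value / maximum)
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points

# A made-up record with four dimensions and made-up per-dimension maxima.
record = [50, 20, 8, 3]
maxima = [100, 40, 8, 12]
for x, y in kiviat_vertices(record, maxima):
    print(f"{x:+.3f} {y:+.3f}")
```

    Connecting the returned vertices in order gives the polygon for one record; adding a dimension only adds a spoke, which is why the layout scales past three dimensions.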

  4. Visual exploration across biomedical databases.

    Science.gov (United States)

    Lieberman, Michael D; Taheri, Sima; Guo, Huimin; Mirrashed, Fatemeh; Yahav, Inbal; Aris, Aleks; Shneiderman, Ben

    2011-01-01

    Though biomedical research often draws on knowledge from a wide variety of fields, few visualization methods for biomedical data incorporate meaningful cross-database exploration. A new approach is offered for visualizing and exploring a query-based subset of multiple heterogeneous biomedical databases. Databases are modeled as an entity-relation graph containing nodes (database records) and links (relationships between records). Users specify a keyword search string to retrieve an initial set of nodes, and then explore intra- and interdatabase links. Results are visualized with user-defined semantic substrates to take advantage of the rich set of attributes usually present in biomedical data. Comments from domain experts indicate that this visualization method is potentially advantageous for biomedical knowledge exploration.
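
    The entity-relation model described in this abstract (records as nodes, relationships as links, a keyword search followed by link exploration) can be sketched as below. This is a minimal illustration under assumptions, not the authors' system: the class `RecordGraph`, its methods, and the sample records are all hypothetical.

```python
from collections import defaultdict

class RecordGraph:
    """Hypothetical entity-relation graph spanning several databases."""

    def __init__(self):
        self.records = {}               # node id -> (database name, text)
        self.links = defaultdict(set)   # node id -> linked node ids

    def add_record(self, node_id, database, text):
        self.records[node_id] = (database, text)

    def add_link(self, a, b):
        # Links are treated as undirected relationships between records.
        self.links[a].add(b)
        self.links[b].add(a)

    def search(self, keyword):
        """Initial node set: records whose text contains the keyword."""
        kw = keyword.lower()
        return {n for n, (_, text) in self.records.items() if kw in text.lower()}

    def expand(self, nodes, cross_database_only=False):
        """Follow links one step outward from the current node set."""
        found = set()
        for n in nodes:
            for m in self.links[n]:
                if m in nodes:
                    continue
                if cross_database_only and self.records[m][0] == self.records[n][0]:
                    continue
                found.add(m)
        return found

# Made-up records from three made-up databases.
g = RecordGraph()
g.add_record("g1", "genes", "BRCA1 tumor suppressor")
g.add_record("g2", "genes", "TP53 tumor suppressor")
g.add_record("p1", "papers", "BRCA1 mutations in breast cancer")
g.add_record("d1", "diseases", "breast cancer")
g.add_link("g1", "p1")
g.add_link("p1", "d1")
g.add_link("g1", "g2")

hits = g.search("brca1")
print(sorted(hits))
print(sorted(g.expand(hits)))
print(sorted(g.expand(hits, cross_database_only=True)))
```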

  5. Protein-Protein Interaction Databases

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Jensen, Lars Juhl

    2015-01-01

    of research are explored. Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather, combine, and predict interactions. We also point out the trade-off between comprehensiveness and accuracy and the main pitfall scientists have to be aware...

  6. ProXL (Protein Cross-Linking Database): A Platform for Analysis, Visualization, and Sharing of Protein Cross-Linking Mass Spectrometry Data.

    Science.gov (United States)

    Riffle, Michael; Jaschob, Daniel; Zelter, Alex; Davis, Trisha N

    2016-08-05

    ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app .

  7. Database of Interacting Proteins (DIP)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent...

  8. Protein: CAD [Trypanosomes Database]

    Lifescience Database Archive (English)

    CAD carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase CAD... trifunctional protein carbamoylphosphate synthetase 2/aspartate transcarbamylase/dihydroorotase multifunctional protein CAD... H.sapiens 47458828 18105007 790 P27708 CAD_(gene) 2.1.3.2|3.5.2.3|6.3.5.5 114010 2p22-p21 hsa00250|hsa00240 ...

  9. Visualizing Data and the Online FRED Database

    Science.gov (United States)

    Méndez-Carbajo, Diego

    2015-01-01

    The author discusses a pedagogical strategy based on data visualization and analysis in the teaching of intermediate macroeconomics and financial economics. In these short projects, students collect and manipulate economic data from the online Federal Reserve Economic Database (FRED) in order to illustrate theoretical relationships discussed in…

  10. Practical Database Programming With Visual C#.NET

    CERN Document Server

    Bai, Ying

    2010-01-01

    A novel approach to developing and applying databases with Visual C#.NET. Practical Database Programming with Visual C#.NET clearly explains the considerations and applications in database programming with Visual C#.NET 2008 and in developing relational databases such as Microsoft Access, SQL Server, and Oracle Database. Sidestepping the traditional approach of using large blocks of code, Ying Bai utilizes both Design Tools and Wizards provided by Visual Studio.NET and real-time object methods to incorporate over sixty real sample database programming projects along with detailed illustrations

  11. NPIDB: nucleic acid-protein interaction database

    OpenAIRE

    Kirsanov, Dmitry D.; Zanegina, Olga N.; Aksianov, Evgeniy A.; Spirin, Sergei A.; Karyagina, Anna S.; Alexeevski, Andrei V

    2012-01-01

    The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade ...

  12. PSIbase: a database of Protein Structural Interactome map (PSIMAP).

    Science.gov (United States)

    Gong, Sungsam; Yoon, Giseok; Jang, Insoo; Bolser, Dan; Dafas, Panos; Schroeder, Michael; Choi, Hansol; Cho, Yoobok; Han, Kyungsook; Lee, Sunghoon; Choi, Hwanho; Lappe, Michael; Holm, Liisa; Kim, Sangsoo; Oh, Donghoon; Bhak, Jonghwa

    2005-05-15

    Protein Structural Interactome map (PSIMAP) is a global interaction map that describes domain-domain and protein-protein interaction information for known Protein Data Bank structures. It calculates the Euclidean distance to determine interactions between possible pairs of structural domains in proteins. PSIbase is a database and file server for protein structural interaction information calculated by the PSIMAP algorithm. PSIbase also provides an easy-to-use protein domain assignment module, interaction navigation and visual tools. Users can retrieve possible interaction partners of their proteins of interest if a significant homology assignment is made with their query sequences. http://psimap.org and http://psibase.kaist.ac.kr/
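
    A distance-based interaction test of the kind this abstract describes could look roughly like the sketch below. The abstract does not state the distance cutoff or how many close atom pairs are required, so the defaults here (`cutoff=5.0`, `min_contacts=5`) are illustrative assumptions, as are the function names and the toy coordinates.

```python
import math

def count_contacts(domain_a, domain_b, cutoff):
    """Count atom pairs (one atom from each domain) within `cutoff`."""
    contacts = 0
    for a in domain_a:
        for b in domain_b:
            if math.dist(a, b) <= cutoff:  # Euclidean distance
                contacts += 1
    return contacts

def domains_interact(domain_a, domain_b, cutoff=5.0, min_contacts=5):
    """Assumed rule: two domains interact if enough atom pairs are close.

    Both thresholds are placeholders, not values taken from the abstract.
    """
    return count_contacts(domain_a, domain_b, cutoff) >= min_contacts

# Toy coordinates (in angstroms) for two made-up domains.
domain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
domain_b = [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0), (50.0, 0.0, 0.0)]

print(count_contacts(domain_a, domain_b, cutoff=5.0))
print(domains_interact(domain_a, domain_b))
```

    The double loop is quadratic in the number of atoms; a production tool working on whole PDB entries would more likely use a spatial index such as a grid or k-d tree.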

  13. Visualization of database structures for information retrieval

    Directory of Open Access Journals (Sweden)

    Grete Lisbjerg Jensen

    1994-12-01

    This paper describes the Book House system, which is designed to support children's information retrieval in libraries as part of their education. It is a shareware program available on CD-ROM or floppy disks, and comprises functionality for database searching as well as for classifying and storing book information in the database. The system concept is based on an understanding of children's domain structures and their capabilities for categorization of information needs in connection with their activities in schools, in school libraries or in public libraries. These structures are visualized in the interface by using metaphors and multimedia technology. Through the use of text, images and animation, the Book House encourages children, even at a very early age, to learn by doing in an enjoyable way that builds on their previous experience with computer games. Both words and pictures can be used for searching; this makes the system suitable for all age groups. Even children who have not yet learned to read properly can, by selecting pictures, search for and find the books they would like to have read aloud. Thus, at the very beginning of their school life, they can learn to search for books on their own. For the library community, such a system will provide an extended service which will increase the number of children's own searches and also improve the relevance, quality and utilization of the book collections in the libraries. A market research report on the need for an annual indexing service for books in the Book House format is in preparation by the Danish Library Centre A/S.

  14. Improving decoy databases for protein folding algorithms

    KAUST Repository

    Lindsey, Aaron

    2014-01-01

    Copyright © 2014 ACM. Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 17 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on a popular modern scoring function and show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.

  15. Update History of This Database - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)


  16. HCVpro: Hepatitis C virus protein interaction database

    KAUST Repository

    Kwofie, Samuel K.

    2011-12-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers. In furtherance of these goals, we have developed the hepatitis C virus protein interaction database (HCVpro) by integrating manually verified hepatitis C virus-virus and virus-human protein interactions curated from literature and databases. HCVpro is a comprehensive and integrated HCV-specific knowledgebase housing consolidated information on PPIs, functional genomics and molecular data obtained from a variety of virus databases (VirHostNet, VirusMint, HCVdb and euHCVdb), and from BIND and other relevant biology repositories. HCVpro is further populated with information on hepatocellular carcinoma (HCC) related genes that are mapped onto their encoded cellular proteins. Incorporated proteins have been mapped onto Gene Ontologies, canonical pathways, Online Mendelian Inheritance in Man (OMIM) and extensively cross-referenced to other essential annotations. The database is enriched with exhaustive reviews on structure and functions of HCV proteins, current state of drug and vaccine development and links to recommended journal articles. Users can query the database using specific protein identifiers (IDs), chromosomal locations of a gene, interaction detection methods, indexed PubMed sources as well as HCVpro, BIND and VirusMint IDs. The use of HCVpro is free and the resource can be accessed via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. © 2011 Elsevier B.V.

  17. Proteomics: Protein Identification Using Online Databases

    Science.gov (United States)

    Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

    2012-01-01

    Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…

  18. Manually curated database of rice proteins.

    Science.gov (United States)

    Gour, Pratibha; Garg, Priyanka; Jain, Rashmi; Joseph, Shaji V; Tyagi, Akhilesh K; Raghuvanshi, Saurabh

    2014-01-01

    'Manually Curated Database of Rice Proteins' (MCDRP), available at http://www.genomeindia.org/biocuration, is a unique curated database based on published experimental data. Semantic integration of scientific data is essential to gain a higher level of understanding of biological systems. Since the majority of scientific data is available as published literature, text mining is an essential step before the data can be integrated and made available for computer-based search in various databases. However, text mining is a tedious exercise, and thus there is a large gap between the data available in curated databases and in the published literature. Moreover, data in an experiment can be perceived from several perspectives, which may not be reflected in text-based curation. In order to address such issues, we have demonstrated the feasibility of digitizing the experimental data itself by creating a database on rice proteins based on in-house developed data curation models. Using these models, data from individual experiments have been digitized with the help of universal ontologies. Currently, the database has data for over 1800 rice proteins curated from >4000 different experiments of over 400 research articles. Since every aspect of the experiment, such as gene name, plant type, tissue and developmental stage, has been digitized, experimental data can be rapidly accessed and integrated.

  19. HCVpro: hepatitis C virus protein interaction database.

    Science.gov (United States)

    Kwofie, Samuel K; Schaefer, Ulf; Sundararajan, Vijayaraghava S; Bajic, Vladimir B; Christoffels, Alan

    2011-12-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers. In furtherance of these goals, we have developed the hepatitis C virus protein interaction database (HCVpro) by integrating manually verified hepatitis C virus-virus and virus-human protein interactions curated from literature and databases. HCVpro is a comprehensive and integrated HCV-specific knowledgebase housing consolidated information on PPIs, functional genomics and molecular data obtained from a variety of virus databases (VirHostNet, VirusMint, HCVdb and euHCVdb), and from BIND and other relevant biology repositories. HCVpro is further populated with information on hepatocellular carcinoma (HCC) related genes that are mapped onto their encoded cellular proteins. Incorporated proteins have been mapped onto Gene Ontologies, canonical pathways, Online Mendelian Inheritance in Man (OMIM) and extensively cross-referenced to other essential annotations. The database is enriched with exhaustive reviews on structure and functions of HCV proteins, current state of drug and vaccine development and links to recommended journal articles. Users can query the database using specific protein identifiers (IDs), chromosomal locations of a gene, interaction detection methods, indexed PubMed sources as well as HCVpro, BIND and VirusMint IDs. The use of HCVpro is free and the resource can be accessed via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. Copyright © 2011 Elsevier B.V. All rights reserved.

  20. MOPED: Model Organism Protein Expression Database

    OpenAIRE

    Kolker, Eugene; Higdon, Roger; Haynes, Winston; Welch, Dean; Broomall, William; Lancet, Doron; Stanberry, Larissa; Kolker, Natali

    2011-01-01

    Large numbers of mass spectrometry proteomics studies are being conducted to understand all types of biological processes. The size and complexity of proteomics data hinder efforts to easily share, integrate, query and compare the studies. The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is a new and expanding proteomics resource that enables rapid browsing of protein expression information from publicly available studies on humans and model organisms. M...

  1. Role for protein-protein interaction databases in human genetics.

    Science.gov (United States)

    Pattin, Kristine A; Moore, Jason H

    2009-12-01

    Proteomics and the study of protein-protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein-protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein-protein interactions in human genetics and genetic epidemiology. Since protein-protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies.

  2. Visualizing Concurrency Control Algorithms for Real-Time Database Systems

    Directory of Open Access Journals (Sweden)

    Olusegun Folorunso

    2008-11-01

    This paper describes an approach to visualizing concurrency control (CC) algorithms for real-time database systems (RTDBs). This approach is based on the principle of software visualization, which has been applied in related fields. The Model-View-Controller (MVC) architecture is used to alleviate the black box syndrome associated with the study of the behaviour of CC algorithms for RTDBs. We propose a visualization "exploratory" tool that assists the RTDBs designer in understanding the actual behaviour of the concurrency control algorithms of choice and also in evaluating the performance quality of the algorithm. We demonstrate the feasibility of our approach using an optimistic concurrency control model as our case study. The developed tool substantiates the earlier simulation-based performance studies by exposing spikes at some points when visualized dynamically that are not observed using usual static graphs. Eventually this tool helps solve the problem of contradictory assumptions of CC in RTDBs.
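
    For readers unfamiliar with the optimistic concurrency control model used as the case study, the general idea (read freely, validate at commit, abort on conflict) can be sketched as follows. The abstract does not say which optimistic variant the authors visualize, so this single-threaded, version-checking sketch is an assumption; it ignores deadlines and priorities, and the class names are hypothetical.

```python
class OptimisticStore:
    """Hypothetical key-value store with per-key version counters."""

    def __init__(self):
        self.data = {}
        self.versions = {}   # key -> number of committed writes to that key

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}   # key -> version seen at first read
        self.writes = {}          # buffered writes, invisible until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.versions.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation phase: abort if any key we read has since been rewritten.
        for key, seen in self.read_versions.items():
            if self.store.versions.get(key, 0) != seen:
                return False
        # Write phase: publish buffered writes and bump their versions.
        # A real system would make validation plus write atomic.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1
        return True


store = OptimisticStore()
setup = store.begin()
setup.write("x", 0)
setup.commit()

t1, t2 = store.begin(), store.begin()
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 2)
print(t1.commit())   # first committer passes validation
print(t2.commit())   # second committer read a stale version
print(store.data["x"])
```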

  3. NPIDB: Nucleic acid-Protein Interaction DataBase

    National Research Council Canada - National Science Library

    Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

    2013-01-01

    The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank...

  4. d-matrix – database exploration, visualization and analysis

    Directory of Open Access Journals (Sweden)

    Sperling Hans-Peter

    2004-10-01

    Background: Motivated by a biomedical database set up by our group, we aimed to develop a generic database front-end with embedded knowledge discovery and analysis features. A major focus was the human-oriented representation of the data and the enabling of a closed circle of data query, exploration, visualization and analysis. Results: We introduce a non-task-specific database front-end with a new visualization strategy and built-in analysis features, called d-matrix. d-matrix is web-based and compatible with a broad range of database management systems. The graphical outcome consists of boxes whose colors show the quality of the underlying information and, as the name suggests, they are arranged in matrices. The granularity of the data display allows consequent drill-down. Furthermore, d-matrix offers context-sensitive categorization, hierarchical sorting and statistical analysis. Conclusions: d-matrix enables data mining, with a high level of interactivity between humans and computer as a primary factor. We believe that the presented strategy can be very effective in general and especially useful for the integration of distinct data types such as phenotypical and molecular data.

  5. A protein domain interaction interface database: InterPare.

    Science.gov (United States)

    Gong, Sungsam; Park, Changbum; Choi, Hansol; Ko, Junsu; Jang, Insoo; Lee, Jungsul; Bolser, Dan M; Oh, Donghoon; Kim, Deok-Soo; Bhak, Jong

    2005-08-25

    Most proteins function by interacting with other molecules. Their interaction interfaces are highly conserved throughout evolution to avoid undesirable interactions that lead to fatal disorders in cells. Rational drug discovery includes computational methods to identify the interaction sites of lead compounds to the target molecules. Identifying and classifying protein interaction interfaces on a large scale can help researchers discover drug targets more efficiently. We introduce a large-scale protein domain interaction interface database called InterPare http://interpare.net. It contains both inter-chain (between chains) interfaces and intra-chain (within chain) interfaces. InterPare uses three methods to detect interfaces: 1) the geometric distance method for checking the distance between atoms that belong to different domains, 2) Accessible Surface Area (ASA), a method for detecting the buried region of a protein that is detached from a solvent when forming multimers or complexes, and 3) the Voronoi diagram, a computational geometry method that uses a mathematical definition of interface regions. InterPare includes visualization tools to display protein interior, surface, and interaction interfaces. It also provides statistics such as the amino acid propensities of queried protein according to its interior, surface, and interface region. The atom coordinates that belong to interface, surface, and interior regions can be downloaded from the website. InterPare is an open and public database server for protein interaction interface information. It contains the large-scale interface data for proteins whose 3D-structures are known. As of November 2004, there were 10,583 (Geometric distance), 10,431 (ASA), and 11,010 (Voronoi diagram) entries in the Protein Data Bank (PDB) containing interfaces, according to the above three methods. In the case of the geometric distance method, there are 31,620 inter-chain domain-domain interaction interfaces and 12,758 intra
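
    Of the three detection methods listed in this abstract, the accessible-surface-area idea can be illustrated with a small numerical sketch: the area buried at an interface is the ASA of each partner alone minus the ASA of the complex. The sketch uses the general Shrake-Rupley point-sampling approach, which is an assumption here since the abstract does not say how InterPare computes ASA. The 1.4 angstrom probe is the conventional water radius; the atom radii, coordinates, and function names are made up.

```python
import math

def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def accessible_area(atoms, probe=1.4, n_points=200):
    """Total solvent-accessible area for atoms given as (x, y, z, radius)."""
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            p = (xi + big_r * ux, yi + big_r * uy, zi + big_r * uz)
            # A test point is exposed if it lies outside every other
            # atom's probe-expanded sphere.
            if all(math.dist(p, (xj, yj, zj)) >= rj + probe
                   for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i):
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total

def buried_area(chain_a, chain_b, probe=1.4, n_points=200):
    """Area lost to solvent when the two chains form a complex."""
    separate = (accessible_area(chain_a, probe, n_points)
                + accessible_area(chain_b, probe, n_points))
    together = accessible_area(chain_a + chain_b, probe, n_points)
    return separate - together

# Two made-up single-atom "chains", 3 angstroms apart.
chain_a = [(0.0, 0.0, 0.0, 1.6)]
chain_b = [(3.0, 0.0, 0.0, 1.6)]
print(round(buried_area(chain_a, chain_b), 1))
```

    A nonzero buried area marks the pair as sharing an interface; atoms whose exposed point count drops on complex formation are the interface atoms.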

  6. A protein domain interaction interface database: InterPare

    Directory of Open Access Journals (Sweden)

    Lee Jungsul

    2005-08-01

    Background: Most proteins function by interacting with other molecules. Their interaction interfaces are highly conserved throughout evolution to avoid undesirable interactions that lead to fatal disorders in cells. Rational drug discovery includes computational methods to identify the interaction sites of lead compounds to the target molecules. Identifying and classifying protein interaction interfaces on a large scale can help researchers discover drug targets more efficiently. Description: We introduce a large-scale protein domain interaction interface database called InterPare http://interpare.net. It contains both inter-chain (between chains) interfaces and intra-chain (within chain) interfaces. InterPare uses three methods to detect interfaces: 1) the geometric distance method for checking the distance between atoms that belong to different domains, 2) Accessible Surface Area (ASA), a method for detecting the buried region of a protein that is detached from a solvent when forming multimers or complexes, and 3) the Voronoi diagram, a computational geometry method that uses a mathematical definition of interface regions. InterPare includes visualization tools to display protein interior, surface, and interaction interfaces. It also provides statistics such as the amino acid propensities of queried protein according to its interior, surface, and interface region. The atom coordinates that belong to interface, surface, and interior regions can be downloaded from the website. Conclusion: InterPare is an open and public database server for protein interaction interface information. It contains the large-scale interface data for proteins whose 3D-structures are known. As of November 2004, there were 10,583 (Geometric distance), 10,431 (ASA), and 11,010 (Voronoi diagram) entries in the Protein Data Bank (PDB) containing interfaces, according to the above three methods. In the case of the geometric distance method, there are 31,620 inter-chain domain

  7. 3D visualization of molecular structures in the MOGADOC database

    Science.gov (United States)

    Vogt, Natalja; Popov, Evgeny; Rudert, Rainer; Kramer, Rüdiger; Vogt, Jürgen

    2010-08-01

    The MOGADOC database (Molecular Gas-Phase Documentation) is a powerful tool to retrieve information about compounds which have been studied in the gas-phase by electron diffraction, microwave spectroscopy and molecular radio astronomy. Presently the database contains over 34,500 bibliographic references (from the beginning of each method) for about 10,000 inorganic, organic and organometallic compounds and structural data (bond lengths, bond angles, dihedral angles, etc.) for about 7800 compounds. Most of the implemented molecular structures are given in a three-dimensional (3D) presentation. To create or edit and visualize the 3D images of molecules, new tools (special editor and Java-based 3D applet) were developed. Molecular structures in internal coordinates were converted to those in Cartesian coordinates.
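
    The conversion from internal coordinates (bond lengths, bond angles, dihedral angles) to Cartesian coordinates mentioned at the end of this abstract can be sketched with the standard vector construction below. This is a generic illustration, not MOGADOC's code: the row layout of the input, the function names, and the four-atom geometry are all made up, with values chosen so the result is easy to check by hand.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _unit(u):
    length = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    return (u[0] / length, u[1] / length, u[2] / length)

def place_atom(a, b, c, bond, angle_deg, dihedral_deg):
    """Place atom d with |d-c| = bond, angle(b,c,d) and dihedral(a,b,c,d)."""
    theta = math.radians(angle_deg)
    phi = math.radians(dihedral_deg)
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))   # normal to the a-b-c plane
    m = _cross(n, bc)                   # completes the local frame
    dx = -bond * math.cos(theta)
    dy = bond * math.sin(theta) * math.cos(phi)
    dz = bond * math.sin(theta) * math.sin(phi)
    return tuple(c[i] + dx * bc[i] + dy * m[i] + dz * n[i] for i in range(3))

def zmatrix_to_cartesian(rows):
    """rows[k] is (), (ref1, bond), (ref1, bond, ref2, angle) or
    (ref1, bond, ref2, angle, ref3, dihedral), with 0-based references."""
    coords = []
    for k, row in enumerate(rows):
        if k == 0:
            coords.append((0.0, 0.0, 0.0))
        elif k == 1:
            coords.append((row[1], 0.0, 0.0))
        elif k == 2:
            ref1, bond, ref2, angle = row
            b, c = coords[ref2], coords[ref1]
            dummy = (b[0], b[1] + 1.0, b[2])   # fixes the third atom in the xy-plane
            coords.append(place_atom(dummy, b, c, bond, angle, 0.0))
        else:
            ref1, bond, ref2, angle, ref3, dihedral = row
            coords.append(place_atom(coords[ref3], coords[ref2], coords[ref1],
                                     bond, angle, dihedral))
    return coords

# Made-up four-atom chain: unit bonds, right angles, 90-degree dihedral.
rows = [(), (0, 1.0), (1, 1.0, 0, 90.0), (2, 1.0, 1, 90.0, 0, 90.0)]
for x, y, z in zmatrix_to_cartesian(rows):
    print(f"{x:.3f} {y:.3f} {z:.3f}")
```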

  8. Visualizing uncertainty for geographical information in the global terrorism database

    Science.gov (United States)

    Jones, Josh; Chang, Remco; Butkiewicz, Thomas; Ribarsky, William

    2008-04-01

    Presenting information on a geopolitical map can offer powerful insight into a problem by leveraging an individual's innate capacity to discover patterns and to use map-related cues to incorporate pre-existing knowledge. This mode of presentation is not without its flaws, however, as the act of placing information at specific coordinates can imply a false sense of the data's geo-spatial certainty. Traditional uncertainty visualization techniques, such as those that change primitive attributes or employ animation, can create large amounts of clutter or actively distract when visualizing geospatially uncertain events within large datasets. To effectively identify geo-spatial trends within the Global Terrorism Database of the START Center, we have developed a novel usage of squarified treemaps that maintains the strengths of traditional map-viewing but incorporates some measure of data verity.

  9. A Brief Review of RNA-Protein Interaction Database Resources

    Directory of Open Access Journals (Sweden)

    Ying Yi

    2017-01-01

    Full Text Available RNA-protein interactions play critical roles in various biological processes. By collecting and analyzing the RNA-protein interactions and binding sites from experiments and predictions, RNA-protein interaction databases have become an essential resource for the exploration of the transcriptional and post-transcriptional regulatory network. Here, we briefly review several widely used RNA-protein interaction database resources developed in recent years to provide a guide to these databases. The content and major functions of the databases are presented. The brief description of each database helps users quickly choose the database containing the information they are interested in. In short, these RNA-protein interaction database resources are continually updated, but the current state shows the efforts to identify and analyze the large amount of RNA-protein interactions.

  10. A Visual Language for Protein Design

    KAUST Repository

    Cox, Robert Sidney

    2017-02-08

    As protein engineering becomes more sophisticated, practitioners increasingly need to share diagrams for communicating protein designs. To this end, we present a draft visual language, Protein Language, that describes the high-level architecture of an engineered protein with easy-to-draw glyphs, intended to be compatible with other biological diagram languages such as SBOL Visual and SBGN. Protein Language consists of glyphs for representing important features (e.g., globular domains, recognition and localization sequences, sites of covalent modification, cleavage and catalysis), rules for composing these glyphs to represent complex architectures, and rules constraining the scaling and styling of diagrams. To support Protein Language we have implemented an extensible web-based software diagram tool, Protein Designer, that uses Protein Language in a

  11. Protein - Trypanosomes Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available e NCBI Protein in the Simple search site) entrez Gene ID (NCBI Gene) uniprot UniProt Accession number wiki...pedia Search term in Wikipedia ec Enzyme Commission number omim OMIM ID ( Online Me

  12. An object-oriented database for protein structure analysis.

    Science.gov (United States)

    Gray, P M; Paton, N W; Kemp, G J; Fothergill, J E

    1990-03-01

    An object-oriented database system has been developed which is being used to store protein structure data. The database can be queried using the logic programming language Prolog or the query language Daplex. Queries retrieve information by navigating through a network of objects which represent the primary, secondary and tertiary structures of proteins. Routines written in both Prolog and Daplex can integrate complex calculations with the retrieval of data from the database, and can also be stored in the database for sharing among users. Thus object-oriented databases are better suited to prototyping applications and answering complex queries about protein structure than relational databases. This system has been used to find loops of varying length and anchor positions when modelling homologous protein structures.

  13. Determining and visualizing flexibility in protein structures.

    Science.gov (United States)

    Scott, Walter R P; Straus, Suzana K

    2015-05-01

    How to compare the structures of an ensemble of protein conformations is a fundamental problem in structural biology. As has been previously observed, the widely used RMSD measure due to Kabsch, in which a rigid-body superposition minimizing the least-squares positional deviations is performed, has its drawbacks when comparing and visualizing a set of flexible protein structures. Here, we develop a method, fleximatch, of protein structure comparison that takes flexibility into account. Based on a distance matrix measure of flexibility, a weighted superposition of distance matrices rather than of atomic coordinates is performed. Subsequently, this allows a consistent determination of (a) a superposition of structures for visualization, (b) a partitioning of the protein structure into rigid molecular components (core atoms), and (c) an atomic mobility measure. The method is suitable for highlighting both particularly flexible and rigid parts of a protein from structures derived from NMR, X-ray diffraction or molecular simulation. © 2015 Wiley Periodicals, Inc.
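
    The idea of measuring flexibility from distance matrices rather than from superposed coordinates can be sketched as follows. This is a simplified illustration and not the fleximatch algorithm: it assumes that the spread of each interatomic distance across the ensemble is a usable flexibility signal, and the names distance_matrix and atomic_mobility are hypothetical.

```python
import math
from statistics import pstdev

def distance_matrix(coords):
    """All pairwise distances for one conformation (list of (x, y, z) tuples)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def atomic_mobility(ensemble):
    """Per-atom mobility: mean standard deviation of its distances to other atoms.

    ensemble is a list of conformations with the same atoms in the same order.
    Distances are invariant under rotation and translation, so no
    superposition is needed before comparing conformations.
    """
    matrices = [distance_matrix(conf) for conf in ensemble]
    n = len(ensemble[0])
    mobility = []
    for i in range(n):
        spreads = [pstdev([m[i][j] for m in matrices])
                   for j in range(n) if j != i]
        mobility.append(sum(spreads) / len(spreads))
    return mobility
```

    Atoms with low values behave as part of a rigid core in this toy measure, while high values flag flexible regions.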

  14. MIPS: a database for genomes and protein sequences.

    Science.gov (United States)

    Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

  15. NPIDB: Nucleic acid-Protein Interaction DataBase.

    Science.gov (United States)

    Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

    2013-01-01

    The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.

  16. Visualization of protein interaction networks: problems and solutions.

    Science.gov (United States)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2013-01-01

    Visualization concerns the representation of data visually and is an important task in scientific research. Protein-protein interactions (PPI) are discovered using either wet lab techniques, such as mass spectrometry, or in silico prediction tools, resulting in large collections of interactions stored in specialized databases. The set of all interactions of an organism forms a protein-protein interaction network (PIN) and is an important tool for studying the behaviour of the cell machinery. Since graphic representation of PINs may highlight important substructures, e.g. protein complexes, visualization is more and more used to study the underlying graph structure of PINs. Although graphs are well known data structures, there are different open problems regarding PIN visualization: the high number of nodes and connections, the heterogeneity of nodes (proteins) and edges (interactions), the possibility to annotate proteins and interactions with biological information extracted by ontologies (e.g. Gene Ontology) that enriches the PINs with semantic information, but complicates their visualization. In recent years many software tools for the visualization of PINs have been developed. Initially designed for visualization only, some of them have subsequently been enriched with new functions for PPI data management and PIN analysis. The paper analyzes the main software tools for PIN visualization considering four main criteria: (i) technology, i.e. availability/license of the software and supported OS (Operating System) platforms; (ii) interoperability, i.e. ability to import/export networks in various formats, ability to export data in a graphic format, extensibility of the system, e.g. through plug-ins; (iii) visualization, i.e. supported layout and rendering algorithms and availability of parallel implementation; (iv) analysis, i.e. availability of network analysis functions, such as clustering or mining of the graph, and the possibility to interact with external

  17. NeXO Web: the NeXO ontology database and visualization platform.

    Science.gov (United States)

    Dutkowski, Janusz; Ono, Keiichiro; Kramer, Michael; Yu, Michael; Pratt, Dexter; Demchak, Barry; Ideker, Trey

    2014-01-01

    The Network-extracted Ontology (NeXO) is a gene ontology inferred directly from large-scale molecular networks. While most ontologies are constructed through manual expert curation, NeXO uses a principled computational approach which integrates evidence from hundreds of thousands of individual gene and protein interactions to construct a global hierarchy of cellular components and processes. Here, we describe the development of the NeXO Web platform (http://www.nexontology.org)-an online database and graphical user interface for visualizing, browsing and performing term enrichment analysis using NeXO and the gene ontology. The platform applies state-of-the-art web technology and visualization techniques to provide an intuitive framework for investigating biological machinery captured by both data-driven and manually curated ontologies.

  18. Prolinks: a database of protein functional linkages derived from coevolution

    Science.gov (United States)

    Bowers, Peter M; Pellegrini, Matteo; Thompson, Mike J; Fierro, Joe; Yeates, Todd O; Eisenberg, David

    2004-01-01

    The advent of whole-genome sequencing has led to methods that infer protein function and linkages. We have combined four such algorithms (phylogenetic profile, Rosetta Stone, gene neighbor and gene cluster) in a single database - Prolinks - that spans 83 organisms and includes 10 million high-confidence links. The Proteome Navigator tool allows users to browse predicted linkage networks interactively, providing accompanying annotation from public databases. The Prolinks database and the Proteome Navigator tool are available for use online at . PMID:15128449
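
    Of the four algorithms named above, the phylogenetic profile method is the easiest to illustrate. The sketch below is not Prolinks code: it assumes each protein is described by a presence/absence vector over a fixed list of genomes and scores pairs with a Jaccard coefficient, which is only one of several possible similarity measures. The names jaccard and linked_pairs and the 0.8 threshold are hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

def linked_pairs(profiles, threshold=0.8):
    """Return (protein, protein, score) for pairs whose profiles co-occur.

    profiles maps a protein name to its presence/absence vector; all vectors
    must list the genomes in the same order.
    """
    names = sorted(profiles)
    links = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(profiles[first], profiles[second])
            if score >= threshold:
                links.append((first, second, score))
    return links
```

    Proteins that are present and absent in the same genomes receive a high score and are predicted to be functionally linked.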

  19. Inferring protein function from homology using the Princeton Protein Orthology Database (P-POD)

    Science.gov (United States)

    Livstone, Michael S.; Oughtred, Rose; Heinicke, Sven; Vernot, Benjamin; Huttenhower, Curtis; Durand, Dannie; Dolinski, Kara

    2011-01-01

    Inferring a protein’s function by homology is a powerful tool for biologists. The Princeton Protein Orthology Database (P-POD) offers a simple way to visualize and analyze the relationships between homologous proteins in order to infer function. P-POD contains computationally-generated analysis distinguishing orthologs from paralogs combined with curated published information on functional complementation and on human diseases. P-POD also features an applet, Notung, for users to explore and modify phylogenetic trees and generate their own ortholog/paralog calls. This unit describes how to search P-POD for precomputed data, how to find and use the associated curated information from the literature, and how to use Notung to analyze and refine the results. PMID:21400696

  20. Yeast Interacting Proteins Database: YJL199C, YJL199C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available d in closely related Saccharomyces species; protein detected in large-scale protein-protein interaction studies...cies; protein detected in large-scale protein-protein interaction studies Rows with this prey as prey (4) Ro...n; not conserved in closely related Saccharomyces species; protein detected in large-scale protein-protein interaction studies... species; protein detected in large-scale protein-protein interaction studies Rows with this prey as prey Ro

  1. EcoProDB: the Escherichia coli protein database.

    Science.gov (United States)

    Yun, Hongseok; Lee, Jeong Wook; Jeong, Joonwoo; Chung, Jaesung; Park, Jong Myoung; Myoung, Han Na; Lee, Sang Yup

    2007-09-15

    EcoProDB is a web-based database for comparative proteomics of Escherichia coli. The database contains information on E. coli proteins identified on 2D gels along with other resources collected from various databases and published literature, with a special feature of showing the expression levels of E. coli proteins under different genetic and environmental conditions. It also provides comparative information of subcellular localization, theoretical 2D map, experimental 2D map and integrated protein information via an interactive web interface and application such as the Map Browser. Users can also upload their own 2D gels, extract core information associated with the proteins and 2D gel results from different experiments and consequently generate new knowledge and hypotheses for further studies. EcoProDB database system is accessible at http://eecoli.kaist.ac.kr.

  2. Download - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Yeast Interacting Proteins Database Download First of all, please read the license of this d...atabase. Data names and data descriptions are about the downloadable data in this page. They might not corre... Data name File Simple search and download 1 README README_e.html - 2 Core Data o...f Yeast Interacting Proteins Database (Annotation Updated Version) core_updated.zip (77KB) Simple search and dow

  3. Computer systems and methods for the query and visualization of multidimensional databases

    Science.gov (United States)

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2017-04-25

    A method of generating a data visualization is performed at a computer having a display, one or more processors, and memory. The memory stores one or more programs for execution by the one or more processors. The process receives user specification of a plurality of characteristics of a data visualization. The data visualization is based on data from a multidimensional database. The characteristics specify at least x-position and y-position of data marks corresponding to tuples of data retrieved from the database. The process generates a data visualization according to the specified plurality of characteristics. The data visualization has an x-axis defined based on data for one or more first fields from the database that specify x-position of the data marks and the data visualization has a y-axis defined based on data for one or more second fields from the database that specify y-position of the data marks.
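
    The mapping described above, from database tuples to data marks positioned by chosen fields, can be sketched in a few lines. This is an illustrative toy and not the patented method: rows are plain dictionaries, and build_visualization and its output layout are hypothetical.

```python
def build_visualization(rows, x_field, y_field):
    """Map tuples (dict rows) to data marks positioned by two chosen fields.

    Each axis is defined by the field that supplies its positions; the
    domain is simply the min/max of that field over the retrieved rows.
    """
    marks = [(row[x_field], row[y_field]) for row in rows]
    xs = [x for x, _ in marks]
    ys = [y for _, y in marks]
    return {
        "x_axis": {"field": x_field, "domain": (min(xs), max(xs))},
        "y_axis": {"field": y_field, "domain": (min(ys), max(ys))},
        "marks": marks,
    }
```

    A renderer would then draw one mark per tuple at the computed positions.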

  4. Columba: an integrated database of proteins, structures, and annotations

    Directory of Open Access Journals (Sweden)

    Preissner Robert

    2005-03-01

    Full Text Available Abstract Background Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. Description COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. Conclusion The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.

  5. Protein - TP Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ...p_atlas_protein.zip File URL: ftp://ftp.biosciencedbc.jp/archive/tp_atlas/LATEST/...

  6. ARAMEMNON, a novel database for Arabidopsis integral membrane proteins

    DEFF Research Database (Denmark)

    Schwacke, Rainer; Schneider, Anja; van der Graaff, Eric

    2003-01-01

    A specialized database (DB) for Arabidopsis membrane proteins, ARAMEMNON, was designed that facilitates the interpretation of gene and protein sequence data by integrating features that are presently only available from individual sources. Using several publicly available prediction programs, put...... is accessible at the URL http://aramemnon.botanik.uni-koeln.de....

  7. cuticleDB: a relational database of Arthropod cuticular proteins

    Directory of Open Access Journals (Sweden)

    Willis Judith H

    2004-09-01

    Full Text Available Abstract Background The insect exoskeleton or cuticle is a bi-partite composite of proteins and chitin that provides protective, skeletal and structural functions. Little information is available about the molecular structure of this important complex that exhibits a helicoidal architecture. Scores of sequences of cuticular proteins have been obtained from direct protein sequencing, from cDNAs, and from genomic analyses. Most of these cuticular protein sequences contain motifs found only in arthropod proteins. Description cuticleDB is a relational database containing all structural proteins of Arthropod cuticle identified to date. Many come from direct sequencing of proteins isolated from cuticle and from sequences from cDNAs that share common features with these authentic cuticular proteins. It also includes proteins from the Drosophila melanogaster and the Anopheles gambiae genomes, that have been predicted to be cuticular proteins, based on a Pfam motif (PF00379) responsible for chitin binding in Arthropod cuticle. The total number of the database entries is 445: 370 derive from insects, 60 from Crustacea and 15 from Chelicerata. The database can be accessed from our web server at http://bioinformatics.biol.uoa.gr/cuticleDB. Conclusions CuticleDB was primarily designed to contain correct and full annotation of cuticular protein data. The database will be of help to future genome annotators. Users will be able to test hypotheses for the existence of known and also of yet unknown motifs in cuticular proteins. An analysis of motifs may contribute to understanding how proteins contribute to the physical properties of cuticle as well as to the precise nature of their interaction with chitin.

  8. Yeast Interacting Proteins Database: YNL189W, YOR284W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ait as prey (0) YOR284W HUA2 Cytoplasmic protein of unknown function; computational...protein of unknown function; computational analysis of large-scale protein-protein interaction data suggests

  9. Yeast Interacting Proteins Database: YDR446W, YDR510W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDR446W ECM11 Non-essential protein apparently involved in meiosis, GFP fusion protein is present in discret...description Non-essential protein apparently involved in meiosis, GFP fusion protein is present in discrete

  10. Yeast Interacting Proteins Database: YDL239C, YGR268C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ith sequence similarity to that of Type I J-proteins; computational analysis of large-scale protein-protein ...equence similarity to that of Type I J-proteins; computational analysis of large-scale protein-protein inter

  11. Yeast Interacting Proteins Database: YGR268C, YER125W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available larity to that of Type I J-proteins; computational analysis of large-scale protein-protein interaction data ...equence similarity to that of Type I J-proteins; computational analysis of large-scale protein-protein inter

  12. Yeast Interacting Proteins Database: YNL189W, YJL199C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tein; not conserved in closely related Saccharomyces species; protein detected in large-scale protein-protein interaction studies...myces species; protein detected in large-scale protein-protein interaction studies Rows with this prey as pr

  13. ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining.

    Science.gov (United States)

    Huan, Tianxiao; Sivachenko, Andrey Y; Harrison, Scott H; Chen, Jake Y

    2008-08-12

    according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network. The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.

  14. AMYPdb: A database dedicated to amyloid precursor proteins

    Directory of Open Access Journals (Sweden)

    Delamarche Christian

    2008-06-01

    Full Text Available Abstract Background Misfolding and aggregation of proteins into ordered fibrillar structures is associated with a number of severe pathologies, including Alzheimer's disease, prion diseases, and type II diabetes. The rapid accumulation of knowledge about the sequences and structures of these proteins allows the use of in silico methods to investigate the molecular mechanisms of their abnormal conformational changes and assembly. However, such an approach requires the collection of accurate data, which are inconveniently dispersed among several generalist databases. Results We therefore created a free online knowledge database (AMYPdb) dedicated to amyloid precursor proteins and we have performed large-scale sequence analysis of the included data. Currently, AMYPdb integrates data on 31 families, including 1,705 proteins from nearly 600 organisms. It displays links to more than 2,300 bibliographic references and 1,200 3D-structures. A Wiki system is available to insert data into the database, providing a sharing and collaboration environment. We generated and analyzed 3,621 amino acid sequence patterns, reporting highly specific patterns for each amyloid family, along with patterns likely to be involved in protein misfolding and aggregation. Conclusion AMYPdb is a comprehensive online database aiming at the centralization of bioinformatic data regarding all amyloid proteins and their precursors. Our sequence pattern discovery and analysis approach unveiled protein regions of significant interest. AMYPdb is freely accessible.

  15. Database Description - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ogy Development project Reference(s) Article title: Toward a protein-protein interaction map of the budding ...wa M, Yamamoto K, Kuhara S, Sakaki Y. Journal: Proc Natl Acad Sci U S A. 2000 Feb 1;97(3):1143-7. External Links: Article

  16. Yeast Interacting Proteins Database: YOR124C, YGR268C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available that of Type I J-proteins; computational analysis of large-scale protein-protein interaction data suggests a...plasmic protein containing a zinc finger domain with sequence similarity to that of Type I J-proteins; computational

  17. Yeast Interacting Proteins Database: YML064C, YJL199C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available y related Saccharomyces species; protein detected in large-scale protein-protein interaction studies Rows wi...in-protein interaction studies Rows with this prey as prey (4) Rows with this prey as bait (1) 28 6 3 4 0 0 ...d in closely related Saccharomyces species; protein detected in large-scale prote

  18. Yeast Interacting Proteins Database: YLR291C, YJL199C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ved in closely related Saccharomyces species; protein detected in large-scale protein-protein interaction studies...in large-scale protein-protein interaction studies Rows with this prey as prey Rows with this prey as prey (

  19. The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists.

    Directory of Open Access Journals (Sweden)

    Sven Heinicke

    2007-08-01

    Full Text Available Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic relationships among predicted orthologs (based on the OrthoMCL method) to a query gene from any of eight eukaryotic organisms, and to see the orthologs in a wider evolutionary context (based on the Jaccard clustering method). In addition to the phylogenetic information, the database contains experimental results manually collected from the literature that can be compared to the computational analyses, as well as links to relevant human disease and gene information via the OMIM, model organism, and sequence databases. Our aim is for the P-POD resource to be extremely useful to typical experimental biologists wanting to learn more about the evolutionary context of their favorite genes. P-POD is based on the commonly used Generic Model Organism Database (GMOD) schema and can be downloaded in its entirety for installation on one's own system. Thus, bioinformaticians and software developers may also find P-POD useful because they can use the P-POD database infrastructure when developing their own comparative genomics resources and database tools.

  20. Data mining and visualization of the Alabama accident database

    Science.gov (United States)

    2000-08-01

    The Alabama Department of Public Safety has developed and maintains a centralized database that contains traffic accident data collected from crash reports completed by local police officers and state troopers. The Critical Analysis Reporting Environme...

  1. Yeast Interacting Proteins Database: YML109W, YGL190C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available sential regulatory subunit B of protein phosphatase 2A, which has multiple roles ...-essential regulatory subunit B of protein phosphatase 2A, which has multiple roles in mitosis and protein b

  2. Yeast Interacting Proteins Database: YGL198W, YDR084C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGL198W YIP4 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational... GTPases, localized to late Golgi vesicles; computational analysis of large-scale protein-protein interactio

  3. Yeast Interacting Proteins Database: YGL161C, YDR084C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGL161C YIP5 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...GTPases, localized to late Golgi vesicles; computational analysis of large-scale protein-protein interaction

  4. Yeast Interacting Proteins Database: YPL095C, YGL198W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available d to late Golgi vesicles; computational analysis of large-scale protein-protein interaction data suggests a ...gene name YIP4 Prey description Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational

  5. Yeast Interacting Proteins Database: YLR291C, YPL070W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPL070W MUK1 Cytoplasmic protein of unknown function containing a Vps9 domain; computational...rotein of unknown function containing a Vps9 domain; computational analysis of large-scale protein-protein i

  6. Manually Curated Database of Rice Proteins (MCDRP), a database of digitized experimental data on rice

    Directory of Open Access Journals (Sweden)

    Saurabh Raghuvanshi

    2016-11-01

    Full Text Available MCDRP or ‘Manually Curated Database of Rice Proteins’ is a database of digitized experimental datasets on rice proteins. Every aspect of the experimental data published in peer-reviewed research articles on rice biology has been digitized with the help of novel data curation models. These models use a semantic and structured arrangement of alpha-numeric notation, including several well known ontologies, to represent various aspects of the data. As a result, data from more than 15,000 different experiments pertaining to about 2400 rice proteins have been digitized from over 540 published and peer-reviewed research articles. The database portal provides access to the digitized experimental data via search or browse functions. In essence, one can instantly access data from even a single data-point from a collection of thousands of the experimental datasets. On the other hand, one can easily access the digitized experimental data from multiple research articles on a rice protein. Based on the analysis and integration of the digitized experimental data, more than 800 different traits (molecular, biochemical or phenotypic) have been precisely mapped onto the rice proteins along with the underlying experimental evidence. Similarly, over 4370 associations, based on experimental evidence, have been established between the rice proteins and various gene ontology terms. The database is being continuously updated and is freely available at www.genomeindia.org.in/biocuration.

  7. Yeast Interacting Proteins Database: YGR239C, YDR142C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available PEX21 Peroxin required for targeting of peroxisomal matrix proteins containing PTS2; interacts with Pex7p;...N-terminal nonapeptide signal (PTS2) of peroxisomal matrix proteins; WD repeat protein; defects in human homolog...description Peroxin required for targeting of peroxisomal matrix proteins containing PTS2; interacts with Pex7p;...N-terminal nonapeptide signal (PTS2) of peroxisomal matrix proteins; WD repeat protein; defects in human homolog

  8. Yeast Interacting Proteins Database: YNR006W, YHL002W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins destined for degradation; ..., as well as for recycling of Golgi proteins and formation of lumenal membranes Rows with this prey as prey ...1p; required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins dest...degradation, as well as for recycling of Golgi proteins and formation of lumenal membranes

  9. Yeast Interacting Proteins Database: YNL086W, YKL061W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this bait as bait (3) Rows with this...fluorescent protein (GFP)-fusion protein localizes to the endosome Rows with this prey as prey (2) Rows with this...fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this bait as bait Rows with this bait...fluorescent protein (GFP)-fusion protein localizes to the endosome Rows with this prey as prey Rows with this prey

  10. HMM Logos for visualization of protein families

    Directory of Open Access Journals (Sweden)

    Schultz Jörg

    2004-01-01

    Full Text Available Abstract Background Profile Hidden Markov Models (pHMMs) are a widely used tool for protein family research. Up to now, however, there exists no method to visualize all of their central aspects graphically in an intuitively understandable way. Results We present a visualization method that incorporates both emission and transition probabilities of the pHMM, thus extending sequence logos introduced by Schneider and Stephens. For each emitting state of the pHMM, we display a stack of letters. The stack height is determined by the deviation of the position's letter emission frequencies from the background frequencies. The stack width visualizes both the probability of reaching the state (the hitting probability) and the expected number of letters the state emits during a pass through the model (the state's expected contribution). A web interface offering online creation of HMM Logos and the corresponding source code can be found at the Logos web server of the Max Planck Institute for Molecular Genetics at http://logos.molgen.mpg.de. Conclusions We demonstrate that HMM Logos can be a useful tool for the biologist: We use them to highlight differences between two homologous subfamilies of GTPases, Rab and Ras, and we show that they are able to indicate structural elements of Ras.
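
    The stack-height idea in the abstract above can be illustrated with a minimal Python sketch. It assumes that the "deviation from the background frequencies" is measured as relative entropy in bits, as in classical sequence logos; the abstract does not state the exact formula, so this is an assumption rather than the authors' implementation. The function names, the four-letter toy alphabet and the uniform background are hypothetical, and the stack width is not modelled here.

```python
import math

# Hypothetical toy alphabet with a uniform background distribution.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def stack_height(emission, background=BACKGROUND):
    """Relative entropy (in bits) of one state's emission distribution
    versus the background. Assumed measure of 'deviation'."""
    return sum(
        p * math.log2(p / background[letter])
        for letter, p in emission.items()
        if p > 0  # a zero-probability letter contributes nothing
    )


def letter_heights(emission, background=BACKGROUND):
    """Split the stack height among letters in proportion to their
    emission probability, as classical sequence logos do."""
    total = stack_height(emission, background)
    return {letter: p * total for letter, p in emission.items()}


if __name__ == "__main__":
    state = {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0}
    print(stack_height(state))
    print(letter_heights(state))
```

    Under this assumption, a state whose emissions match the background gets a height of zero, and a state that always emits one letter gets the maximum height for the alphabet.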

  11. Yeast Interacting Proteins Database: YDR425W, YGL161C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available with this bait as prey (0) YGL161C YIP5 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...IP5 Prey description Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computatio...nal analysis of large-scale protein-protein interaction data suggests a possible ro

  12. Yeast Interacting Proteins Database: YDR425W, YGL198W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available with this bait as prey (0) YGL198W YIP4 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...IP4 Prey description Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computatio...nal analysis of large-scale protein-protein interaction data suggests a possible ro

  13. Yeast Interacting Proteins Database: YDL226C, YJL151C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available s bait as prey (0) YJL151C SNA3 Integral membrane protein localized to vacuolar intralumenal vesicles, computational...intralumenal vesicles, computational analysis of large-scale protein-protein interaction data suggests a pos... gene name SNA3 Prey description Integral membrane protein localized to vacuolar

  14. Yeast Interacting Proteins Database: YML064C, YOR284W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available th this bait as prey (0) YOR284W HUA2 Cytoplasmic protein of unknown function; computational analysis of lar...Rows with this bait as prey (0) Prey ORF YOR284W Prey gene name HUA2 Prey description Cytoplasmic protein of unknown function; comput...ational analysis of large-scale protein-protein interact

  15. Yeast Interacting Proteins Database: YNL086W, YGL172W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this bait as bait (3) Rows with this...fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this bait as bait Rows with this bait

  16. Yeast Interacting Proteins Database: YEL005C, YNL086W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this prey as prey (2) Rows with this...fluorescent protein (GFP)-fusion protein localizes to endosomes Rows with this prey as prey Rows with this prey

  17. Yeast Interacting Proteins Database: YGR119C, YKL061W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available fluorescent protein (GFP)-fusion protein localizes to the endosome Rows with this prey as prey (2) Rows with this...fluorescent protein (GFP)-fusion protein localizes to the endosome Rows with this prey as prey Rows with this prey

  18. Yeast Interacting Proteins Database: YEL043W, YOR164C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available on quantitative analysis of protein-protein interaction maps; may interact with ribosomes, based on co-purification studies...ing based on quantitative analysis of protein-protein interaction maps; may interact with ribosomes, based on co-purification studies

  19. Yeast Interacting Proteins Database: YLR447C, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp...; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act

  20. Yeast Interacting Proteins Database: YEL017W, YEL017W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available Bait description Protein of unknown function with a possible role in glutathione metabolism, as suggested by computational...ion Protein of unknown function with a possible role in glutathione metabolism, as suggested by computational...putational analysis of large-scale protein-protein interaction data; GFP-fusion pro...tational analysis of large-scale protein-protein interaction data; GFP-fusion prote...17W GTT3 Protein of unknown function with a possible role in glutathione metabolism, as suggested by compu

  1. MultitaskProtDB: a database of multitasking proteins.

    Science.gov (United States)

    Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, JosepAntoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique

    2014-01-01

    We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking or moonlighting is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are experimentally revealed by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there is no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a user-friendly web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins such as frequencies of the different pairs of functions, phylogenetic conservation and so forth.

  2. Yeast Interacting Proteins Database: YFR049W, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator... (0) YOR047C STD1 Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sens...ors Snf3p and Rgt2p, and TATA-binding protein Spt15p; ac

  3. Yeast Interacting Proteins Database: YOR284W, YOR284W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YOR284W HUA2 Cytoplasmic protein of unknown function; computational analysis of lar...it as bait (1) Rows with this bait as prey (4) YOR284W HUA2 Cytoplasmic protein of unknown function; computational...tein of unknown function; computational analysis of large-scale protein-protein i... HUA2 Prey description Cytoplasmic protein of unknown function; computational ana

  4. Yeast Interacting Proteins Database: YOR047C, YKL038W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available racts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a...Bait description Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose senso...rs Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator of the tra

  5. Yeast Interacting Proteins Database: YHL002W, YNR006W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ycling of Golgi proteins and formation of lumenal membranes Rows with this bait as bait (1) Rows with this b...required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated proteins destined...on, as well as for recycling of Golgi proteins and formation of lumenal membranes...ith Hse1p; required for recycling Golgi proteins, forming lumenal membranes and sorting ubiquitinated protei

  6. Visualizing the semantic content of large text databases using text maps

    Science.gov (United States)

    Combs, Nathan

    1993-01-01

    A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. A set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles is described. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.

  7. Interactome Data and Databases: Different Types of Protein Interaction

    Directory of Open Access Journals (Sweden)

    Alberto de Luis

    2006-04-01

    Full Text Available In recent years, the biomolecular sciences have been driven forward by overwhelming advances in new biotechnological high-throughput experimental methods and bioinformatic genome-wide computational methods. Such breakthroughs are producing huge amounts of new data that need to be carefully analysed to obtain correct and useful scientific knowledge. One of the fields where this advance has become more intense is the study of the network of ‘protein–protein interactions’, i.e. the ‘interactome’. In this short review we comment on the main data and databases produced in this field in the last 5 years. We also present a rationalized scheme of biological definitions that will be useful for a better understanding and interpretation of ‘what a protein–protein interaction is’ and ‘which types of protein–protein interactions are found in a living cell’. Finally, we comment on some assignments of interactome data to defined types of protein interaction and we present a new bioinformatic tool called APIN (Agile Protein Interaction Network browser), which is in development and will be applied to browsing protein interaction databases.

  8. SWORD-a highly efficient protein database search.

    Science.gov (United States)

    Vaser, Robert; Pavlović, Dario; Šikić, Mile

    2016-09-01

    Protein database search is one of the fundamental problems in bioinformatics. For decades, it has been explored and solved using different exact and heuristic approaches. However, exponential growth of data in recent years has brought significant challenges in improving already existing algorithms. BLAST has been the most successful tool for protein database search, but is also becoming a bottleneck in many applications. Due to that, many different approaches have been developed to complement or replace it. In this article, we present SWORD, an efficient protein database search implementation that runs 8-16 times faster than BLAST in the sensitive mode and up to 68 times faster in the fast and less accurate mode. It is designed to be used in nearly all database search environments, but is especially suitable for large databases. Its sensitivity exceeds that of BLAST for the majority of input datasets and provides guaranteed optimal alignments. SWORD is freely available for download from https://github.com/rvaser/sword. Contact: robert.vaser@fer.hr and mile.sikic@fer.hr. Supplementary data are available at Bioinformatics online.

  9. CancerPPD: a database of anticancer peptides and proteins

    OpenAIRE

    Tyagi, Atul; Tuknait, Abhishek; Anand, Priya; Gupta, Sudheer; Sharma, Minakshi; Mathur, Deepika; Joshi, Anshika; Singh, Sandeep; Gautam, Ankur; Raghava, Gajendra P. S.

    2014-01-01

    CancerPPD (http://crdd.osdd.net/raghava/cancerppd/) is a repository of experimentally verified anticancer peptides (ACPs) and anticancer proteins. Data were manually collected from published research articles, patents and from other databases. The current release of CancerPPD consists of 3491 ACP and 121 anticancer protein entries. Each entry provides comprehensive information related to a peptide like its source of origin, nature of the peptide, anticancer activity, N- and C-terminal modific...

  10. Present status of protein and nucleic acid database activities in the world

    Science.gov (United States)

    Tsugita, Akira

    The first protein database was founded in 1965, followed by the establishment of nucleic acid databases from 1971. Presently there are six major sequence databases, located in Japan, USA and the FRG: three for protein data and three for nucleic acid data. International cooperation between the protein databases and between the nucleic acid databases has greatly facilitated compilation and dissemination of data. Coordination between these protein and nucleic acid databases has progressed with the support of the CODATA Task Group and the International Advisory Board for Nucleic Acid Databases. In the protein field, several additional database activities have been initiated to contribute to protein engineering and structure-activity relationships.

  11. Yeast Interacting Proteins Database: YPR083W, YMR294W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YPR083W MDM36 Protein required for normal mitochondrial morphology and inheritance ...description Protein required for normal mitochondrial morphology and inheritance Rows with this bait as bait

  12. Yeast Interacting Proteins Database: YNL189W, YBR072W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ait as prey (0) YBR072W HSP26 Small heat shock protein (sHSP) with chaperone activity; forms hollow...chaperone activity; forms hollow, sphere-shaped oligomers that suppress unfolded proteins aggregation; oligo

  13. Yeast Interacting Proteins Database: YIL007C, YOR117W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YIL007C NAS2 Proteasome-interacting protein involved in the assembly of the base su...tion Proteasome-interacting protein involved in the assembly of the base subcomplex of the 19S proteasomal r

  14. Yeast Interacting Proteins Database: YLR373C, YGL190C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ase 2A, which has multiple roles in mitosis and protein biosynthesis; involved in regulation of mitotic exit...phosphatase 2A, which has multiple roles in mitosis and protein biosynthesis; involved in regulation of mito

  15. Yeast Interacting Proteins Database: YPL002C, YJR102C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ndent sorting of proteins into the endosome; appears to be functionally related to SNF7; involved in glucose...x, which is involved in ubiquitin-dependent sorting of proteins into the endosome; appears

  16. Yeast Interacting Proteins Database: YNL311C, YKL001C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available -purification experiments; putative F-box protein; analysis of integrated high-throughput datasets predicts ...ments; putative F-box protein; analysis of integrated high-throughput datasets predicts involvement in ubiqu

  17. Yeast Interacting Proteins Database: YMR146C, YPL105C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ts; authentic, non-tagged protein is detected in highly purified mitochondria in high-throughpu...tagged protein is detected in highly purified mitochondria in high-throughput studies Rows with this prey as

  18. Yeast Interacting Proteins Database: YGL181W, YHR177W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available y (0) YHR177W - Putative protein of unknown function; overexpression causes a cel...ative protein of unknown function; overexpression causes a cell cycle delay or arrest Rows with this prey as

  19. Yeast Interacting Proteins Database: YPL070W, YOR155C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YPL070W MUK1 Cytoplasmic protein of unknown function containing a Vps9 domain; computational...me MUK1 Bait description Cytoplasmic protein of unknown function containing a Vps9 domain; computational

  20. Yeast Interacting Proteins Database: YPL070W, YPR193C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YPL070W MUK1 Cytoplasmic protein of unknown function containing a Vps9 domain; computational...1 Bait description Cytoplasmic protein of unknown function containing a Vps9 domain; computational

  1. Yeast Interacting Proteins Database: YKR092C, YKL023W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available W - Putative protein of unknown function, predicted by computational methods to b...ait as prey (0) Prey ORF YKL023W Prey gene name - Prey description Putative protein of unknown function, predicted by computational

  2. Yeast Interacting Proteins Database: YPL070W, YLR245C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YPL070W MUK1 Cytoplasmic protein of unknown function containing a Vps9 domain; computational... name MUK1 Bait description Cytoplasmic protein of unknown function containing a Vps9 domain; computationa

  3. Yeast Interacting Proteins Database: YDL226C, YGL198W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available s bait as prey (0) YGL198W YIP4 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...iption Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational

  4. Yeast Interacting Proteins Database: YOR158W, YLR424W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YOR158W PET123 Mitochondrial ribosomal protein of the small subunit; PET123 exhibits genetic interactions...al ribosomal protein of the small subunit; PET123 exhibits genetic interactions with PET122, which encodes a

  5. Yeast Interacting Proteins Database: YGL115W, YGL208W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available nine protein kinase complex involved in the response to glucose starvation; null mutants exhibit accelerated...serine/threonine protein kinase complex involved in the response to glucose starvation; null mutants exhibit accelerated

  6. Yeast Interacting Proteins Database: YPR103W, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available tein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors...gulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf

  7. Yeast Interacting Proteins Database: YLR291C, YOR284W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YOR284W HUA2 Cytoplasmic protein of unknown function; computational analysis of l...prey (0) Prey ORF YOR284W Prey gene name HUA2 Prey description Cytoplasmic protein of unknown function; computational

  8. Yeast Interacting Proteins Database: YDL239C, YPL070W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available it as prey (1) YPL070W MUK1 Cytoplasmic protein of unknown function containing a Vps9 domain; computationa...ey description Cytoplasmic protein of unknown function containing a Vps9 domain; computational analysis of l

  9. Yeast Interacting Proteins Database: YLR295C, YJR083C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available 7) Rows with this bait as prey (0) YJR083C ACF4 Protein of unknown function, computational analysis of large...me ACF4 Prey description Protein of unknown function, computational analysis of l

  10. Yeast Interacting Proteins Database: YNL258C, YKR022C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts...membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts

  11. Yeast Interacting Proteins Database: YGL145W, YNL258C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts...membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts

  12. Yeast Interacting Proteins Database: YNL258C, YLR440C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts...membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interacts

  13. Yeast Interacting Proteins Database: YNL092W, YML037C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available Putative protein of unknown function with some characteristics of a transcriptional activator; may be a target...Putative protein of unknown function with some characteristics of a transcriptional activator; may be a target

  14. Yeast Interacting Proteins Database: YNL078W, YKR048C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available Protein localized in the bud neck at G2/M phase; physically interacts with septins; possibly involved in...Protein localized in the bud neck at G2/M phase; physically interacts with septins; possibly involved in

  15. Yeast Interacting Proteins Database: YBL033C, YNL105W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available reading frame unlikely to encode a protein, based on available experimental and comparative sequence data; p...a protein, based on available experimental and comparative sequence data; partial

  16. Yeast Interacting Proteins Database: YER081W, YPR126C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YPR126C - Dubious open reading frame unlikely to encode a functional protein, based on available experimental and comparative...ubious open reading frame unlikely to encode a functional protein, based on available experimental and comparative

  17. Yeast Interacting Proteins Database: YJR091C, YEL013W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...ed proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes increased sens

  18. Yeast Interacting Proteins Database: YPL114W, YMR133W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available as prey (0) YMR133W REC114 Protein involved in early stages of meiotic recombination; possibly involved...name REC114 Prey description Protein involved in early stages of meiotic recombination; possibly involved

  19. Yeast Interacting Proteins Database: YNL334C, YNL333W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YNL334C SNO2 Protein of unknown function, nearly identical to Sno3p; expression is induced before the...Bait description Protein of unknown function, nearly identical to Sno3p; expression is induced before

  20. Yeast Interacting Proteins Database: YDR394W, YGR232W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ity (BRITE) - Alternative path with 1 intervening protein (YPD) 0 Alternative path with 2 intervening proteins (YPD) 0 IST hit 16 IST hit in the opposite bait/prey orientation 18 ...

  1. Yeast Interacting Proteins Database: YNL258C, YGL145W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YNL258C DSL1 Peripheral membrane protein required for Golgi-to-ER retrograde traffi...t description Peripheral membrane protein required for Golgi-to-ER retrograde traffic; component of the ER t

  2. Yeast Interacting Proteins Database: YML064C, YKL103C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available he peptidase family M18; often used as a marker protein in studies of autophagy a... to the peptidase family M18; often used as a marker protein in studies of autophagy and cytosol to vacuole

  3. Yeast Interacting Proteins Database: YCL046W, YGL115W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YCL046W - Dubious open reading frame unlikely to encode a protein, based on availab...ading frame unlikely to encode a protein, based on available experimental and comparative sequence data; par

  4. Yeast Interacting Proteins Database: YDR176W, YDL239C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available 9C ADY3 Protein required for spore wall formation, thought to mediate assembly of...DY3 Prey description Protein required for spore wall formation, thought to mediate assembly of a Don1p-conta

  5. Yeast Interacting Proteins Database: YDL239C, YDR148C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought to mediate asse

  6. Yeast Interacting Proteins Database: YDL239C, YLR423C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...cription Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing struct

  7. Yeast Interacting Proteins Database: YDL239C, YAL028W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...39C Bait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought

  8. Yeast Interacting Proteins Database: YDL239C, YPL255W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...ait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought to m

  9. Yeast Interacting Proteins Database: YDL239C, YML042W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...iption Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structur

  10. Yeast Interacting Proteins Database: YDL239C, YDR273W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...ption Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure

  11. Yeast Interacting Proteins Database: YDL239C, YOR324C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...it gene name ADY3 Bait description Protein required for spore wall formation, thought to mediate assembly of

  12. Yeast Interacting Proteins Database: YDL239C, YHR184W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...C Bait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought

  13. Yeast Interacting Proteins Database: YDL239C, YLR072W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly... Bait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought

  14. Yeast Interacting Proteins Database: YPL059W, YIL105C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available oxidoreductase; mitochondrial matrix protein involved in the synthesis/assembly of iron-sulfur centers; mono...oreductase; mitochondrial matrix protein involved in the synthesis/assembly of iron-sulfur centers; monothio

  15. Yeast Interacting Proteins Database: YHR197W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  16. Yeast Interacting Proteins Database: YGL127C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  17. Yeast Interacting Proteins Database: YDR473C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  18. Yeast Interacting Proteins Database: YNL182C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  19. Yeast Interacting Proteins Database: YKL050C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  20. Yeast Interacting Proteins Database: YPL159C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  1. Yeast Interacting Proteins Database: YDR052C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  2. Yeast Interacting Proteins Database: YGL237C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  3. Yeast Interacting Proteins Database: YGR113W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  4. Yeast Interacting Proteins Database: YNL092W, YOR329C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available SCD5 Protein required for normal cortical actin organization and endocytosis; multicopy suppressor of clathrin...description Protein required for normal cortical actin organization and endocytosis; multicopy suppressor of clathrin

  5. Yeast Interacting Proteins Database: YLR423C, YNL182C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  6. Yeast Interacting Proteins Database: YNR068C, YNR069C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available Protein of unknown function, ORF exhibits genomic organization compatible with a translational readthrough-dependent...Protein of unknown function, ORF exhibits genomic organization compatible with a translational readthrough-dependent

  7. Yeast Interacting Proteins Database: YJL061W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  8. Yeast Interacting Proteins Database: YBR270C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  9. Yeast Interacting Proteins Database: YCL063W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  10. Yeast Interacting Proteins Database: YBR217W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific...protein responsible for phagophore assembly site organization; regulatory subunit of an autophagy-specific

  11. Yeast Interacting Proteins Database: YPR040W, YDL188C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPR040W TIP41 Protein that interacts physically and genetically with Tap42p, which ...ait ORF YPR040W Bait gene name TIP41 Bait description Protein that interacts physically and genetically

  12. Yeast Interacting Proteins Database: YPR040W, YDL134C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPR040W TIP41 Protein that interacts physically and genetically with Tap42p, which ...Bait ORF YPR040W Bait gene name TIP41 Bait description Protein that interacts physically and genetically

  13. Yeast Interacting Proteins Database: YGR218W, YGR178C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this... involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this

  14. Yeast Interacting Proteins Database: YGR218W, YMR124W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this... involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this

  15. Yeast Interacting Proteins Database: YGR218W, YDL065C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this... involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this

  16. Yeast Interacting Proteins Database: YGR218W, YOL149W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this... involved in export of proteins, RNAs, and ribosomal subunits from the nucleus; exportin Rows with this

  17. Yeast Interacting Proteins Database: YPL204W, YER095W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPL204W HRR25 Protein kinase involved in regulating diverse events including vesicu... gene name HRR25 Bait description Protein kinase involved in regulating diverse events including vesicular t

  18. Yeast Interacting Proteins Database: YJR091C, YKL002W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available g of integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly sy... integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthe

  19. Yeast Interacting Proteins Database: YMR316W, YER125W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YMR316W DIA1 Protein of unknown function, involved in invasive and pseudohyphal gro... of unknown function, involved in invasive and pseudohyphal growth; green fluorescent protein (GFP)-fusion p

  20. Yeast Interacting Proteins Database: YOR117W, YJL184W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available GON7 Protein proposed to be involved in the modification of N-linked oligosaccharides, osmotic stress...description Protein proposed to be involved in the modification of N-linked oligosaccharides, osmotic stress

  1. Yeast Interacting Proteins Database: YPR106W, YBR038W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPR106W ISR1 Predicted protein kinase, overexpression causes sensitivity to staurosporine, which is a potent...description Predicted protein kinase, overexpression causes sensitivity to staurosporine, which is a potent

  2. Yeast Interacting Proteins Database: YMR077C, YLR417W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available lumen; cytoplasmic protein recruited to endosomal membranes Rows with this bait as bait (3) Rows with this b...oplasmic protein recruited to endosomal membranes Rows with this bait as bait Row

  3. Yeast Interacting Proteins Database: YMR077C, YJR102C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available lumen; cytoplasmic protein recruited to endosomal membranes Rows with this bait as bait (3) Rows with this b...lar lumen; cytoplasmic protein recruited to endosomal membranes Rows with this bait as bait Rows with this b

  4. Yeast Interacting Proteins Database: YPR029C, YFR043C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available his bait as prey (1) YFR043C IRC6 Putative protein of unknown function; null mutant displays increased level...C6 Prey description Putative protein of unknown function; null mutant displays increased levels of spontaneo

  5. Yeast Interacting Proteins Database: YPL077C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPL077C - Putative protein of unknown function; regulates PIS1 expression; mutant display...Bait description Putative protein of unknown function; regulates PIS1 expression; mutant display

  6. Yeast Interacting Proteins Database: YLR423C, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR423C ATG17 Scaffold protein responsible for phagophore assembly site organizatio...se activity Rows with this bait as bait (9) Rows with this bait as prey (29) YLR423C ATG17 Scaffold protein responsible...LR423C Bait gene name ATG17 Bait description Scaffold protein responsible for pha...ene name ATG17 Prey description Scaffold protein responsible for phagophore assembly site organization; regu

  7. Yeast Interacting Proteins Database: YCL020W, YDR261W-A [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YA or TYA-TYB polyprotein; Gag is a nucleocapsid protein that is the structural constituent of virus-like particles... TYA or TYA-TYB polyprotein; Gag is a nucleocapsid protein that is the structural constituent of virus-like particles...lyprotein; Gag is a nucleocapsid protein that is the structural constituent of virus-like particles (VLPs); ...; Gag is a nucleocapsid protein that is the structural constituent of virus-like particles (VLPs); similar t

  8. Yeast Interacting Proteins Database: YGL161C, YGL198W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGL161C YIP5 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...that interacts with Rab GTPases, localized to late Golgi vesicles; computational ...eracts with Rab GTPases, localized to late Golgi vesicles; computational analysis of large-scale protein-pro...ized to late Golgi vesicles; computational analysis of large-scale protein-protein interaction data suggests

  9. Yeast Interacting Proteins Database: YGL198W, YGL161C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGL198W YIP4 Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational...that interacts with Rab GTPases, localized to late Golgi vesicles; computational ...eracts with Rab GTPases, localized to late Golgi vesicles; computational analysis of large-scale protein-pro...ized to late Golgi vesicles; computational analysis of large-scale protein-protein interaction data suggests

  10. Yeast Interacting Proteins Database: YCL019W, YDR261W-B [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar...a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar...a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar...a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar

  11. SPROUTS: a database for the evaluation of protein stability upon point mutation.

    Science.gov (United States)

    Lonquety, Mathieu; Lacroix, Zoé; Papandreou, Nikolaos; Chomilier, Jacques

    2009-01-01

    SPROUTS (Structural Prediction for pRotein fOlding UTility System) is a new database that provides access to various structural data sets and integrated functionalities not yet available to the community. The originality of the SPROUTS database is the ability to gain access to a variety of structural analyses at one place and with a strong interaction between them. SPROUTS currently combines data pertaining to 429 structures that capture representative folds and results related to the prediction of critical residues expected to belong to the folding nucleus: the MIR (Most Interacting Residues), the description of the structures in terms of modular fragments: the TEF (Tightened End Fragments), and the calculation at each position of the free energy change gradient upon mutation by one of the 19 amino acids. All database results can be displayed and downloaded in textual files and Excel spreadsheets and visualized on the protein structure. SPROUTS is a unique resource to access as well as visualize state-of-the-art characteristics of protein folding and analyse the effect of point mutations on protein structure. It is available at http://bioinformatics.eas.asu.edu/sprouts.html.

  12. Method for Rapid Protein Identification in a Large Database

    Directory of Open Access Journals (Sweden)

    Wenli Zhang

    2013-01-01

    Full Text Available Protein identification is an integral part of proteomics research. The available tools to identify proteins in tandem mass spectrometry experiments are not optimized to face current challenges in terms of identification scale and speed owing to the exponential growth of the protein database and the accelerated generation of mass spectrometry data, as well as the demand for nonspecific digestion and post-modifications in complex-sample identification. As a result, a rapid method is required to mitigate such complexity and computation challenges. This paper thus aims to present an open method to prevent enzyme and modification specificity on a large database. This paper designed and developed a distributed program to facilitate application to computer resources. With this optimization, nearly linear speedup and real-time support are achieved on a large database with nonspecific digestion, thus enabling testing with two classical large protein databases in a 20-blade cluster. This work aids in the discovery of more significant biological results, such as modification sites, and enables the identification of more complex samples, such as metaproteomics samples.
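
    The abstract above does not show how nonspecific digestion enlarges the search space, so the following is only an illustrative sketch, not the authors' distributed program. It contrasts a simple tryptic digest (cleave after K or R unless followed by P, no missed cleavages) with nonspecific digestion, where every substring in a length window is a candidate peptide. The function names, the length window, and the example sequence are all assumptions made for illustration.

```python
def tryptic_peptides(sequence):
    """Split a protein after K or R, except when the next residue is P.

    Simplified rule with no missed cleavages (an assumption for this sketch).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that did not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def nonspecific_peptides(sequence, min_len=6, max_len=30):
    """Yield every substring whose length lies in [min_len, max_len]."""
    n = len(sequence)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break
            yield sequence[start:end]


if __name__ == "__main__":
    protein = "MKWVTFISLLR"  # hypothetical toy sequence
    specific = tryptic_peptides(protein)
    nonspecific = list(nonspecific_peptides(protein, min_len=2, max_len=3))
    print(len(specific), "tryptic peptides:", specific)
    print(len(nonspecific), "nonspecific candidates of length 2 to 3")
```

    On the toy sequence the tryptic digest yields 2 peptides, while the nonspecific window of length 2 to 3 already yields 19 candidates. The candidate count grows roughly with protein length times window width, which is consistent with the abstract's motivation for distributing the search.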

  13. Yeast Interacting Proteins Database: YGL237C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote... expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein

  14. Yeast Interacting Proteins Database: YGL127C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ith protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regula...rotein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors

  15. Yeast Interacting Proteins Database: YOR358W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act...rotein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator o

  16. Yeast Interacting Proteins Database: YKL002W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote...xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp

  17. Yeast Interacting Proteins Database: YOR037W, YCL056C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available (GFP)-fusion protein localizes to the cytoplasm in a punctate pattern; null mutant displays decreased thermo...e pattern; null mutant displays decreased thermotolerance Rows with this prey as prey Rows with this prey as... of unknown function; green fluorescent protein (GFP)-fusion protein localizes to the cytoplasm in a punctat

  18. Yeast Interacting Proteins Database: YGR071C, YJL058C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available own function; deletion mutant has increased glycogen accumulation and displays elongated buds; green fluores...YGR071C - Putative protein of unknown function; deletion mutant has increased glycogen accumulation and disp...lays elongated buds; green fluorescent protein (GFP)-fusion protein localizes to th

  19. The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases.

    Science.gov (United States)

    Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning

    2007-10-18

    Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR interface, documentation and code examples are available at
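
    The mapping idea described above (cross-references based on 100% sequence identity, optionally limited by source database) can be sketched as follows. This is a minimal local illustration under stated assumptions, not the PICR service or its SOAP/REST interface: the database names, identifiers, sequences, and function names are hypothetical, and an MD5 digest of the normalized sequence stands in for whatever key the real data warehouse uses.

```python
from collections import defaultdict
import hashlib


def sequence_key(sequence):
    """Return a digest of the sequence after removing whitespace and case."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def build_index(records):
    """Group (database, identifier) pairs that share an identical sequence.

    `records` is an iterable of (database, identifier, sequence) tuples.
    """
    index = defaultdict(set)
    for database, identifier, sequence in records:
        index[sequence_key(sequence)].add((database, identifier))
    return index


def cross_references(index, sequence, databases=None):
    """Look up all identifiers whose sequence is 100% identical.

    If `databases` is given, only identifiers from those sources are returned.
    """
    hits = index.get(sequence_key(sequence), set())
    if databases is not None:
        hits = {hit for hit in hits if hit[0] in databases}
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical records: two sources hold the same protein under
    # different identifiers; a third holds a one-residue variant.
    records = [
        ("DB_A", "A001", "MKT AYIA"),
        ("DB_B", "B9", "mktayia"),
        ("DB_C", "C7", "MKTAYIV"),
    ]
    index = build_index(records)
    print(cross_references(index, "MKTAYIA"))
    print(cross_references(index, "MKTAYIA", databases={"DB_B"}))
```

    The first lookup returns the identifiers from DB_A and DB_B, and the second returns only the DB_B identifier. The one-residue variant in DB_C is never matched, because the key requires exact identity rather than similarity.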

  20. The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases

    Directory of Open Access Journals (Sweden)

    Leinonen Rasko

    2007-10-01

    Full Text Available Abstract Background: Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. Results: We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. Conclusion: We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR

  1. Yeast Interacting Proteins Database: YLR082C, YLR082C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR082C SRL2 Protein of unknown function; overexpression suppresses the lethality cause...Protein of unknown function; overexpression suppresses the lethality caused by a rad53 null mutation Rows wi...; overexpression suppresses the lethality caused by a rad53 null mutation Rows with this bait as bait Rows w... (1) Prey ORF YLR082C Prey gene name SRL2 Prey description Protein of unknown function; overexpression suppresses the lethality cause

  2. Yeast Interacting Proteins Database: YKL103C, YKL103C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available he peptidase family M18; often used as a marker protein in studies of autophagy and cytosol to vacuole targe...; often used as a marker protein in studies of autophagy and cytosol to vacuole targeting (CVT) pathway Rows...e yscI; zinc metalloproteinase that belongs to the peptidase family M18; often used as a marker protein in studies...t belongs to the peptidase family M18; often used as a marker protein in studies of autophagy and cytosol to

  3. Yeast Interacting Proteins Database: YML064C, YBR072W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available th this bait as prey (0) YBR072W HSP26 Small heat shock protein (sHSP) with chaperone activity; forms hollow...tein (sHSP) with chaperone activity; forms hollow, sphere-shaped oligomers that suppress unfolded proteins a

  4. Yeast Interacting Proteins Database: YNL189W, YKL130C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available teracts with She3p; part of the mRNA localization machinery that restricts accumulation of certain proteins ...A-binding protein that binds specific mRNAs and interacts with She3p; part of the mRNA localization machinery that restricts

  5. Yeast Interacting Proteins Database: YER081W, YBR042C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YBR042C CST26 Protein of unknown function, affects chromosome stability when overexpressed Rows with this pr...ey (1) Prey ORF YBR042C Prey gene name CST26 Prey description Protein of unknown function, affects chromosom

  6. Yeast Interacting Proteins Database: YER081W, YDR194C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDR194C MSS116 DEAD-box protein required for efficient splicing of mitochondrial Group I and II introns; non...e name MSS116 Prey description DEAD-box protein required for efficient splicing of mitochondrial Group I and

  7. Yeast Interacting Proteins Database: YOR097C, YML008C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YOR097C - Putative protein of unknown function; identified as interacting with Hsp82p in a high-throughpu... description Putative protein of unknown function; identified as interacting with... Hsp82p in a high-throughput two-hybrid screen; YOR097C is not an essential gene Rows with this bait as bait

  8. Yeast Interacting Proteins Database: YJL070C, YDR504C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available he authentic, non-tagged protein is detected in highly purified mitochondria in high-throughput studies; YJL...c, non-tagged protein is detected in highly purified mitochondria in high-throughput studies; YJL070C is a n

  9. Yeast Interacting Proteins Database: YPL105C, YDR429C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available co-purification experiments; authentic, non-tagged protein is detected in highly purified mitochondria in high-throughpu...entic, non-tagged protein is detected in highly purified mitochondria in high-throughput studies Rows with t

  10. Yeast Interacting Proteins Database: YMR095C, YMR096W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available bait as prey (0) YMR096W SNZ1 Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced...name SNZ1 Prey description Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced

  11. Yeast Interacting Proteins Database: YMR096W, YFL059W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YMR096W SNZ1 Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced gene family;...name SNZ1 Bait description Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced

  12. Yeast Interacting Proteins Database: YMR096W, YNL333W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YMR096W SNZ1 Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced gene family;...name SNZ1 Bait description Protein involved in vitamin B6 biosynthesis; member of a stationary phase-induced

  13. Yeast Interacting Proteins Database: YLR223C, YOR247W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR223C IFH1 Essential protein with a highly acidic N-terminal domain; IFH1 exhibits genetic interactions...ion Essential protein with a highly acidic N-terminal domain; IFH1 exhibits genetic interactions with FHL1,

  14. Yeast Interacting Proteins Database: YOR158W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YOR158W PET123 Mitochondrial ribosomal protein of the small subunit; PET123 exhibits genetic interactions...23 Bait description Mitochondrial ribosomal protein of the small subunit; PET123 exhibits genetic interact...ions with PET122, which encodes a COX3 mRNA-specific translational activator Rows w

  15. Yeast Interacting Proteins Database: YBR187W, YNR032W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available null mutant; GFP-fusion protein localizes to the vacuole; expression pattern and physical interactions sugge...expression is reduced in a gcr1 null mutant; GFP-fusion protein localizes to the vacuole; expression pattern and physical interaction

  16. Yeast Interacting Proteins Database: YER128W, YPR173C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available (MVB) protein sorting, ATP-bound Vps4p localizes to endosomes and catalyzes ESCRT-III disassembly and membrane...(MVB) protein sorting, ATP-bound Vps4p localizes to endosomes and catalyzes ESCRT-III disassembly and membrane

  17. Yeast Interacting Proteins Database: YML101C, YJR102C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ubiquitin-dependent sorting of proteins into the endosome Rows with this prey as prey (3) Rows with this...ubiquitin-dependent sorting of proteins into the endosome Rows with this prey as prey Rows with this prey

  18. Yeast Interacting Proteins Database: YGL061C, YER016W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available 6W BIM1 Microtubule-binding protein that together with Kar9p makes up the cortica...s prey (2) Prey ORF YER016W Prey gene name BIM1 Prey description Microtubule-binding protein that together with Kar9p makes

  19. Yeast Interacting Proteins Database: YCL029C, YER016W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available as prey (0) YER016W BIM1 Microtubule-binding protein that together with Kar9p makes...tubule-binding protein that together with Kar9p makes up the cortical microtubule capture site and delays th

  20. Yeast Interacting Proteins Database: YDR091C, YLR192C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available t as prey (0) YLR192C HCR1 Dual function protein involved in translation initiation as a substoic... (0) Prey ORF YLR192C Prey gene name HCR1 Prey description Dual function protein involved in translation initiation as a substoic

  1. Yeast Interacting Proteins Database: YNL189W, YGR136W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available as17p, which is a homolog of human Wiskott-Aldrich Syndrome protein involved in actin patch assembly and act...ion Protein containing an N-terminal SH3 domain; binds Las17p, which is a homolog of human Wiskott-Aldrich Syndrome

  2. Yeast Interacting Proteins Database: YGR058W, YGR136W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available main; binds Las17p, which is a homolog of human Wiskott-Aldrich Syndrome protein ...erminal SH3 domain; binds Las17p, which is a homolog of human Wiskott-Aldrich Syndrome protein involved in a

  3. Yeast Interacting Proteins Database: YOR181W, YGR136W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available og of human Wiskott-Aldrich Syndrome protein involved in actin patch assembly and actin polymerization Rows ...in; binds Las17p, which is a homolog of human Wiskott-Aldrich Syndrome protein involved in actin patch assem

  4. Yeast Interacting Proteins Database: YOL123W, YGL122C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available RRM-containing heteronuclear RNA binding protein and hnRNPA/B family member that binds to poly (A) signal sequences...

  5. Yeast Interacting Proteins Database: YPL204W, YHR185C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available n Sporulation protein required for prospore membrane formation at selected spindle poles, ensures functionality...Sporulation protein required for prospore membrane formation at selected spindle poles, ensures functional...ity of all four spindle pole bodies during meiosis II; not required for meiotic rec

  6. Yeast Interacting Proteins Database: YHR180W, YDL100C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available on Dubious open reading frame unlikely to encode a protein, based on available experimental and comparativ...YHR180W - Dubious open reading frame unlikely to encode a protein, based on available experimental and compa...rative sequence data Rows with this bait as bait (1) Rows with this bait as prey (0

  7. Yeast Interacting Proteins Database: YDR271C, YOR128C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available n Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative...YDR271C - Dubious open reading frame unlikely to encode a protein, based on available experimental and compa...rative sequence data; partially overlaps the verified ORF CCC2/YDR270W Rows with th

  8. Yeast Interacting Proteins Database: YER081W, YPR136C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPR136C - Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative...e name - Prey description Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative

  9. Yeast Interacting Proteins Database: YJR091C, YOR317W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...NAs encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes

  10. Yeast Interacting Proteins Database: YJR091C, YLR059C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes... mRNAs encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes

  11. Yeast Interacting Proteins Database: YHR121W, YGR178C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available domain; GFP-fusion protein is induced by the DNA-damaging agent MMS Rows with this bait as bait (1) Rows with...domain; GFP-fusion protein is induced by the DNA-damaging agent MMS Rows with this bait as bait Rows with

  12. Yeast Interacting Proteins Database: YGR178C, YHR121W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available domain; GFP-fusion protein is induced by the DNA-damaging agent MMS Rows with this prey as prey (1) Rows with...domain; GFP-fusion protein is induced by the DNA-damaging agent MMS Rows with this prey as prey Rows with

  13. Yeast Interacting Proteins Database: YHR129C, YMR294W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YHR129C ARP1 Actin-related protein of the dynactin complex; required for spindle orientation...ein of the dynactin complex; required for spindle orientation and nuclear migrati...PD) 1 Alternative path with 2 intervening proteins (YPD) 2 IST hit 21 IST hit in the opposite bait/prey orientation 7 ...

  14. Yeast Interacting Proteins Database: YJR008W, YHR129C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available the dynactin complex; required for spindle orientation and nuclear migration; put...dynactin complex; required for spindle orientation and nuclear migration; putative ortholog of mammalian cen...ervening protein (YPD) 0 Alternative path with 2 intervening proteins (YPD) 0 IST hit 3 IST hit in the opposite bait/prey orientation - ...

  15. Yeast Interacting Proteins Database: YKR100C, YDL100C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YKR100C SKG1 Transmembrane protein with a role in cell wall polymer composition; lo...position; localizes on the inner surface of the plasma membrane at the bud and in t...RF YKR100C Bait gene name SKG1 Bait description Transmembrane protein with a role in cell wall polymer com

  16. Yeast Interacting Proteins Database: YOR302W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available rol of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt...tein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt1

  17. Yeast Interacting Proteins Database: YOR180C, YGL153W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available central component of the peroxisomal protein import machinery; interacts with both PTS1 (Pex5p) and PTS2...

  18. Yeast Interacting Proteins Database: YCR036W, YGL153W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available central component of the peroxisomal protein import machinery; interacts with both PTS1 (Pex5p) and PTS2...

  19. Yeast Interacting Proteins Database: YDR256C, YGL153W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available central component of the peroxisomal protein import machinery; interacts with both PTS1 (Pex5p) and PTS2...

  20. Yeast Interacting Proteins Database: YDL239C, YKL103C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available that belongs to the peptidase family M18; often used as a marker protein in studies of autophagy and cytosol...ily M18; often used as a marker protein in studies of autophagy and cytosol to vacuole targeting (CVT) pathw

  1. Yeast Interacting Proteins Database: YDR311W, YKL103C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ngs to the peptidase family M18; often used as a marker protein in studies of aut...ase that belongs to the peptidase family M18; often used as a marker protein in studies of autophagy and cyt

  2. Yeast Interacting Proteins Database: YNL189W, YKL103C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available that belongs to the peptidase family M18; often used as a marker protein in studies of autophagy and cytoso...amily M18; often used as a marker protein in studies of autophagy and cytosol to vacuole targeting (CVT) pat

  3. Yeast Interacting Proteins Database: YKL103C, YOL082W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available he peptidase family M18; often used as a marker protein in studies of autophagy and cytosol to vacuole targe... as a marker protein in studies of autophagy and cytosol to vacuole targeting (CVT) pathway Rows with this b...on Vacuolar aminopeptidase yscI; zinc metalloproteinase that belongs to the peptidase family M18; often used

  4. Yeast Interacting Proteins Database: YMR280C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available olved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensor... glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, an

  5. Yeast Interacting Proteins Database: YGR173W, YDR152W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available plasmic RWD domain-containing protein of unknown function; interacts with Rbg1p and Gcn1p; associates with translating...slating ribosomes; putative intrinsically unstructured p...ion Highly-acidic cytoplasmic RWD domain-containing protein of unknown function; interacts with Rbg1p and Gcn1p; associates with tran

  6. Yeast Interacting Proteins Database: YDL239C, YBR072W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...L239C Bait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thought

  7. Yeast Interacting Proteins Database: YDL239C, YPL124W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDL239C ADY3 Protein required for spore wall formation, thought to mediate assembly...39C Bait ORF YDL239C Bait gene name ADY3 Bait description Protein required for spore wall formation, thoug...ht to mediate assembly of a Don1p-containing structure at the leading edge of the p

  8. Yeast Interacting Proteins Database: YHR114W, YDR422C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available substrate specificity; vacuolar protein containing KIS (Kinase-Interacting Sequence) and ASC (Association w...strate specificity; vacuolar protein containing KIS (Kinase-Interacting Sequence) and ASC (Association with ...e 4 CuraGen (0 or 1) 0 S. Fields (0 or 1) 0 Association (0 or 1,YPD) 0 Complex (0

  9. Yeast Interacting Proteins Database: YKL186C, YPL138C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available essential nuclear protein; Mex67p and Mtr2p form a mRNA export complex which binds to RNA Rows with this bait...

  10. Yeast Interacting Proteins Database: YBL056W, YDR071C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YBL056W PTC3 Type 2C protein phosphatase; dephosphorylates Hog1p (see also Ptc2p) to limit...it description Type 2C protein phosphatase; dephosphorylates Hog1p (see also Ptc2p) to limit maximal kinase

  11. Yeast Interacting Proteins Database: YER071C, YLR200W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available bait as prey (0) YLR200W YKE2 Subunit of the heterohexameric Gim/prefoldin protein complex involved in the...name YKE2 Prey description Subunit of the heterohexameric Gim/prefoldin protein complex involved in the

  12. Yeast Interacting Proteins Database: YJR091C, YHR026W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...s with mRNAs encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes

  13. Yeast Interacting Proteins Database: YHR156C, YJR010W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YHR156C LIN1 Non-essential component of U5 snRNP; nuclear protein; physically inter...6C Bait gene name LIN1 Bait description Non-essential component of U5 snRNP; nuclear protein; physically

  14. Yeast Interacting Proteins Database: YDR026C, YDL030W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available -purification experiments; Myb-like DNA-binding protein that may bind to the Ter region of rDNA; interacts physically...n experiments; Myb-like DNA-binding protein that may bind to the Ter region of rDNA; interacts physically wi

  15. Yeast Interacting Proteins Database: YMR047C, YNL078W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available L078W NIS1 Protein localized in the bud neck at G2/M phase; physically interacts ...ene name NIS1 Prey description Protein localized in the bud neck at G2/M phase; physically interacts with se

  16. Yeast Interacting Proteins Database: YPR040W, YNR032W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YPR040W TIP41 Protein that interacts physically and genetically with Tap42p, which ... 0 0 0 0 0 - - - - - 0 0 3 - Show YPR040W Bait ORF YPR040W Bait gene name TIP41 Bait description Protein that interacts physically

  17. Yeast Interacting Proteins Database: YBR108W, YGR136W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YBR108W AIM3 Protein interacting with Rvs167p; null mutant is viable and displays e...w YBR108W Bait ORF YBR108W Bait gene name AIM3 Bait description Protein interacting with Rvs167p; null mutant is viable and display

  18. Yeast Interacting Proteins Database: YER127W, YDR299W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YER127W LCP5 Essential protein involved in maturation of 18S rRNA; depletion leads ...7W Bait gene name LCP5 Bait description Essential protein involved in maturation ...of 18S rRNA; depletion leads to inhibited pre-rRNA processing and reduced polysome levels; localizes primari

  19. Computer systems and methods for the query and visualization of multidimensional database

    Science.gov (United States)

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2010-05-11

    A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.
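The last step of the abstract above, associating a subset of the retrieved tuples with each pane, can be illustrated with a minimal sketch. This is not the patented method: the function name `build_visual_table`, the field names, and the sample data are all hypothetical, and the "specification" is reduced to two field names that define the rows and columns of the visual table.

```python
from collections import defaultdict


def build_visual_table(tuples, row_field, col_field):
    """Group query-result tuples into panes keyed by (row value, column value).

    Hypothetical helper: row_field and col_field play the role of a very
    simple specification naming two fields of the database schema.
    """
    panes = defaultdict(list)
    for record in tuples:
        panes[(record[row_field], record[col_field])].append(record)
    return dict(panes)


# Hypothetical query result; each dict stands in for one retrieved tuple.
rows = [
    {"year": 2009, "region": "East", "sales": 10},
    {"year": 2009, "region": "West", "sales": 7},
    {"year": 2010, "region": "East", "sales": 12},
    {"year": 2010, "region": "East", "sales": 3},
]

# Each key identifies one pane; each value is the subset of tuples shown in it.
table = build_visual_table(rows, row_field="year", col_field="region")
```

Keying panes by the pair of field values keeps the sketch short; a real system would presumably derive the pane layout from the hierarchical structure of the database rather than from two flat fields.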

  20. Yeast Interacting Proteins Database: YMR204C, YJL185C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YMR204C INP1 Peripheral membrane protein of peroxisomes involved in peroxisomal inheritance... of peroxisomes involved in peroxisomal inheritance Rows with this bait as bait Rows with this bait as bait

  1. Yeast Interacting Proteins Database: YBR239C, YPL133C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available cytoplasm and nucleus; null mutation affects periodicity of transcriptional and metabolic oscillation; play...ion; GFP-fusion protein localizes to the cytoplasm and nucleus; null mutation affects

  2. Yeast Interacting Proteins Database: YEL005C, YGL079W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available endosome; identified as a transcriptional activator in a high-throughput yeast one-hybrid assay Rows with th...protein localizes to the endosome; identified as a transcriptional activator in a high-throughput yeast one-

  3. Yeast Interacting Proteins Database: YJR091C, YKL113C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...olved in localizing the Arp2/3 complex to mitochondria; overexpression causes increased sensitivity to benom

  4. Yeast Interacting Proteins Database: YDR084C, YGL198W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available with Rab GTPases, localized to late Golgi vesicles; computational analysis of lar...gene name YIP4 Prey description Protein that interacts with Rab GTPases, localized to late Golgi vesicles; computational

  5. Yeast Interacting Proteins Database: YBR135W, YBR252W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tes proteolysis of M-phase targets through interactions with the proteasome; role in transcriptional regulat...yclin-dependent protein kinase regulatory subunit and adaptor; modulates proteolysis of M-phase targets through interactions

  6. Yeast Interacting Proteins Database: YNL216W, YLR453C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available nscription, depending on binding site context; also binds telomere sequences and plays a role in telomeric p...NA-binding protein involved in either activation or repression of transcription, depending on binding site context

  7. Yeast Interacting Proteins Database: YLR295C, YDR455C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available on available experimental and comparative sequence data; partially overlaps the v...ous open reading frame unlikely to encode a protein, based on available experimental and comparative

  8. Yeast Interacting Proteins Database: YLR295C, YOL050C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available on available experimental and comparative sequence data; overlaps verified gene G...en reading frame unlikely to encode a protein, based on available experimental and comparative sequence data

  9. Yeast Interacting Proteins Database: YHR114W, YJL086C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available available experimental and comparative sequence data; partially overlaps the verified genes YJL085W/EXO70 a... reading frame unlikely to encode a protein, based on available experimental and comparative sequence data;

  10. Yeast Interacting Proteins Database: YJR091C, YDR008C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ailable experimental and comparative sequence data Rows with this prey as prey (1) Rows with this prey as ba... - Prey description Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative

  11. Yeast Interacting Proteins Database: YJR091C, YOR265W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...ns; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes increased sensitivity t

  12. Yeast Interacting Proteins Database: YPL203W, YIL033C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes, includi...ependent protein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes

  13. Yeast Interacting Proteins Database: YKL166C, YIL033C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes, includi...dependent protein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes

  14. Yeast Interacting Proteins Database: YJL164C, YIL033C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes, includi...dependent protein kinase (PKA), a component of a signaling pathway that controls a variety of cellular processes

  15. Yeast Interacting Proteins Database: YJR091C, YMR067C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...olved in localizing the Arp2/3 complex to mitochondria; overexpression causes increased sensitivity to benom

  16. Yeast Interacting Proteins Database: YJR091C, YKL076C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...Arp2/3 complex to mitochondria; overexpression causes increased sensitivity to benomyl Rows with this bait a

  17. Yeast Interacting Proteins Database: YJR091C, YDL147W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available encoding membrane-associated proteins; involved in localizing the Arp2/3 complex to mitochondria; overexpression causes...mplex to mitochondria; overexpression causes increased sensitivity to benomyl Rows with this bait as bait Ro

  18. Yeast Interacting Proteins Database: YHR107C, YJR076C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available required for cytokinesis; septins recruit proteins to the neck and can act as a barrier to diffusion at the membrane, and they comprise the 10nm filaments seen with EM Rows with this prey as prey (1) Rows with this p

  19. Yeast Interacting Proteins Database: YKL002W, YDL165W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthes...ins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthesized vacuolar enzymes t

  20. Yeast Interacting Proteins Database: YKL002W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available integral membrane proteins into lumenal vesicles of multivesicular bodies, and for delivery of newly synthes... into lumenal vesicles of multivesicular bodies, and for delivery of newly synthesized vacuolar enzymes to t

  1. Yeast Interacting Proteins Database: YCL032W, YLR423C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YCL032W STE50 Protein involved in mating response, invasive/filamentous growth, and...lved in mating response, invasive/filamentous growth, and osmotolerance, acts as an adaptor that links G pro

  2. Yeast Interacting Proteins Database: YMR154C, YLR025W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available the multivesicular body (MVB) pathway; recruited from the cytoplasm to endosomal membranes Rows with this pr...mbrane proteins into the multivesicular body (MVB) pathway; recruited from the cytoplasm to endosomal membranes

  3. Yeast Interacting Proteins Database: YMR294W, YHR129C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available complex; required for spindle orientation and nuclear migration; putative ortholo...PD) 1 Alternative path with 2 intervening proteins (YPD) 2 IST hit 7 IST hit in the opposite bait/prey orientation 21 ... ...ne name ARP1 Prey description Actin-related protein of the dynactin complex; required for spindle orientat...ion and nuclear migration; putative ortholog of mammalian centractin Rows with this

  4. Yeast Interacting Proteins Database: YMR047C, YER107C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available repetitive GLFG motif that interacts with mRNA export factor Mex67p and with karyopherin Kap95p; homologous...nuclear pore complex required for polyadenylated RNA export but not for protein import, homologous to S. pombe...

  5. Yeast Interacting Proteins Database: YCL032W, YLR362W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YCL032W STE50 Protein involved in mating response, invasive/filamentous growth, and...STE11 Signal transducing MEK kinase involved in pheromone response and pseudohyphal/invasive...2W Bait gene name STE50 Bait description Protein involved in mating response, invasive/filamentous growth, a... STE11 Prey description Signal transducing MEK kinase involved in pheromone response and pseudohyphal/invasive

  6. Yeast Interacting Proteins Database: YLR362W, YCL032W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YLR362W STE11 Signal transducing MEK kinase involved in pheromone response and pseudohyphal/invasive...ait as prey (1) YCL032W STE50 Protein involved in mating response, invasive/filam...2W Bait gene name STE11 Bait description Signal transducing MEK kinase involved in pheromone response and pseudohyphal/invasive...F YCL032W Prey gene name STE50 Prey description Protein involved in mating response, invasive/filamentous gr

  7. Yeast Interacting Proteins Database: YDL226C, YOR327C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available P-ribosylation factor GTPase activating protein (ARF GAP), involved in ER-Golgi transport; share...Rows with this prey as bait (0) Literature on bait (YPD) 20 Literature on prey (YPD) 27 Literature shared by...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  8. Yeast Interacting Proteins Database: YDL226C, YGR172C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ase activating protein (ARF GAP), involved in ER-Golgi transport; shares function...t (YPD) 20 Literature on prey (YPD) 6 Literature shared by bait and prey 3 Literature sharing score 4 CuraGe...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  9. Yeast Interacting Proteins Database: YDL226C, YER118C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available se activating protein (ARF GAP), involved in ER-Golgi transport; shares functiona...YPD) 20 Literature on prey (YPD) 24 Literature shared by bait and prey 2 Literature sharing score 2 CuraGen ...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  10. NetworkAnalyst--integrative approaches for protein-protein interaction network analysis and visual exploration.

    Science.gov (United States)

    Xia, Jianguo; Benner, Maia J; Hancock, Robert E W

    2014-07-01

    Biological network analysis is a powerful approach to gain systems-level understanding of patterns of gene expression in different cell types, disease states and other biological/experimental conditions. Three consecutive steps are required: identification of genes or proteins of interest, network construction, and network analysis and visualization. To date, researchers have had to learn to use a combination of several tools to accomplish this task. In addition, interactive visualization of large networks has been primarily restricted to locally installed programs. To address these challenges, we have developed NetworkAnalyst, taking advantage of state-of-the-art web technologies, to enable high-performance network analysis with a rich user experience. NetworkAnalyst integrates all three steps and presents the results via a powerful online network visualization framework. Users can upload gene or protein lists, or single or multiple gene expression datasets, to perform comprehensive gene annotation and differential expression analysis. Significant genes are mapped to our manually curated protein-protein interaction database to construct relevant networks. The results are presented through standard web browsers for network analysis and interactive exploration. NetworkAnalyst supports common functions for network topology and module analyses. Users can easily search, zoom and highlight nodes or modules, as well as perform functional enrichment analysis on these selections. The networks can be customized with different layouts, colors or node sizes, and exported as PNG, PDF or GraphML files. Comprehensive FAQs, tutorials and context-based tips and instructions are provided. NetworkAnalyst currently supports protein-protein interaction network analysis for human and mouse and is freely available at http://www.networkanalyst.ca. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. CREDO: a protein-ligand interaction database for drug discovery.

    Science.gov (United States)

    Schreyer, Adrian; Blundell, Tom

    2009-02-01

    Harnessing data from the growing number of protein-ligand complexes in the Protein Data Bank is an important task in drug discovery. In order to benefit from the abundance of three-dimensional structures, structural data must be integrated with sequence as well as chemical data and the protein-small molecule interactions characterized structurally at the inter-atomic level. In this study, we present CREDO, a new publicly available database of protein-ligand interactions, which represents contacts as structural interaction fingerprints, implements novel features and is completely scriptable through its application programming interface. Features of CREDO include implementation of molecular shape descriptors with ultrafast shape recognition, fragmentation of ligands in the Protein Data Bank, sequence-to-structure mapping and the identification of approved drugs. Selected analyses of these key features are presented to highlight a range of potential applications of CREDO. The CREDO dataset has been released into the public domain together with the application programming interface under a Creative Commons license at http://www-cryst.bioc.cam.ac.uk/credo. We believe that the free availability and numerous features of CREDO database will be useful not only for commercial but also for academia-driven drug discovery programmes.
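
    Ultrafast shape recognition, mentioned above, summarises a molecule's shape as twelve numbers: three moments of the atomic distance distribution measured from each of four reference points. The sketch below is an illustrative assumption, not CREDO's implementation or API. The function names are hypothetical, and the choice of moments (mean, variance, signed cube root of the third central moment) is one convention among several in use.

```python
import math


def _moments(distances):
    """Return mean, variance, and signed cube root of the third central moment."""
    n = len(distances)
    mean = sum(distances) / n
    variance = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, variance, skew]


def usr_descriptor(coords):
    """Compute a 12-value shape descriptor from a list of (x, y, z) atom positions."""
    n = len(coords)
    # Reference point 1: the molecular centroid.
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    # Reference points 2 and 3: the atoms closest to and farthest from the centroid.
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    d_cst = [math.dist(c, cst) for c in coords]
    d_fct = [math.dist(c, fct) for c in coords]
    # Reference point 4: the atom farthest from reference point 3.
    ftf = coords[d_fct.index(max(d_fct))]
    d_ftf = [math.dist(c, ftf) for c in coords]
    return _moments(d_ctd) + _moments(d_cst) + _moments(d_fct) + _moments(d_ftf)


def usr_similarity(desc_a, desc_b):
    """Similarity in (0, 1]: inverse of one plus the mean absolute difference."""
    diff = sum(abs(x - y) for x, y in zip(desc_a, desc_b)) / len(desc_a)
    return 1.0 / (1.0 + diff)
```

    Because the descriptor is built only from interatomic distances, it does not change when the molecule is translated or rotated, so no structural superposition is needed before comparing two shapes.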

  12. Yeast Interacting Proteins Database: YDL226C, YPR113W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tion ADP-ribosylation factor GTPase activating protein (ARF GAP), involved in ER-Golgi transport; share...s prey as bait Rows with this prey as bait (0) Literature on bait (YPD) 20 Literature on prey (YPD) 24 Literature share...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  13. Yeast Interacting Proteins Database: YDL226C, YOL129W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available GCS1 Bait description ADP-ribosylation factor GTPase activating protein (ARF GAP), involved in ER-Golgi transport; share...0 Literature on prey (YPD) 5 Literature shared by bait and prey 3 Literature sharing score 4 CuraGen (0 or 1...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  14. MannDB: A microbial annotation database for protein characterization

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, C; Lam, M; Smith, J; Zemla, A; Dyer, M; Kuczmarski, T; Vitalis, E; Slezak, T

    2006-05-19

    MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high

  15. Visual analysis of entity relationships in the Global Terrorism Database

    Science.gov (United States)

    Godwin, Alex; Chang, Remco; Kosara, Robert; Ribarsky, William

    2008-04-01

    With the increase of terrorist activity around the world, it has become more important than ever to analyze and understand these activities over time. Although the data on terrorist activities are detailed and relevant, their complexity has made understanding and analysis difficult. We present a visual analytical approach to effectively identify related entities such as terrorist groups, events, locations, etc. based on a 2D layout. Our methods are based on sequence comparison from bioinformatics, modified to incorporate the element of time. By allowing the user the freedom to link entities by their activities over time, we provide a new framework for comparison of event sequences. Our scoring mechanism is robust and flexible, letting the user define the extent to which time is considered in aligning entities. With high interactivity, the user can efficiently navigate through tens of thousands of records recorded in over a hundred dimensions of data by choosing combinations of categories to examine. Exploration of the terrorist activities in our system reveals relationships between entities that are not easily detectable using traditional methods.
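
    The idea of sequence comparison modified to incorporate time can be sketched as a global alignment whose match score shrinks as the two events move apart in time. This is an assumption-laden illustration, not the authors' scoring scheme: the Needleman-Wunsch-style recurrence, the parameter names and default values, and the (label, year) event encoding are all hypothetical.

```python
def time_aware_alignment_score(seq_a, seq_b, match=2.0, mismatch=-1.0,
                               gap=-1.0, time_weight=0.1):
    """Global alignment score for two event sequences.

    Each event is a (label, time) pair. Events with the same label score
    `match` minus a penalty proportional to their time difference;
    `time_weight` controls how strongly time is taken into account
    (0 ignores time entirely).
    """
    def pair_score(event_a, event_b):
        label_a, time_a = event_a
        label_b, time_b = event_b
        if label_a != label_b:
            return mismatch
        return match - time_weight * abs(time_a - time_b)

    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # dp[i][j] holds the best score for aligning the first i events of seq_a
    # with the first j events of seq_b.
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(seq_a[i - 1], seq_b[j - 1]),
                dp[i - 1][j] + gap,   # event in seq_a left unaligned
                dp[i][j - 1] + gap,   # event in seq_b left unaligned
            )
    return dp[-1][-1]
```

    Setting `time_weight` to zero reduces the score to an ordinary label-only alignment, which is one way to expose the user-defined extent to which time is considered.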

  16. Yeast Interacting Proteins Database: YGR223C, YOR089C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available micronucleophagy; predicted to fold as a seven-bladed beta-propeller; displays punctate cytoplasmic localiz...tion Phosphatidylinositol 3,5-bisphosphate-binding protein, plays a role in micronucleophagy; predicted to fold as a seven-blade

  17. Yeast Interacting Proteins Database: YJR091C, YFR036W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available 0) YFR036W CDC26 Subunit of the Anaphase-Promoting Complex/Cyclosome (APC/C), whi...rey description Subunit of the Anaphase-Promoting Complex/Cyclosome (APC/C), which is a ubiquitin-protein li

  18. Yeast Interacting Proteins Database: YHR166C, YLR451W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YHR166C CDC23 Subunit of the Anaphase-Promoting Complex/Cyclosome (APC/C), which is...ait description Subunit of the Anaphase-Promoting Complex/Cyclosome (APC/C), which is a ubiquitin-protein li

  19. Yeast Interacting Proteins Database: YGR113W, YGL079W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available sion protein localizes to the endosome; identified as a transcriptional activator in a high-throughput yeast... a transcriptional activator in a high-throughput yeast one-hybrid assay Rows with this prey as prey Rows wi

  20. Yeast Interacting Proteins Database: YNL189W, YGL221C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available urified mitochondria in high-throughput studies Rows with this prey as prey (2) Rows with this prey as bait ...a factor (rpoD gene product); the authentic, non-tagged protein is detected in highly purified mitochondria in high-throughpu

  1. Yeast Interacting Proteins Database: YJR102C, YLR417W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available omain which is involved in interactions with ESCRT-I and ubiquitin-dependent sort...T-II complex; contains the GLUE (GRAM Like Ubiquitin binding in EAP45) domain which is involved in interac...tions with ESCRT-I and ubiquitin-dependent sorting of proteins into the endosome Ro

  2. Yeast Interacting Proteins Database: YPL002C, YLR417W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available RAM Like Ubiquitin binding in EAP45) domain which is involved in interactions with ESCRT-I and ubiquitin-dep...n which is involved in interactions with ESCRT-I and ubiquitin-dependent sorting of proteins into the endoso... ESCRT-II complex; contains the GLUE (GRAM Like Ubiquitin binding in EAP45) domai

  3. Yeast Interacting Proteins Database: YHR009C, YOR359W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ding protein containing a SAM domain; shows genetic interactions with Vti1p, which is a v-SNARE involved in ...aining a SAM domain; shows genetic interactions with Vti1p, which is a v-SNARE involved in cis-Golgi membran

  4. Yeast Interacting Proteins Database: YBR135W, YBR160W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available tes proteolysis of M-phase targets through interactions with the proteasome; role in transcriptional regulat... description Cyclin-dependent protein kinase regulatory subunit and adaptor; modulates proteolysis of M-phase targets through interac...tions with the proteasome; role in transcriptional regulation, recruiting proteasom

  5. Yeast Interacting Proteins Database: YBR254C, YKR068C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available oepiphyseal dysplasia tarda (SEDL) disorder Rows with this bait as bait (2) Rows with this bait as prey (0) YKR068C BET3 Hydrophilic...Rows with this bait as prey (0) Prey ORF YKR068C Prey gene name BET3 Prey description Hydrophilic protein th

  6. Yeast Interacting Proteins Database: YLR026C, YDR189W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ait as prey (0) YDR189W SLY1 Hydrophilic protein involved in vesicle trafficking between the ER and Golgi; S...it (1) Rows with this bait as prey Rows with this bait as prey (0) Prey ORF YDR189W Prey gene name SLY1 Prey description Hydrophilic

  7. Yeast Interacting Proteins Database: YMR025W, YGR120C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YMR025W CSI1 Subunit of the Cop9 signalosome, which is required for deneddylation, or removal...signalosome, which is required for deneddylation, or removal of the ubiquitin-like protein Rub1p from Cdc53p

  8. Yeast Interacting Proteins Database: YHR114W, YLR112W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available available experimental and comparative sequence data Rows with this prey as prey (1) Rows with this prey as...e name - Prey description Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative

  9. Yeast Interacting Proteins Database: YDL089W, YLR324W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available rDNA repeat stability; null mutant causes increase in unequal sister-chromatid exchange; GFP-fusion protein...Lrs4p; required for rDNA repeat stability; null mutant causes increase in unequal sister-chromatid exchange;

  10. Yeast Interacting Proteins Database: YJR091C, YDR389W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available to the actin cytoskeleton, null mutations suppress tor2 mutations and temperature sensitive mutations in act...ion GTPase activating protein (GAP) for Rho1p, involved in signaling to the actin cytoskeleton, null mutations suppress tor2 mutation...s and temperature sensitive mutations in actin; potential Cdc28p substrate Rows wit

  11. Yeast Interacting Proteins Database: YER059W, YDL224C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ment for passage through Start and commitment to cell division Rows with this prey as prey (1) Rows with thi...ative RNA binding protein and partially redundant Whi3p homolog that regulates the cell size requirement for passage through Start

  12. Yeast Interacting Proteins Database: YPR054W, YFL010C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available quired for production of the outer spore wall layers; negatively regulates activity of the glucan synthase s...escription Middle sporulation-specific mitogen-activated protein kinase (MAPK) required for production of the outer spore wall layers

  13. Yeast Interacting Proteins Database: YGR196C, YBR260C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YGR196C FYV8 Protein of unknown function, required for survival upon exposure to K1...n, required for survival upon exposure to K1 killer toxin Rows with this bait as bait Rows with this bait as

  14. Yeast Interacting Proteins Database: YBR108W, YDR388W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available Rvs161p to regulate actin cytoskeleton, endocytosis, and viability following star...0) YDR388W RVS167 Actin-associated protein, interacts with Rvs161p to regulate actin cytoskeleton, endocytosis, and viability followi...ng starvation or osmotic stress; homolog of mammalian am

  15. Of red planets and indigo computers: Mars database visualization as an example of platform downsizing

    Science.gov (United States)

    Kaiser, M. K.; Montegut, M. J.

    1997-01-01

    The last decade has witnessed tremendous advancements in the computer hardware and software used to perform scientific visualization. In this paper, we consider how the visualization of a particular data set, the digital terrain model derived from the Viking orbiter imagery, has been realized in four distinct projects over this period. These examples serve to demonstrate how the vast improvements in computational performance both decrease the cost of such visualization efforts and permit an increasing level of interactivity. We then consider how even today's graphical systems require the visualization designer to make intelligent choices and tradeoffs in database rendering. Finally, we discuss how insights gleaned from an understanding of human visual perception can guide these design decisions, and suggest new options for visualization hardware and software.

  16. Yeast Interacting Proteins Database: YDL226C, YKR088C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available lved in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait Rows with th...on prey (YPD) 6 Literature shared by bait and prey 3 Literature sharing score 4 C...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  17. Yeast Interacting Proteins Database: YDL226C, YPR183W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available (ARF GAP), involved in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as b...re on bait (YPD) 20 Literature on prey (YPD) 28 Literature shared by bait and pre...YDL226C GCS1 ADP-ribosylation factor GTPase activating protein (ARF GAP), involved ...in ER-Golgi transport; shares functional similarity with Glo3p Rows with this bait as bait (9) Rows with thi

  18. PDTD: a web-accessible protein database for drug target identification

    Directory of Open Access Journals (Sweden)

    Gao Zhenting

    2008-02-01

    Full Text Available Abstract Background Target identification is important for modern drug discovery. With the advances in the development of molecular docking, potential binding proteins may be discovered by docking a small molecule to a repository of proteins with three-dimensional (3D) structures. To complete this task, a reverse docking program and a drug target database with 3D structures are necessary. To this end, we have developed a web server tool, TarFisDock (Target Fishing Docking; http://www.dddc.ac.cn/tarfisdock), which has been used widely by others. Recently, we have constructed a protein target database, Potential Drug Target Database (PDTD), and have integrated PDTD with TarFisDock. This combination aims to assist target identification and validation. Description PDTD is a web-accessible protein database for in silico target identification. It currently contains >1100 protein entries with 3D structures presented in the Protein Data Bank. The data are extracted from the literature and several online databases such as TTD, DrugBank and Thomson Pharma. The database covers diverse information on >830 known or potential drug targets, including protein and active-site structures in both PDB and mol2 formats, related diseases, biological functions as well as associated regulating (signaling) pathways. Each target is categorized by both nosology and biochemical function. PDTD supports keyword searches by PDB ID, target name, and disease name. Data sets generated by PDTD can be viewed with molecular visualization plug-ins and can also be downloaded freely. Remarkably, PDTD is specially designed for target identification. In conjunction with TarFisDock, PDTD can be used to identify binding proteins for small molecules. The results can be downloaded as a mol2 file with the binding pose of the probe compound and a list of potential binding targets according to their ranking scores. Conclusion PDTD serves as a comprehensive and

  19. Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available E., Lingner, C., et al. (2000) Nucleic Acids Res. 28, 73-76.) are used for literature collection. Number of...ng co-occurrence of Prey and Bait in the literature, calculated by the calculation formula. Calculation form...don) 340, 245-246.) Data analysis method As the indicator of reliability of the interactions obtained by the experiment, the literatu...re information described about the yeast proteins and th

  20. Yeast Interacting Proteins Database: YOL069W, YIL144W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available complex (Ndc80p-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...vity, kinetochore assembly and clustering Rows with this prey as prey (2) Rows with this prey as bait (0) 12...-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering...d coiled-coil protein involved in chromosome segregation, spindle checkpoint activity, kinetochore assembly and clustering

  1. Ebolavirus Database: Gene and Protein Information Resource for Ebolaviruses

    Directory of Open Access Journals (Sweden)

    Rayapadi G. Swetha

    2016-01-01

    Full Text Available Ebola Virus Disease (EVD) is a life-threatening haemorrhagic fever in humans. Even though there are many reports on EVD, the protein precursor functions and virulent factors of ebolaviruses remain poorly understood. Comparative analyses of Ebolavirus genomes will help in the identification of these important features. This prompted us to develop the Ebolavirus Database (EDB), and we have provided links to various tools that will aid researchers to locate important regions in both the genomes and proteomes of Ebolavirus. The genomic analyses of ebolaviruses will provide important clues for locating the essential and core functional genes. The aim of EDB is to act as an integrated resource for ebolaviruses and we strongly believe that the database will be a useful tool for clinicians, microbiologists, health care workers, and bioscience researchers.

  2. HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

    Science.gov (United States)

    Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

    2015-04-01

    Knowledge of protein-protein interactions (PPI) and their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable to understand the mechanism of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. ArachnoServer: a database of protein toxins from spiders

    Directory of Open Access Journals (Sweden)

    Kaas Quentin

    2009-08-01

    Full Text Available Abstract Background Venomous animals incapacitate their prey using complex venoms that can contain hundreds of unique protein toxins. The realisation that many of these toxins may have pharmaceutical and insecticidal potential due to their remarkable potency and selectivity against target receptors has led to an explosion in the number of new toxins being discovered and characterised. From an evolutionary perspective, spiders are the most successful venomous animals and they maintain by far the largest pool of toxic peptides. However, at present, there are no databases dedicated to spider toxins and hence it is difficult to realise their full potential as drugs, insecticides, and pharmacological probes. Description We have developed ArachnoServer, a manually curated database that provides detailed information about proteinaceous toxins from spiders. Key features of ArachnoServer include a new molecular target ontology designed especially for venom toxins, the most up-to-date taxonomic information available, and a powerful advanced search interface. Toxin information can be browsed through dynamic trees, and each toxin has a dedicated page summarising all available information about its sequence, structure, and biological activity. ArachnoServer currently manages 567 protein sequences, 334 nucleic acid sequences, and 51 protein structures. Conclusion ArachnoServer provides a single source of high-quality information about proteinaceous spider toxins that will be an invaluable resource for pharmacologists, neuroscientists, toxinologists, medicinal chemists, ion channel scientists, clinicians, and structural biologists. ArachnoServer is available online at http://www.arachnoserver.org.

  4. Computer systems and methods for the query and visualization of multidimensional databases

    Science.gov (United States)

    Stolte, Chris [Palo Alto, CA]; Tang, Diane L [Palo Alto, CA]; Hanrahan, Patrick [Portola Valley, CA]

    2011-02-01

    In response to a user request, a computer generates a graphical user interface on a computer display. A schema information region of the graphical user interface includes multiple operand names, each operand name associated with one or more fields of a multi-dimensional database. A data visualization region of the graphical user interface includes multiple shelves. Upon detecting a user selection of the operand names and a user request to associate each user-selected operand name with a respective shelf in the data visualization region, the computer generates a visual table in the data visualization region in accordance with the associations between the operand names and the corresponding shelves. The visual table includes a plurality of panes, each pane having at least one axis defined based on data for the fields associated with a respective operand name.

  5. Computer systems and methods for the query and visualization of multidimensional databases

    Science.gov (United States)

    Stolte, Chris [Palo Alto, CA]; Tang, Diane L [Palo Alto, CA]; Hanrahan, Patrick [Portola Valley, CA]

    2012-03-20

    In response to a user request, a computer generates a graphical user interface on a computer display. A schema information region of the graphical user interface includes multiple operand names, each operand name associated with one or more fields of a multi-dimensional database. A data visualization region of the graphical user interface includes multiple shelves. Upon detecting a user selection of the operand names and a user request to associate each user-selected operand name with a respective shelf in the data visualization region, the computer generates a visual table in the data visualization region in accordance with the associations between the operand names and the corresponding shelves. The visual table includes a plurality of panes, each pane having at least one axis defined based on data for the fields associated with a respective operand name.

  6. Computer systems and methods for the query and visualization of multidimensional databases

    Science.gov (United States)

    Stolte, Chris; Tang, Diane L; Hanrahan, Patrick

    2015-03-03

    A computer displays a graphical user interface on its display. The graphical user interface includes a schema information region and a data visualization region. The schema information region includes multiple operand names, each operand corresponding to one or more fields of a multi-dimensional database that includes at least one data hierarchy. The data visualization region includes a columns shelf and a rows shelf. The computer detects user actions to associate one or more first operands with the columns shelf and to associate one or more second operands with the rows shelf. The computer generates a visual table in the data visualization region in accordance with the user actions. The visual table includes one or more panes. Each pane has an x-axis defined based on data for the one or more first operands, and each pane has a y-axis defined based on data for the one or more second operands.

  7. Evaluation of Software for Introducing Protein Structure: Visualization and Simulation

    Science.gov (United States)

    White, Brian; Kahriman, Azmin; Luberice, Lois; Idleh, Farhia

    2010-01-01

    Communicating an understanding of the forces and factors that determine a protein's structure is an important goal of many biology and biochemistry courses at a variety of levels. Many educators use computer software that allows visualization of these complex molecules for this purpose. Although visualization is in wide use and has been associated…

  8. Geothopica and the interactive analysis and visualization of the updated Italian National Geothermal Database

    Science.gov (United States)

    Trumpy, Eugenio; Manzella, Adele

    2017-02-01

    The Italian National Geothermal Database (BDNG) is the largest collection of Italian geothermal data and was set up in the 1980s. It has since been updated both in terms of content and management tools: information on deep wells and thermal springs (with temperature > 30 °C) is currently organized and stored in a PostgreSQL relational database management system, which guarantees high performance, data security and easy access through different client applications. The BDNG is the core of the Geothopica web site, whose webGIS tool allows different types of user to access geothermal data, to visualize multiple types of datasets, and to perform integrated analyses. The webGIS tool has recently been improved by two specially designed, programmed and implemented visualization tools to display data on well lithology and underground temperatures. This paper describes the contents of the database and its software and data update, as well as the webGIS tool including the new tools for visualizing lithology and temperature data. The geoinformation organized in the database and accessible through Geothopica is of use not only for geothermal purposes, but also for any kind of georesource and CO2 storage project requiring the organization of, and access to, deep underground data. Geothopica also supports project developers, researchers, and decision makers in the assessment, management and sustainable deployment of georesources.

  9. LEGER: knowledge database and visualization tool for comparative genomics of pathogenic and non-pathogenic Listeria species.

    Science.gov (United States)

    Dieterich, Guido; Kärst, Uwe; Fischer, Elmar; Wehland, Jürgen; Jänsch, Lothar

    2006-01-01

    Listeria species are ubiquitous in the environment and often contaminate foods because they grow under conditions used for food preservation. Listeria monocytogenes, the human and animal pathogen, causes Listeriosis, an infection with a high mortality rate in risk groups such as immune-compromised individuals. Furthermore, L. monocytogenes is a model organism for the study of intracellular bacterial pathogens. The publication of its genome sequence and that of the non-pathogenic species Listeria innocua initiated numerous comparative studies and efforts to sequence all species comprising the genus. The proteome database LEGER (http://leger2.gbf.de/cgi-bin/expLeger.pl) was developed to support functional genome analyses by combining information obtained by applying bioinformatics methods and from public databases to improve the original annotations. LEGER offers three unique key features: (i) it is the first comprehensive information system focusing on the functional assignment of genes and proteins; (ii) integrated visualization tools, KEGG pathway and Genome Viewer, alleviate the functional exploration of complex data; and (iii) LEGER presents results of systematic post-genome studies, thus facilitating analyses combining computational and experimental results. Moreover, LEGER provides an unpublished membrane proteome analysis of L. innocua and in total visualizes experimentally validated information about the subcellular localizations of 789 different listerial proteins.

  10. Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins

    Directory of Open Access Journals (Sweden)

    Bradley Michael E

    2006-02-01

    Full Text Available Abstract Background When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. Results The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include (1) multiple sequence alignments, (2) mapping of alignment sites to crystal structure sites, (3) phylogenetic trees, (4) inferred ancestral sequences at internal tree nodes, and (5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. Conclusion We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural

  11. SUPFAM: A database of sequence superfamilies of protein domains

    Directory of Open Access Journals (Sweden)

    Anand B

    2004-03-01

    Full Text Available Abstract Background SUPFAM database is a compilation of superfamily relationships between protein domain families of either known or unknown 3-D structure. In SUPFAM, sequence families from Pfam and structural families from SCOP are associated, using profile matching, to result in sequence superfamilies of known structure. Subsequently all-against-all family profile matches are made to deduce a list of new potential superfamilies of yet unknown structure. Description The current version of SUPFAM (release 1.4) corresponds to significant enhancements and major developments compared to the earlier and basic version. In the present version we have used RPS-BLAST, which is robust and sensitive, for profile matching. The reliability of connections between protein families is ensured better than before by use of benchmarked criteria involving strict e-value cut-off and a minimal alignment length condition. An e-value based indication of reliability of connections is now presented in the database. Web access to an RPS-BLAST-based tool to associate a query sequence to one of the family profiles in SUPFAM is available with the current release. In terms of the scientific content the present release of SUPFAM is entirely reorganized with the use of 6190 Pfam families and 2317 structural families derived from SCOP. Due to a steep increase in the number of sequence and structural families used in SUPFAM the details of scientific content in the present release are almost entirely complementary to previous basic version. Of the 2286 families, we could relate 245 Pfam families with apparently no structural information to families of known 3-D structures, thus resulting in the identification of new families in the existing superfamilies. Using the profiles of 3904 Pfam families of yet unknown structure, an all-against-all comparison involving sequence-profile match resulted in clustering of 96 Pfam families into 39 new potential superfamilies. Conclusion SUPFAM

  12. SDR: a database of predicted specificity-determining residues in proteins.

    Science.gov (United States)

    Donald, Jason E; Shakhnovich, Eugene I

    2009-01-01

    The specificity-determining residue database (SDR database) presents residue positions where mutations are predicted to have changed protein function in large protein families. Because the database pre-calculates predictions on existing protein sequence alignments, users can quickly find the predictions by selecting the appropriate protein family or searching by protein sequence. Predictions can be used to guide mutagenesis or to gain a better understanding of specificity changes in a protein family. The database is available on the web at http://paradox.harvard.edu/sdr.

  13. DaVIE: Database for the Visualization and Integration of Epigenetic data.

    Directory of Open Access Journals (Sweden)

    Anthony Peter Fejes

    2014-09-01

    Full Text Available One of the challenges in the analysis of large data sets, particularly in a population-based setting, is the ability to perform comparisons across projects. This has to be done in such a way that the integrity of each individual project is maintained, while ensuring that the data are comparable across projects. These issues are beginning to be observed in human DNA methylation studies, as the Illumina 450k platform and next generation sequencing-based assays grow in popularity and decrease in price. This increase in productivity is enabling new insights into epigenetics, but also requires the development of pipelines and software capable of handling the large volumes of data. The specific problems inherent in creating a platform for the storage, comparison, integration and visualization of DNA methylation data include data storage, algorithm efficiency and ability to interpret the results to derive biological meaning from them. Databases provide a ready-made solution to these issues, but as yet no tools exist that leverage these advantages while providing an intuitive user interface for interpreting results in a genomic context. We have addressed this void by integrating a database to store DNA methylation data with a web interface to query and visualize the database and a set of libraries for more complex analysis. The resulting platform is called DaVIE: Database for the Visualization and Integration of Epigenetics data. DaVIE can use data culled from a variety of sources, and the web interface includes the ability to group samples by sub-type, compare multiple projects and visualize genomic features in relation to sites of interest. We have used DaVIE to identify patterns of DNA methylation in specific projects and across different projects, identify outlier samples, and cross-check differentially methylated CpG sites identified in specific projects across large numbers of samples.

  14. Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.

    Science.gov (United States)

    Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

    2017-01-01

    Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on the final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that the choice of database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.

  15. Gene composer: database software for protein construct design, codon engineering, and gene synthesis.

    Science.gov (United States)

    Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

    2009-04-21

    To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease

  16. Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

    Directory of Open Access Journals (Sweden)

    Mixon Mark

    2009-04-01

    Full Text Available Abstract Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. 
    Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene

  17. A web-based data visualization tool for the MIMIC-II database.

    Science.gov (United States)

    Lee, Joon; Ribey, Evan; Wallace, James R

    2016-02-04

    Although MIMIC-II, a public intensive care database, has been recognized as an invaluable resource for many medical researchers worldwide, becoming a proficient MIMIC-II researcher requires knowledge of SQL programming and an understanding of the MIMIC-II database schema. These are challenging requirements especially for health researchers and clinicians who may have limited computer proficiency. In order to overcome this challenge, our objective was to create an interactive, web-based MIMIC-II data visualization tool that first-time MIMIC-II users can easily use to explore the database. The tool offers two main features: Explore and Compare. The Explore feature enables the user to select a patient cohort within MIMIC-II and visualize the distributions of various administrative, demographic, and clinical variables within the selected cohort. The Compare feature enables the user to select two patient cohorts and visually compare them with respect to a variety of variables. The tool is also helpful to experienced MIMIC-II researchers who can use it to substantially accelerate the cumbersome and time-consuming steps of writing SQL queries and manually visualizing extracted data. Any interested researcher can use the MIMIC-II data visualization tool for free to quickly and conveniently conduct a preliminary investigation on MIMIC-II with a few mouse clicks. Researchers can also use the tool to learn the characteristics of the MIMIC-II patients. Since it is still impossible to conduct multivariable regression inside the tool, future work includes adding analytics capabilities. Also, the next version of the tool will aim to utilize MIMIC-III which contains more data.

  18. CancerPPD: a database of anticancer peptides and proteins.

    Science.gov (United States)

    Tyagi, Atul; Tuknait, Abhishek; Anand, Priya; Gupta, Sudheer; Sharma, Minakshi; Mathur, Deepika; Joshi, Anshika; Singh, Sandeep; Gautam, Ankur; Raghava, Gajendra P S

    2015-01-01

    CancerPPD (http://crdd.osdd.net/raghava/cancerppd/) is a repository of experimentally verified anticancer peptides (ACPs) and anticancer proteins. Data were manually collected from published research articles, patents and from other databases. The current release of CancerPPD consists of 3491 ACP and 121 anticancer protein entries. Each entry provides comprehensive information related to a peptide, such as its source of origin, nature of the peptide, anticancer activity, N- and C-terminal modifications, conformation, etc. Additionally, CancerPPD provides information on around 249 types of cancer cell lines and 16 different assays used for testing the ACPs. In addition to natural peptides, CancerPPD contains peptides having non-natural, chemically modified residues and D-amino acids. Besides this primary information, CancerPPD stores predicted tertiary structures as well as peptide sequences in SMILES format. Tertiary structures of peptides were predicted using the state-of-the-art method PEPstr, and secondary structural states were assigned using DSSP. In order to assist users, a number of web-based tools have been integrated; these include keyword search, data browsing, sequence and structural similarity search. We believe that CancerPPD will be very useful in designing peptide-based anticancer therapeutics. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. ChemProt-2.0: visual navigation in a disease chemical biology database

    DEFF Research Database (Denmark)

    Kjærulff, Sonny Kim; Wich, Louis; Kringelum, Jens Vindahl

    2013-01-01

    ChemProt-2.0 (http://www.cbs.dtu.dk/services/ChemProt-2.0) is a publicly available compilation of multiple chemical-protein annotation resources integrated with diseases and clinical outcomes information. The database has been updated to > 1.15 million compounds with 5.32 million bioactivity...

  20. Using the clustered circular layout as an informative method for visualizing protein-protein interaction networks.

    Science.gov (United States)

    Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee

    2010-07-01

    The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, therefore, demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.
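
    The reproducibility point in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm; it only assumes that each protein carries a cluster label (for example a Gene Ontology term) and places the proteins on a circle in sorted order, so that members of a cluster sit next to each other and repeated runs give identical coordinates. The function name, the cluster labels and the example proteins are hypothetical.

```python
import math


def clustered_circular_layout(clusters, radius=1.0):
    """Map each node to (x, y) on a circle, grouped by cluster label.

    `clusters` is a dict {node_name: cluster_label} (a hypothetical input
    format chosen for this sketch). Nodes are sorted by (label, name), so
    the result does not depend on insertion order or on random seeds.
    """
    ordered = sorted(clusters, key=lambda node: (clusters[node], node))
    count = len(ordered)
    positions = {}
    for index, node in enumerate(ordered):
        # Evenly spaced angles; cluster members occupy a contiguous arc.
        angle = 2 * math.pi * index / count
        positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


if __name__ == "__main__":
    # Illustrative labels only, not taken from the paper.
    example = {"PCNA": "clamp", "MCM2": "helicase", "RFC1": "clamp", "MCM3": "helicase"}
    for name, (x, y) in clustered_circular_layout(example).items():
        print(f"{name}: ({x:.3f}, {y:.3f})")
```

    Because the ordering is a pure function of the labels and names, two runs on the same network always agree, which is the property a force-directed layout with random initialisation does not guarantee.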

  1. CASMI—A visualization tool for the World Stress Map database

    Science.gov (United States)

    Heidbach, Oliver; Höhne, Jens

    2008-07-01

    The World Stress Map (WSM) project has compiled a global database of quality-ranked data records on the contemporary tectonic stresses in the Earth's crust. The WSM 2005 database release contains approximately 16 000 data records from different types of stress indicators such as earthquake focal mechanism solutions, well bore breakouts, hydraulic fracturing and overcoring measurements, as well as quaternary fault-slip data and volcanic alignments. To provide a software tool for database visualization, analysis and interpretation of stress data as well as its integration with other data records, we developed the program CASMI. This public domain software tool for Unix-like operating systems enables the selection of stress data records from the WSM database according to location, data quality, type of stress indicator, and depth. Each selected data record is visualized by a symbol that represents the type of stress indicator and the orientation of the maximum horizontal compressive stress. Symbol size is proportional to the quality of the data record, and the colour indicates different tectonic regimes. Stress maps can be produced in different geographical projections and high-quality output formats. CASMI also allows the integration of user-defined stress data sets and a wide range of other data such as topography, Harvard centroid moment tensors, polygons, text data, and plate motion trajectories. CASMI, including the WSM 2005 database release, can be requested free of charge from the project's website at http://www.world-stress-map.org/casmi. We present two stress map examples generated with CASMI ranging from plate-wide to regional scale: (1) A stress map of central Europe that reveals the correlation of stress field orientation and relative plate motion. (2) The fan-shape stress pattern in North Germany.

  2. Development of human protein reference database as an initial platform for approaching systems biology in humans

    DEFF Research Database (Denmark)

    Peri, Suraj; Navarro, J Daniel; Amanchy, Ramars

    2003-01-01

    Human Protein Reference Database (HPRD) is an object database that integrates a wealth of information relevant to the function of human proteins in health and disease. Data pertaining to thousands of protein-protein interactions, posttranslational modifications, enzyme/substrate relationships......, disease associations, tissue expression, and subcellular localization were extracted from the literature for a nonredundant set of 2750 human proteins. Almost all the information was obtained manually by biologists who read and interpreted >300,000 published articles during the annotation process...

  3. Visualization of Periplasmic and Cytoplasmic Proteins with a Self-Labeling Protein Tag.

    Science.gov (United States)

    Ke, Na; Landgraf, Dirk; Paulsson, Johan; Berkmen, Mehmet

    2016-01-19

    The use of fluorescent and luminescent proteins in visualizing proteins has become a powerful tool in understanding molecular and cellular processes within living organisms. This success has resulted in an ever-increasing demand for new and more versatile protein-labeling tools that permit light-based detection of proteins within living cells. In this report, we present data supporting the use of the self-labeling HaloTag protein as a light-emitting reporter for protein fusions within the model prokaryote Escherichia coli. We show that functional protein fusions of the HaloTag can be detected both in vivo and in vitro when expressed within the cytoplasmic or periplasmic compartments of E. coli. The capacity to visually detect proteins localized in various prokaryotic compartments expands today's molecular biologist toolbox and paves the path to new applications. Visualizing proteins microscopically within living cells is important for understanding both the biology of cells and the role of proteins within living cells. Currently, the most common tool is green fluorescent protein (GFP). However, fluorescent proteins such as GFP have many limitations; therefore, the field of molecular biology is always in need of new tools to visualize proteins. In this paper, we demonstrate, for the first time, the use of HaloTag to visualize proteins in two different compartments within the model prokaryote Escherichia coli. The use of HaloTag as an additional tool to visualize proteins within prokaryotes increases our capacity to ask about and understand the role of proteins within living cells. Copyright © 2016 Ke et al.

  4. The Nuclear Protein Database (NPD): sub-nuclear localisation and functional annotation of the nuclear proteome

    Science.gov (United States)

    Dellaire, G.; Farrall, R.; Bickmore, W.A.

    2003-01-01

    The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015

  5. The Nuclear Protein Database (NPD): sub-nuclear localisation and functional annotation of the nuclear proteome.

    Science.gov (United States)

    Dellaire, G; Farrall, R; Bickmore, W A

    2003-01-01

    The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly.

  6. CLIPZ: a database and analysis environment for experimentally determined binding sites of RNA-binding proteins.

    Science.gov (United States)

    Khorshid, Mohsen; Rodak, Christoph; Zavolan, Mihaela

    2011-01-01

    The stability, localization and translation rate of mRNAs are regulated by a multitude of RNA-binding proteins (RBPs) that find their targets directly or with the help of guide RNAs. Among the experimental methods for mapping RBP binding sites, cross-linking and immunoprecipitation (CLIP) coupled with deep sequencing provides transcriptome-wide coverage as well as high resolution. However, partly due to their vast volume, the data that were so far generated in CLIP experiments have not been put in a form that enables fast and interactive exploration of binding sites. To address this need, we have developed the CLIPZ database and analysis environment. Binding site data for RBPs such as Argonaute 1-4, Insulin-like growth factor II mRNA-binding protein 1-3, TNRC6 proteins A-C, Pumilio 2, Quaking and Polypyrimidine tract binding protein can be visualized at the level of the genome and of individual transcripts. Individual users can upload their own sequence data sets while being able to limit the access to these data to specific users, and analyses of the public and private data sets can be performed interactively. CLIPZ, available at http://www.clipz.unibas.ch, aims to provide an open access repository of information for post-transcriptional regulatory elements.

  7. VASCo: computation and visualization of annotated protein surface contacts

    Directory of Open Access Journals (Sweden)

    Thallinger Gerhard G

    2009-01-01

    Full Text Available Abstract Background Structural data from crystallographic analyses contain a vast amount of information on protein-protein contacts. Knowledge on protein-protein interactions is essential for understanding many processes in living cells. The methods to investigate these interactions range from genetics to biophysics, crystallography, bioinformatics and computer modeling. Also crystal contact information can be useful to understand biologically relevant protein oligomerisation as they rely in principle on the same physico-chemical interaction forces. Visualization of crystal and biological contact data including different surface properties can help to analyse protein-protein interactions. Results VASCo is a program package for the calculation of protein surface properties and the visualization of annotated surfaces. Special emphasis is laid on protein-protein interactions, which are calculated based on surface point distances. The same approach is used to compare surfaces of two aligned molecules. Molecular properties such as electrostatic potential or hydrophobicity are mapped onto these surface points. Molecular surfaces and the corresponding properties are calculated using well established programs integrated into the package, as well as using custom developed programs. The modular package can easily be extended to include new properties for annotation. The output of the program is most conveniently displayed in PyMOL using a custom-made plug-in. Conclusion VASCo supplements other available protein contact visualisation tools and provides additional information on biological interactions as well as on crystal contacts. The tool provides a unique feature to compare surfaces of two aligned molecules based on point distances and thereby facilitates the visualization and analysis of surface differences.
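
    The idea of deriving contacts from surface point distances can be sketched as follows. This is an illustration under stated assumptions, not VASCo's implementation: surfaces are assumed to be plain lists of (x, y, z) points, and the cutoff of 1.0 is an arbitrary placeholder, since the abstract does not give the criterion the package uses. The function name is hypothetical.

```python
import math


def contact_points(surface_a, surface_b, cutoff=1.0):
    """Return indices of points in surface_a within `cutoff` of surface_b.

    Both surfaces are sequences of (x, y, z) tuples. The brute-force
    comparison is O(len(a) * len(b)); it is meant to show the criterion,
    not to be efficient for real molecular surfaces.
    """
    contacts = []
    for index, point in enumerate(surface_a):
        # A point counts as a contact if any point on the other surface is close enough.
        if any(math.dist(point, other) <= cutoff for other in surface_b):
            contacts.append(index)
    return contacts


if __name__ == "__main__":
    molecule_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
    molecule_b = [(0.5, 0.0, 0.0)]
    print(contact_points(molecule_a, molecule_b))
```

    The same distance test, applied to two superposed structures instead of two binding partners, gives a simple way to flag where their surfaces differ, which is the comparison the abstract describes.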

  8. DaVIE: Database for the Visualization and Integration of Epigenetic data.

    Science.gov (United States)

    Fejes, Anthony P; Jones, Meaghan J; Kobor, Michael S

    2014-01-01

    One of the challenges in the analysis of large data sets, particularly in a population-based setting, is the ability to perform comparisons across projects. This has to be done in such a way that the integrity of each individual project is maintained, while ensuring that the data are comparable across projects. These issues are beginning to be observed in human DNA methylation studies, as the Illumina 450k platform and next generation sequencing-based assays grow in popularity and decrease in price. This increase in productivity is enabling new insights into epigenetics, but also requires the development of pipelines and software capable of handling the large volumes of data. The specific problems inherent in creating a platform for the storage, comparison, integration, and visualization of DNA methylation data include data storage, algorithm efficiency and ability to interpret the results to derive biological meaning from them. Databases provide a ready-made solution to these issues, but as yet no tools exist that leverage these advantages while providing an intuitive user interface for interpreting results in a genomic context. We have addressed this void by integrating a database to store DNA methylation data with a web interface to query and visualize the database and a set of libraries for more complex analysis. The resulting platform is called DaVIE: Database for the Visualization and Integration of Epigenetics data. DaVIE can use data culled from a variety of sources, and the web interface includes the ability to group samples by sub-type, compare multiple projects and visualize genomic features in relation to sites of interest. We have used DaVIE to identify patterns of DNA methylation in specific projects and across different projects, identify outlier samples, and cross-check differentially methylated CpG sites identified in specific projects across large numbers of samples. A demonstration server has been set up using GEO data at http

  9. Atomic analysis of protein-protein interfaces with known inhibitors: the 2P2I database.

    Directory of Open Access Journals (Sweden)

    Raphaël Bourgeas

    Full Text Available BACKGROUND: In the last decade, the inhibition of protein-protein interactions (PPIs) has emerged from both academic and private research as a new way to modulate the activity of proteins. Inhibitors of these original interactions are certainly the next generation of highly innovative drugs that will reach the market in the next decade. However, in silico design of such compounds still remains challenging. METHODOLOGY/PRINCIPAL FINDINGS: Here we describe this particular PPI chemical space through the presentation of 2P2I(DB), a hand-curated database dedicated to the structure of PPIs with known inhibitors. We have analyzed protein/protein and protein/inhibitor interfaces in terms of geometrical parameters, atom and residue properties, buried accessible surface area and other biophysical parameters. The interfaces found in 2P2I(DB) were then compared to those of representative datasets of heterodimeric complexes. We propose a new classification of PPIs with known inhibitors into two classes depending on the number of segments present at the interface and corresponding to either a single secondary structure element or to a more globular interacting domain. 2P2I(DB) complexes share global shape properties with standard transient heterodimer complexes, but their accessible surface areas are significantly smaller. No major conformational changes are seen between the different states of the proteins. The interfaces are more hydrophobic than general PPI's interfaces, with less charged residues and more non-polar atoms. Finally, fifty percent of the complexes in the 2P2I(DB) dataset possess more hydrogen bonds than typical protein-protein complexes. Potential areas of study for the future are proposed, which include a new classification system consisting of specific families and the identification of PPI targets with high druggability potential based on key descriptors of the interaction. CONCLUSIONS: 2P2I database stores structural information about PPIs

  10. Atomic analysis of protein-protein interfaces with known inhibitors: the 2P2I database.

    Science.gov (United States)

    Bourgeas, Raphaël; Basse, Marie-Jeanne; Morelli, Xavier; Roche, Philippe

    2010-03-09

    In the last decade, the inhibition of protein-protein interactions (PPIs) has emerged from both academic and private research as a new way to modulate the activity of proteins. Inhibitors of these original interactions are certainly the next generation of highly innovative drugs that will reach the market in the next decade. However, in silico design of such compounds still remains challenging. Here we describe this particular PPI chemical space through the presentation of 2P2I(DB), a hand-curated database dedicated to the structure of PPIs with known inhibitors. We have analyzed protein/protein and protein/inhibitor interfaces in terms of geometrical parameters, atom and residue properties, buried accessible surface area and other biophysical parameters. The interfaces found in 2P2I(DB) were then compared to those of representative datasets of heterodimeric complexes. We propose a new classification of PPIs with known inhibitors into two classes depending on the number of segments present at the interface and corresponding to either a single secondary structure element or to a more globular interacting domain. 2P2I(DB) complexes share global shape properties with standard transient heterodimer complexes, but their accessible surface areas are significantly smaller. No major conformational changes are seen between the different states of the proteins. The interfaces are more hydrophobic than general PPI's interfaces, with less charged residues and more non-polar atoms. Finally, fifty percent of the complexes in the 2P2I(DB) dataset possess more hydrogen bonds than typical protein-protein complexes. Potential areas of study for the future are proposed, which include a new classification system consisting of specific families and the identification of PPI targets with high druggability potential based on key descriptors of the interaction. 
2P2I database stores structural information about PPIs with known inhibitors and provides a useful tool for biologists to assess

  11. GIS-based NEXRAD Stage III precipitation database: automated approaches for data processing and visualization

    Science.gov (United States)

    Xie, Hongjie; Zhou, Xiaobing; Vivoni, Enrique R.; Hendrickx, Jan M. H.; Small, Eric E.

    2005-02-01

    This study develops a geographical information system (GIS) approach for automated processing of the Next Generation Weather Radar (NEXRAD) Stage III precipitation data. The automated processing system, implemented by using commercial GIS and a number of Perl scripts and C/C++ programs, allows for rapid data display, requires less storage capacity, and provides the analytical and data visualization tools inherent in GIS as compared to traditional methods. In this paper, we illustrate the development of automatic techniques to preprocess raw NEXRAD Stage III data, transform the data to a GIS format, select regions of interest, and retrieve statistical rainfall analysis over user-defined spatial and temporal scales. Computational expense is reduced significantly using the GIS-based automated techniques. For example, 1-year Stage III data processing (~9000 files) for the West Gulf River Forecast Center takes about 3 days of computation time instead of months of manual work. To illustrate the radar precipitation database and its visualization capabilities, we present three application examples: (1) GIS-based data visualization and integration, and ArcIMS-based web visualization and publication system, (2) a spatial-temporal analysis of monsoon rainfall patterns over the Rio Grande River Basin, and (3) the potential of GIS-based radar data for distributed watershed models. We conclude by discussing the potential applications of automated techniques for radar rainfall processing and its integration with GIS-based hydrologic information systems.
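
    The "select regions of interest, and retrieve statistical rainfall analysis" step described above can be illustrated with a minimal sketch. It assumes the radar data have already been converted to plain row/column grids of rainfall depth, one grid per time step; the function name `region_stats`, the window convention, and the toy values are hypothetical and do not come from the paper, whose own system uses commercial GIS with Perl and C/C++ programs.

```python
def region_stats(grids, row_range, col_range):
    """Accumulate rainfall over all time steps, then summarise it inside a
    rectangular window. Ranges are half-open: (start, stop)."""
    r0, r1 = row_range
    c0, c1 = col_range
    totals = []
    for r in range(r0, r1):
        for c in range(c0, c1):
            # Temporal accumulation for one grid cell.
            totals.append(sum(grid[r][c] for grid in grids))
    return {
        "cells": len(totals),
        "mean": sum(totals) / len(totals),
        "max": max(totals),
    }


# Two toy time steps on a 2 x 3 grid (made-up values for illustration only).
demo_grids = [
    [[1, 2, 3], [4, 5, 6]],
    [[1, 1, 1], [1, 1, 1]],
]
demo_summary = region_stats(demo_grids, (0, 2), (1, 3))
```

    Changing the number of grids passed in changes the temporal scale, and changing the window changes the spatial scale, which is the kind of user-defined aggregation the abstract refers to.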

  12. Databases

    Directory of Open Access Journals (Sweden)

    Nick Ryan

    2004-01-01

    Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.

  13. O-GLYCOBASE version 4.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Gupta, Ramneek; Birch, Hanne; Rapacki, Krzysztof

    1999-01-01

    O-GLYCBASE is a database of glycoproteins with O-linked glycosylation sites. Entries with at least one experimentally verified O-glycosylation site have been compiled from protein sequence databases and literature. Each entry contains information about the glycan involved, the species, sequence......, a literature reference and http-linked cross-references to other databases. Version 4.0 contains 179 protein entries, an approximate 15% increase over the last version. Sequence logos representing the acceptor specificity patterns for GalNAc, GlcNAc, mannosyl and xylosyl transferases are shown. The O......-GLYCBASE database is available through the WWW at http://www.cbs.dtu.dk/databases/OGLYCBASE/....

  14. Database of two-dimensional polyacrylamide gel electrophoresis of proteins labeled with CyDye DIGE Fluor saturation dye.

    Science.gov (United States)

    Fujii, Kazuyasu; Kondo, Tadashi; Yokoo, Hideki; Okano, Tetsuya; Yamada, Masayo; Yamada, Tesshi; Iwatsuki, Keiji; Hirohashi, Setsuo

    2006-03-01

    CyDye DIGE Fluor saturation dye (saturation dye, GE Healthcare Amersham Biosciences) enables highly sensitive 2-D PAGE. As the dye reacts with all reduced cysteine thiols, 2-D PAGE can be performed with a lower amount of protein, compared with CyDye DIGE Fluor minimal dye (GE Healthcare Amersham Biosciences), the sensitivity of which is equivalent to that of silver staining. We constructed a 2-D map of the saturation dye-labeled proteins of a liver cancer cell line (HepG2) and identified by MS 92 proteins corresponding to 123 protein spots. Functional classification revealed that the identified proteins had chaperone, protein binding, nucleotide binding, metal ion binding, isomerase activity, and motor activity. The functional distribution and the cysteine contents of the proteins were similar to those in the most comprehensive 2-D database of hepatoma cells (Seow et al., Electrophoresis 2000, 21, 1787-1813), where silver staining was used for protein visualization. Hierarchical clustering on the basis of the quantitative expression profiles of the 123 characterized spots labeled with two charge- and mass-matched saturation dyes (Cy3 and Cy5) discriminated between nine hepatocellular carcinoma cell lines and primary cultured hepatocytes from five individuals, suggesting the utility of saturation dye and our database for proteomic studies of liver cancer.

  15. Versatile annotation and publication quality visualization of protein complexes using POLYVIEW-3D

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-08-01

    Full Text Available Abstract Background Macromolecular visualization as well as automated structural and functional annotation tools play an increasingly important role in the post-genomic era, contributing significantly towards the understanding of molecular systems and processes. For example, three-dimensional (3D) models help in exploring protein active sites and functional hot spots that can be targeted in drug design. Automated annotation and visualization pipelines can also reveal other functionally important attributes of macromolecules. These goals are dependent on the availability of advanced tools that better integrate the existing databases, annotation servers and other resources with state-of-the-art rendering programs. Results We present a new tool for protein structure analysis, with the focus on annotation and visualization of protein complexes, which is an extension of our previously developed POLYVIEW web server. By integrating the web technology with state-of-the-art software for macromolecular visualization, such as the PyMol program, POLYVIEW-3D enables combining versatile structural and functional annotations with a simple web-based interface for creating publication quality structure rendering, as well as animated images for Powerpoint™, web sites and other electronic resources. The service is platform independent and no plug-ins are required. Several examples of how POLYVIEW-3D can be used for structural and functional analysis in the context of protein-protein interactions are presented to illustrate the available annotation options. Conclusion POLYVIEW-3D server features the PyMol image rendering that provides detailed and high quality presentation of macromolecular structures, with an easy to use web-based interface. POLYVIEW-3D also provides a wide array of options for automated structural and functional analysis of proteins and their complexes. 
Thus, the POLYVIEW-3D server may become an important resource for researchers and educators in

  16. Visualization and Identification of Fatty Acylated Proteins Using Chemical Reporters

    Science.gov (United States)

    Yount, Jacob S.; Zhang, Mingzi M.; Hang, Howard C.

    2011-01-01

    Protein fatty-acylation is the covalent addition of a lipid chain at specific amino acids. This modification changes the inherent hydrophobicity of a protein, often targeting it to cellular membrane compartments. Acylation may also regulate protein activity, stability, and protein-protein interactions. Its study is therefore critical to understanding the biology of the hundreds of proteins described to be lipid-modified, as well as those that are continually being discovered. Fatty-acylation can be analyzed using chemical reporters that mimic natural lipids and contain bioorthogonal chemical handles allowing them to be reacted with detection tags such as fluorophores or affinity tags. Our laboratory has successfully utilized alkynyl-chemical reporters of protein myristoylation, S-palmitoylation, prenylation and acetylation. Protocol 1 describes metabolic incorporation of these chemical reporters onto proteins in living cells. Protocol 2 describes the global visualization of reporter-labeled proteins by selectively reacting alkyne-containing chemical reporter-labeled proteins in cell lysates with azido-rhodamine via the click chemistry and fluorescence gel scanning. Protocol 3 describes analysis of protein acylation on individual candidate proteins using immunoprecipitation, click chemistry and fluorescence gel scanning. Finally, Protocol 4 allows identification of novel fatty acylated proteins by reacting chemical reporter-labeled proteins with azido-biotin via click chemistry and selective retrieval using streptavidin beads. This may be particularly valuable for the examination of S-palmitoylomes in different cell types or activation states, as these modifications do not occur on readily predicted consensus amino acid motifs. Overall, these techniques provide robust, non-radioactive methods for examining the acylation states of full cellular proteomes and individual proteins of interest. PMID:23061028

  17. PACSY, a relational database management system for protein structure and chemical shift analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Woonghee, E-mail: whlee@nmrfam.wisc.edu [University of Wisconsin-Madison, National Magnetic Resonance Facility at Madison, and Biochemistry Department (United States); Yu, Wookyung [Center for Proteome Biophysics, Pusan National University, Department of Physics (Korea, Republic of); Kim, Suhkmann [Pusan National University, Department of Chemistry and Chemistry Institute for Functional Materials (Korea, Republic of); Chang, Iksoo [Center for Proteome Biophysics, Pusan National University, Department of Physics (Korea, Republic of); Lee, Weontae, E-mail: wlee@spin.yonsei.ac.kr [Yonsei University, Structural Biochemistry and Molecular Biophysics Laboratory, Department of Biochemistry (Korea, Republic of); Markley, John L., E-mail: markley@nmrfam.wisc.edu [University of Wisconsin-Madison, National Magnetic Resonance Facility at Madison, and Biochemistry Department (United States)

    2012-10-15

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.
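
    The idea of table types "linked to one another ... by key identification numbers" can be sketched with a small, self-contained example. The two tables, their column names, and the numeric values below are hypothetical stand-ins chosen for illustration; the abstract does not give the actual PACSY schema, and SQLite is used here only so the sketch runs without a MySQL or PostgreSQL server.

```python
import sqlite3

# Hypothetical two-table layout. The real system has six table types whose
# names and columns are not stated in the abstract.
SCHEMA = """
CREATE TABLE coord_demo (
    key_id  INTEGER PRIMARY KEY,
    residue TEXT, atom TEXT,
    x REAL, y REAL, z REAL
);
CREATE TABLE shift_demo (
    key_id    INTEGER REFERENCES coord_demo(key_id),
    shift_ppm REAL
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO coord_demo VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "ALA", "CA", 1.0, 2.0, 3.0),
            (2, "ALA", "CB", 1.5, 2.5, 3.5),
            (3, "GLY", "CA", 4.0, 5.0, 6.0),
        ],
    )
    conn.executemany(
        "INSERT INTO shift_demo VALUES (?, ?)",
        [(1, 52.3), (3, 45.1)],
    )
    return conn


def shifts_for_atom(conn, atom_name):
    """Join coordinates and chemical shifts through the shared key_id."""
    query = """
        SELECT c.residue, c.atom, s.shift_ppm
        FROM coord_demo AS c
        JOIN shift_demo AS s ON s.key_id = c.key_id
        WHERE c.atom = ?
        ORDER BY c.key_id
    """
    return conn.execute(query, (atom_name,)).fetchall()


demo_rows = shifts_for_atom(build_demo_db(), "CA")
```

    Because every table carries the same key, a single join combines structural and spectroscopic information, which is the sort of cross-source query the abstract describes.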

  18. PACSY, a relational database management system for protein structure and chemical shift analysis

    Science.gov (United States)

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo

    2012-01-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu. PMID:22903636

  19. UNcleProt (Universal Nuclear Protein database of barley): The first nuclear protein database that distinguishes proteins from different phases of the cell cycle

    Czech Academy of Sciences Publication Activity Database

    Blavet, Nicolas; Uřinovská, J.; Jeřábková, Hana; Chamrád, I.; Vrána, Jan; Lenobel, R.; Beinhauer, D.; Šebela, M.; Doležel, Jaroslav; Petrovská, Beáta

    2017-01-01

    Roč. 8, č. 1 (2017), s. 70-80 ISSN 1949-1034 R&D Projects: GA ČR(CZ) GA14-28443S; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : cicer-arietinum l. * rice oryza-sativa * chromatin-associated proteins * proteomic analysis * mitotic chromosomes * dehydration * localization * chickpea * network * phosphoproteome * barley * cell cycle * database * flow-cytometry * localization * mass spectrometry * nuclear proteome * nucleus Subject RIV: CE - Biochemistry Impact factor: 2.387, year: 2016

  20. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    Science.gov (United States)

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html).

  1. Deep Multimodal Pain Recognition: A Database and Comparison of Spatio-Temporal Visual Modalities

    DEFF Research Database (Denmark)

    Haque, Mohammad Ahsanul; Nasrollahi, Kamal; Moeslund, Thomas B.

    2018-01-01

    inspection by experts. However, automatic pain assessment systems from facial videos are also rapidly evolving due to the need of managing pain in a robust and cost effective way. Among different challenges of automatic pain assessment from facial video data two issues are increasingly prevalent: first......PAIN)' database, for RGBDT pain level recognition in sequences. We provide a first baseline results including 5 pain levels recognition by analyzing independent visual modalities and their fusion with CNN and LSTM models. From the experimental evaluation we observe that fusion of modalities helps to enhance...... recognition performance of pain levels in comparison to isolated ones. In particular, the combination of RGB, D, and T in an early fusion fashion achieved the best recognition rate....

  2. SHEETSPAIR: A Database of Amino Acid Pairs in Protein Sheet Structures

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    2007-10-01

    Full Text Available Within folded strands of a protein, amino acids (AAs) on every two adjacent strands form a pair of AAs. To explore the interactions between strands in a protein sheet structure, we have established an Internet-accessible relational database named SheetsPairs based on SQL Server 2000. The database has collected AA pairs in proteins with detailed information. Furthermore, it utilizes a non-freetext database structure to store protein sequences and a specific database table with a unique number to store strands, which provides more searching options and rapid and accurate access to data queries. An IIS web server has been set up for data retrieval through a custom web interface, which enables complex data queries. Also searchable are parallel or anti-parallel folded strands and the list of strands in a specified protein.

  3. ExDom: an integrated database for comparative analysis of the exon-intron structures of protein domains in eukaryotes.

    Science.gov (United States)

    Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan

    2009-01-01

    We have developed ExDom, a unique database for the comparative analysis of the exon-intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon-intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon-intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon-intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/.

  4. License - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available 2009 Takashi Ito(the University of Tokyo) licensed under CC Attribution-Share Alike 2.1 Japan . The summary... of the Creative Commons Attribution-Share Alike 2.1 Japan is found here . With r...the Standard License. Standard License The Standard License for this database is the license specified in the Creative Commons Attrib...ution-Share Alike 2.1 Japan . If you use data from this

  5. ModuleRole: a tool for modulization, role determination and visualization in protein-protein interaction networks.

    Science.gov (United States)

    Li, Guipeng; Li, Ming; Zhang, Yiwei; Wang, Dong; Li, Rong; Guimerà, Roger; Gao, Juntao Tony; Zhang, Michael Q

    2014-01-01

    Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. In order to gain insight into the organization and structure of the resultant large complex networks formed by interacting molecules, using simulated annealing, a method based on the node connectivity, we developed ModuleRole, a user-friendly web server tool which finds modules in PPI network and defines the roles for every node, and produces files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters Participation Coefficient and Z-score. This is the first program which provides interactive and very friendly interface for biologists to find and visualize modules and roles of proteins in PPI network. It can be tested online at the website http://www.bioinfo.org/modulerole/index.php, which is free and open to all users and there is no login requirement, with demo data provided by "User Guide" in the menu Help. Non-server application of this program is considered for high-throughput data with more than 200 nodes or user's own interaction datasets. Users are able to bookmark the web link to the result page and access at a later time. As an interactive and highly customizable application, ModuleRole requires no expert knowledge in graph theory on the user side and can be used in both Linux and Windows system, thus a very useful tool for biologist to analyze and visualize PPI networks from databases such as BioGRID. ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. API for ModuleRole used for this program can be
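
    The two topological parameters named above can be sketched as follows. This is a minimal illustration that assumes the definitions commonly used in the network-cartography literature: the within-module degree z-score, z = (k_in - mean(k_in)) / sd(k_in) taken over nodes of the same module, and the participation coefficient, P = 1 - sum over modules of (k_s / k)^2. The abstract does not spell out the formulas ModuleRole uses, the module assignment is taken as given rather than found by simulated annealing, and the function name `node_roles` and the toy network are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def node_roles(edges, module_of):
    """Return {node: (z_score, participation)} for an undirected network."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Within-module degree: links to nodes in the node's own module.
    within = {
        n: sum(1 for m in neighbors[n] if module_of[m] == module_of[n])
        for n in module_of
    }
    by_module = defaultdict(list)
    for n, k_in in within.items():
        by_module[module_of[n]].append(k_in)

    roles = {}
    for n in module_of:
        ks = by_module[module_of[n]]
        sd = pstdev(ks)
        # Convention assumed here: z = 0 when all nodes of a module tie.
        z = (within[n] - mean(ks)) / sd if sd > 0 else 0.0

        k_total = len(neighbors[n])
        links = defaultdict(int)
        for m in neighbors[n]:
            links[module_of[m]] += 1
        p = 1.0 - sum((c / k_total) ** 2 for c in links.values()) if k_total else 0.0
        roles[n] = (z, p)
    return roles


# Toy network: a hub "a" inside module A, and "d" bridging to module B.
demo_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
demo_modules = {"a": "A", "b": "A", "c": "A", "d": "A", "e": "B"}
demo_roles = node_roles(demo_edges, demo_modules)
```

    In this toy case the hub "a" gets a high z-score with zero participation, while the bridging node "d" splits its links evenly between two modules and gets a participation coefficient of 0.5.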

  6. ModuleRole: a tool for modulization, role determination and visualization in protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Guipeng Li

    Full Text Available Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. In order to gain insight into the organization and structure of the resultant large complex networks formed by interacting molecules, using simulated annealing, a method based on the node connectivity, we developed ModuleRole, a user-friendly web server tool which finds modules in PPI network and defines the roles for every node, and produces files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters Participation Coefficient and Z-score. This is the first program which provides interactive and very friendly interface for biologists to find and visualize modules and roles of proteins in PPI network. It can be tested online at the website http://www.bioinfo.org/modulerole/index.php, which is free and open to all users and there is no login requirement, with demo data provided by "User Guide" in the menu Help. Non-server application of this program is considered for high-throughput data with more than 200 nodes or user's own interaction datasets. Users are able to bookmark the web link to the result page and access at a later time. As an interactive and highly customizable application, ModuleRole requires no expert knowledge in graph theory on the user side and can be used in both Linux and Windows system, thus a very useful tool for biologist to analyze and visualize PPI networks from databases such as BioGRID. ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. API for ModuleRole used for this

  7. Yeast Interacting Proteins Database: YDR261W-A, YDR261W-B [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar...a nucleocapsid-like protein (Gag), reverse transcriptase (RT), protease (PR), and integrase (IN); similar

  8. MoonProt 2.0: an expansion and update of the moonlighting proteins database.

    Science.gov (United States)

    Chen, Chang; Zabad, Shadi; Liu, Haipeng; Wang, Wangfei; Jeffery, Constance

    2018-01-04

    MoonProt 2.0 (http://moonlightingproteins.org) is an updated, comprehensive and open-access database storing expert-curated annotations for moonlighting proteins. Moonlighting proteins contain two or more physiologically relevant distinct functions performed by a single polypeptide chain. Here, we describe developments in the MoonProt website and database since our previous report in the Database Issue of Nucleic Acids Research. For this V 2.0 release, we expanded the number of proteins annotated to 370 and modified several dozen protein annotations with additional or updated information, including more links to protein structures in the Protein Data Bank, compared with the previous release. The new entries include more examples from humans and several model organisms, more proteins involved in disease, and proteins with different combinations of functions. The updated web interface includes a search function using BLAST to enable users to search the database for proteins that share amino acid sequence similarity with a protein of interest. The updated website also includes additional background information about moonlighting proteins and an expanded list of links to published articles about moonlighting proteins. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. ARCPHdb: A comprehensive protein database for SF1 and SF2 helicase from archaea.

    Science.gov (United States)

    Moukhtar, Mirna; Chaar, Wafi; Abdel-Razzak, Ziad; Khalil, Mohamad; Taha, Samir; Chamieh, Hala

    2017-01-01

    Superfamily 1 and Superfamily 2 helicases, two of the largest helicase protein families, play vital roles in many biological processes including replication, transcription and translation. Study of helicase proteins in the model microorganisms of archaea have largely contributed to the understanding of their function, architecture and assembly. Based on a large phylogenomics approach, we have identified and classified all SF1 and SF2 protein families in ninety five sequenced archaea genomes. Here we developed an online webserver linked to a specialized protein database named ARCPHdb to provide access for SF1 and SF2 helicase families from archaea. ARCPHdb was implemented using MySQL relational database. Web interfaces were developed using Netbeans. Data were stored according to UniProt accession numbers, NCBI Ref Seq ID, PDB IDs and Entrez Databases. A user-friendly interactive web interface has been developed to browse, search and download archaeal helicase protein sequences, their available 3D structure models, and related documentation available in the literature provided by ARCPHdb. The database provides direct links to matching external databases. The ARCPHdb is the first online database to compile all protein information on SF1 and SF2 helicase from archaea in one platform. This database provides essential resource information for all researchers interested in the field. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database

    Directory of Open Access Journals (Sweden)

    van der Knaap Esther

    2010-10-01

    Full Text Available Abstract Background A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. Description The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, http://solgenomics.net/qtl/, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. Conclusions solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes

  11. Protein - AT Atlas | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available of data contents It is a table of proteins whose structures were solved using method(s) developed in the Technology Development proje...cts of the Targeted Proteins Research Program (TPRP). Data file File name: at_atlas

  12. Development of Human Protein Reference Database as an Initial Platform for Approaching Systems Biology in Humans

    Science.gov (United States)

    Peri, Suraj; Navarro, J. Daniel; Amanchy, Ramars; Kristiansen, Troels Z.; Jonnalagadda, Chandra Kiran; Surendranath, Vineeth; Niranjan, Vidya; Muthusamy, Babylakshmi; Gandhi, T.K.B.; Gronborg, Mads; Ibarrola, Nieves; Deshpande, Nandan; Shanker, K.; Shivashankar, H.N.; Rashmi, B.P.; Ramya, M.A.; Zhao, Zhixing; Chandrika, K.N.; Padma, N.; Harsha, H.C.; Yatish, A.J.; Kavitha, M.P.; Menezes, Minal; Choudhury, Dipanwita Roy; Suresh, Shubha; Ghosh, Neelanjana; Saravana, R.; Chandran, Sreenath; Krishna, Subhalakshmi; Joy, Mary; Anand, Sanjeev K.; Madavan, V.; Joseph, Ansamma; Wong, Guang W.; Schiemann, William P.; Constantinescu, Stefan N.; Huang, Lily; Khosravi-Far, Roya; Steen, Hanno; Tewari, Muneesh; Ghaffari, Saghi; Blobe, Gerard C.; Dang, Chi V.; Garcia, Joe G.N.; Pevsner, Jonathan; Jensen, Ole N.; Roepstorff, Peter; Deshpande, Krishna S.; Chinnaiyan, Arul M.; Hamosh, Ada; Chakravarti, Aravinda; Pandey, Akhilesh

    2003-01-01

    Human Protein Reference Database (HPRD) is an object database that integrates a wealth of information relevant to the function of human proteins in health and disease. Data pertaining to thousands of protein-protein interactions, posttranslational modifications, enzyme/substrate relationships, disease associations, tissue expression, and subcellular localization were extracted from the literature for a nonredundant set of 2750 human proteins. Almost all the information was obtained manually by biologists who read and interpreted >300,000 published articles during the annotation process. This database, which has an intuitive query interface allowing easy access to all the features of proteins, was built by using open source technologies and will be freely available at http://www.hprd.org to the academic community. This unified bioinformatics platform will be useful in cataloging and mining the large number of proteomic interactions and alterations that will be discovered in the postgenomic era. PMID:14525934

  13. Phenylglyoxal-Based Visualization of Citrullinated Proteins on Western Blots

    Directory of Open Access Journals (Sweden)

    Sanne M. M. Hensen

    2015-04-01

    Citrullination is the conversion of peptidylarginine to peptidylcitrulline, which is catalyzed by peptidylarginine deiminases. This conversion is involved in different physiological processes and is associated with several diseases, including cancer and rheumatoid arthritis. A common method to detect citrullinated proteins relies on anti-modified citrulline antibodies directed to a specific chemical modification of the citrulline side chain. Here, we describe a versatile, antibody-independent method for the detection of citrullinated proteins on a membrane, based on the selective reaction of phenylglyoxal with the ureido group of citrulline under highly acidic conditions. The method makes use of 4-azidophenylglyoxal, which, after reaction with citrullinated proteins, can be visualized with alkyne-conjugated probes. The sensitivity of this procedure, using an alkyne-biotin probe, appeared to be comparable to the antibody-based detection method and independent of the sequence surrounding the citrulline.

  14. PTM-SD: a database of structurally resolved and annotated posttranslational modifications in proteins

    OpenAIRE

    Craveur, Pierrick; Rebehmed, Joseph; de Brevern, Alexandre G.

    2014-01-01

    Posttranslational modifications (PTMs) define covalent and chemical modifications of protein residues. They play important roles in modulating various biological functions. Current PTM databases contain important sequence annotations but do not provide informative 3D structural resource about these modifications. Posttranslational modification structural database (PTM-SD) provides access to structurally solved modified residues, which are experimentally annotated as PTMs. It combines differen...

  15. PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    Directory of Open Access Journals (Sweden)

    Picard-Cloutier Aude

    2007-12-01

    Background: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins, and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description: We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion: Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.

  16. Integrated Controlling System and Unified Database for High Throughput Protein Crystallography Experiments

    Science.gov (United States)

    Gaponov, Yu. A.; Igarashi, N.; Hiraki, M.; Sasajima, K.; Matsugaki, N.; Suzuki, M.; Kosuge, T.; Wakatsuki, S.

    2004-05-01

    An integrated controlling system and a unified database for high throughput protein crystallography experiments have been developed. Main features of protein crystallography experiments (purification, crystallization, crystal harvesting, data collection, data processing) were integrated into the software under development. All information necessary to perform protein crystallography experiments is stored (except raw X-ray data that are stored in a central data server) in a MySQL relational database. The database contains four mutually linked hierarchical trees describing protein crystals, data collection of protein crystal and experimental data processing. A database editor was designed and developed. The editor supports basic database functions to view, create, modify and delete user records in the database. Two search engines were realized: direct search of necessary information in the database and object oriented search. The system is based on TCP/IP secure UNIX sockets with four predefined sending and receiving behaviors, which support communications between all connected servers and clients with remote control functions (creating and modifying data for experimental conditions, data acquisition, viewing experimental data, and performing data processing). Two secure login schemes were designed and developed: a direct method (using the developed Linux clients with secure connection) and an indirect method (using the secure SSL connection using secure X11 support from any operating system with X-terminal and SSH support). A part of the system has been implemented on a new MAD beam line, NW12, at the Photon Factory Advanced Ring for general user experiments.

  17. ProtVista: visualization of protein sequence annotations.

    Science.gov (United States)

    Watkins, Xavier; Garcia, Leyla J; Pundir, Sangya; Martin, Maria J

    2017-07-01

    ProtVista is a comprehensive visualization tool for the graphical representation of protein sequence features in the UniProt Knowledgebase, experimental proteomics and variation public datasets. The complexity and relationships in this wealth of data pose a challenge in interpretation. Integrative visualization approaches such as provided by ProtVista are thus essential for researchers to understand the data and, for instance, discover patterns affecting function and disease associations. ProtVista is a JavaScript component released as an open source project under the Apache 2 License. Documentation and source code are available at http://ebi-uniprot.github.io/ProtVista/. Contact: martin@ebi.ac.uk. Supplementary data are available at Bioinformatics online.

  18. DBBP: database of binding pairs in protein-nucleic acid interactions

    OpenAIRE

    Park, Byungkyu; Kim, Hyungchan; Han, Kyungsook

    2014-01-01

    Background Interaction of proteins with other molecules plays an important role in many biological activities. As many structures of protein-DNA complexes and protein-RNA complexes have been determined in the past years, several databases have been constructed to provide structure data of the complexes. However, the information on the binding sites between proteins and nucleic acids is not readily available from the structure data since the data consists mostly of the three-dimensional coordi...

  19. Live Cell Visualization of Multiple Protein-Protein Interactions with BiFC Rainbow.

    Science.gov (United States)

    Wang, Sheng; Ding, Miao; Xue, Boxin; Hou, Yingping; Sun, Yujie

    2018-01-11

    As one of the most powerful tools to visualize PPIs in living cells, bimolecular fluorescence complementation (BiFC) has gained great advancement during recent years, including deep tissue imaging with far-red or near-infrared fluorescent proteins or super-resolution imaging with photochromic fluorescent proteins. However, little progress has been made toward simultaneous detection and visualization of multiple PPIs in the same cell, mainly due to the spectral crosstalk. In this report, we developed novel BiFC assays based on large-Stokes-shift fluorescent proteins (LSS-FPs) to detect and visualize multiple PPIs in living cells. With the large excitation/emission spectral separation, LSS-FPs can be imaged together with normal Stokes shift fluorescent proteins to realize multicolor BiFC imaging using a simple illumination scheme. We also further demonstrated BiFC rainbow combining newly developed BiFC assays with previously established mCerulean/mVenus-based BiFC assays to achieve detection and visualization of four PPI pairs in the same cell. Additionally, we prove that with the complete spectral separation of mT-Sapphire and CyOFP1, LSS-FP-based BiFC assays can be readily combined with intensity-based FRET measurement to detect ternary protein complex formation with minimal spectral crosstalk. Thus, our newly developed LSS-FP-based BiFC assays not only expand the fluorescent protein toolbox available for BiFC but also facilitate the detection and visualization of multiple protein complex interactions in living cells.

  20. PFP/ESG: automated protein function prediction servers enhanced with Gene Ontology visualization tool.

    Science.gov (United States)

    Khan, Ishita K; Wei, Qing; Chitale, Meghana; Kihara, Daisuke

    2015-01-15

    Protein function prediction (PFP) is an automated function prediction method that predicts Gene Ontology (GO) annotations for a protein sequence using distantly related sequences and contextual associations of GO terms. Extended similarity group (ESG) is another GO prediction algorithm that makes predictions based on iterative sequence database searches. Here, we provide interactive web servers for the PFP and ESG algorithms that are equipped with an effective visualization of the GO predictions in a hierarchical topology. PFP/ESG servers are freely available at http://kiharalab.org/web/pfp.php and http://kiharalab.org/web/esg.php, or access both at http://kiharalab.org/pfp_esg.php. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Earth History databases and visualization - the TimeScale Creator system

    Science.gov (United States)

    Ogg, James; Lugowski, Adam; Gradstein, Felix

    2010-05-01

    The "TimeScale Creator" team (www.tscreator.org) and the Subcommission on Stratigraphic Information (stratigraphy.science.purdue.edu) of the International Commission on Stratigraphy (www.stratigraphy.org) have worked with numerous geoscientists and geological surveys to prepare reference datasets for global and regional stratigraphy. All events are currently calibrated to Geologic Time Scale 2004 (Gradstein et al., 2004, Cambridge Univ. Press) and Concise Geologic Time Scale (Ogg et al., 2008, Cambridge Univ. Press); but the array of intercalibrations enables dynamic adjustment to future numerical age scales and interpolation methods. The main "global" database contains over 25,000 events/zones from paleontology, geomagnetics, sea-level and sequence stratigraphy, igneous provinces, bolide impacts, plus several stable isotope curves and image sets. Several regional datasets are provided in conjunction with geological surveys, with numerical ages interpolated using a similar flexible inter-calibration procedure. For example, a joint program with Geoscience Australia has compiled an extensive Australian regional biostratigraphy and a full array of basin lithologic columns with each formation linked to public lexicons of all Proterozoic through Phanerozoic basins - nearly 500 columns of over 9,000 data lines plus hot-cursor links to oil-gas reference wells. Other datapacks include New Zealand biostratigraphy and basin transects (ca. 200 columns), Russian biostratigraphy, British Isles regional stratigraphy, Gulf of Mexico biostratigraphy and lithostratigraphy, high-resolution Neogene stable isotope curves and ice-core data, human cultural episodes, and Circum-Arctic stratigraphy sets. The growing library of datasets is designed for viewing and chart-making in the free "TimeScale Creator" JAVA package. This visualization system produces a screen display of the user-selected time-span and the selected columns of geologic time scale information. The user can change the

  2. SynProt: A Comprehensive Database for Proteins of the Detergent-Resistant Synaptic Junctions Fraction

    Directory of Open Access Journals (Sweden)

    Rainer Pielot

    2012-06-01

    Chemical synapses are highly specialized cell-cell contacts for communication between neurons in the CNS characterized by complex and dynamic protein networks at both synaptic membranes. The cytomatrix at the active zone (CAZ) organizes the apparatus for the regulated release of transmitters from the presynapse. At the postsynaptic side, the postsynaptic density constitutes the machinery for detection, integration and transduction of the transmitter signal. Both pre- and postsynaptic protein networks represent the molecular substrates for synaptic plasticity. Their function can be altered both by regulating their composition and by post-translational modification of their components. For a comprehensive understanding of synaptic networks the entire ensemble of synaptic proteins has to be considered. To support this, we established a comprehensive database for synaptic junction proteins (SynProt database) primarily based on proteomics data obtained from biochemical preparations of detergent-resistant synaptic junctions. The database currently contains 2,788 non-redundant entries of rat, mouse and some human proteins, which mainly have been manually extracted from twelve proteomic studies and annotated for synaptic subcellular localization. Each dataset is completed with manually added information including protein classifiers as well as automatically retrieved and updated information from public databases (UniProt and PubMed). We intend that the database will be used to support modeling of synaptic protein networks and rational experimental design.

  3. Improving classification in protein structure databases using text mining

    Directory of Open Access Journals (Sweden)

    Jones David T

    2009-05-01

    Background: The classification of protein domains in the CATH resource is primarily based on structural comparisons, sequence similarity and manual analysis. One of the main bottlenecks in the processing of new entries is the evaluation of 'borderline' cases by human curators with reference to the literature, and better tools for helping both expert and non-expert users quickly identify relevant functional information from text are urgently needed. A text based method for protein classification is presented, which complements the existing sequence and structure-based approaches, especially in cases exhibiting low similarity to existing members and requiring manual intervention. The method is based on the assumption that textual similarity between sets of documents relating to proteins reflects biological function similarities and can be exploited to make classification decisions. Results: An optimal strategy for the text comparisons was identified by using an established gold standard enzyme dataset. Filtering of the abstracts using a machine learning approach to discriminate sentences containing functional, structural and classification information that are relevant to the protein classification task improved performance. Testing this classification scheme on a dataset of 'borderline' protein domains that lack significant sequence or structure similarity to classified proteins showed that although, as expected, the structural similarity classifiers perform better on average, there is a significant benefit in incorporating text similarity in logistic regression models, indicating significant orthogonality in this additional information. Coverage was significantly increased especially at low error rates, which is important for routine classification tasks: 15.3% for the combined structure and text classifier compared to 10% for the structural classifier alone, at a 10^-3 error rate. Finally when only the highest scoring predictions were used
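
    The assumption stated in this abstract, that textual similarity between sets of documents can drive a classification decision, can be illustrated with a minimal sketch. It is not the authors' method: the tokenizer, the bag-of-words cosine score and the names `bag_of_words`, `cosine_similarity` and `best_class` are all assumptions chosen for illustration, and the abstract's sentence filtering and logistic regression steps are omitted.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple tokenizer (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_class(query_docs, class_docs):
    """Score the query's document set against each class's document set.

    query_docs: list of abstracts about the unclassified protein.
    class_docs: dict mapping a class label to a list of abstracts.
    Returns (best_label, scores_by_label).
    """
    query = bag_of_words(" ".join(query_docs))
    scores = {
        label: cosine_similarity(query, bag_of_words(" ".join(docs)))
        for label, docs in class_docs.items()
    }
    return max(scores, key=scores.get), scores
```

    In a fuller pipeline, a text score of this kind would be one input feature next to the structural similarity score, which is the combination the abstract reports as beneficial.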

  4. Integrated remote sensing and visualization (IRSV) system for transportation infrastructure operations and management, phase two, volume 4 : web-based bridge information database--visualization analytics and distributed sensing.

    Science.gov (United States)

    2012-03-01

    This report introduces the design and implementation of a Web-based bridge information visual analytics system. This project integrates Internet, multiple databases, remote sensing, and other visualization technologies. The result combines a GIS ...

  5. MoonProt: a database for proteins that are known to moonlight

    Science.gov (United States)

    Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.

    2015-01-01

    Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305

  6. The reactive metabolite target protein database (TPDB) – a web-accessible resource

    Directory of Open Access Journals (Sweden)

    Dong Yinghua

    2007-03-01

    Background: The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common targets of multiple toxins. The literature of this field is widely scattered but expanding rapidly, suggesting the need for a comprehensive, searchable database of reactive metabolite target proteins. Description: The Reactive Metabolite Target Protein Database (TPDB) is a comprehensive, curated, searchable, documented compilation of publicly available information on the protein targets of reactive metabolites of 18 well-studied chemicals and drugs of known toxicity. TPDB software enables (i) string searches for author names and protein names/synonyms, (ii) more complex searches by selecting chemical compound, animal species, target tissue and protein names/synonyms from pull-down menus, and (iii) commonality searches over multiple chemicals. Tabulated search results provide information, references and links to other databases. Conclusion: The TPDB is a unique on-line compilation of information on the covalent modification of cellular proteins by reactive metabolites of chemicals and drugs. Its comprehensiveness and searchability should facilitate the elucidation of mechanisms of reactive metabolite toxicity. The database is freely available at http://tpdb.medchem.ku.edu/tpdb.html

  7. Tools and procedures for visualization of proteins and other biomolecules.

    Science.gov (United States)

    Pan, Lurong; Aller, Stephen G

    2015-04-01

    Protein, peptides, and nucleic acids are biomolecules that drive biological processes in living organisms. An enormous amount of structural data for a large number of these biomolecules has been described with atomic precision in the form of structural "snapshots" that are freely available in public repositories. These snapshots can help explain how the biomolecules function, the nature of interactions between multi-molecular complexes, and even how small-molecule drugs can modulate the biomolecules for clinical benefits. Furthermore, these structural snapshots serve as inputs for sophisticated computer simulations to turn the biomolecules into moving, "breathing" molecular machines for understanding their dynamic properties in real-time computer simulations. In order for the researcher to take advantage of such a wealth of structural data, it is necessary to gain competency in the use of computer molecular visualization tools for exploring the structures and visualizing three-dimensional spatial representations. Here, we present protocols for using two common visualization tools--the Web-based Jmol and the stand-alone PyMOL package--as well as a few examples of other popular tools. Copyright © 2015 John Wiley & Sons, Inc.

  8. The REFOLD database: a tool for the optimization of protein expression and refolding

    Science.gov (United States)

    Chow, Michelle K. M.; Amin, Abdullah A.; Fulton, Kate F.; Fernando, Thushan; Kamau, Lawrence; Batty, Chris; Louca, Michael; Ho, Storm; Whisstock, James C.; Bottomley, Stephen P.; Buckle, Ashley M.

    2006-01-01

    A large proportion of proteins expressed in Escherichia coli form inclusion bodies and thus require renaturation to attain a functional conformation for analysis. In this process, identifying and optimizing the refolding conditions and methodology is often rate limiting. In order to address this problem, we have developed REFOLD, a web-accessible relational database containing the published methods employed in the refolding of recombinant proteins. Currently, REFOLD contains >300 entries, which are heavily annotated such that the database can be searched via multiple parameters. We anticipate that REFOLD will continue to grow and eventually become a powerful tool for the optimization of protein renaturation. REFOLD is freely available at . PMID:16381847

  9. An overview of human protein databases and their application to functional proteomics in health and disease.

    Science.gov (United States)

    Zhang, YanQiong; Zhu, YunPing; He, FuChu

    2011-11-01

    Functional proteomics can be defined as a strategy to couple proteomic information with biochemical and physiological analyses with the aim of understanding better the functions of proteins in normal and diseased organs. In recent years, a variety of publicly available bioinformatics databases have been developed to support protein-related information management and biological knowledge discovery. In addition to being used to annotate the proteome, these resources also offer the opportunity to develop global approaches to the study of the functional role of proteins both in health and disease. Here, we present a comprehensive review of the major human protein bioinformatics databases. We conclude this review by discussing a few examples that illustrate the importance of these databases in functional proteomics research.

  10. Visualization of protein folding funnels in lattice models.

    Science.gov (United States)

    Oliveira, Antonio B; Fatore, Francisco M; Paulovich, Fernando V; Oliveira, Osvaldo N; Leite, Vitor B P

    2014-01-01

    Protein folding occurs in a very high dimensional phase space with an exponentially large number of states, and according to the energy landscape theory it exhibits a topology resembling a funnel. In this statistical approach, the folding mechanism is unveiled by describing the local minima in an effective one-dimensional representation. Other approaches based on potential energy landscapes address the hierarchical structure of local energy minima through disconnectivity graphs. In this paper, we introduce a metric to describe the distance between any two conformations, which also allows us to go beyond the one-dimensional representation and visualize the folding funnel in 2D and 3D. In this way it is possible to assess the folding process in detail, e.g., by identifying the connectivity between conformations and establishing the paths to reach the native state, in addition to regions where trapping may occur. Unlike the disconnectivity maps method, which is based on the kinetic connections between states, our methodology is based on structural similarities inferred from the new metric. The method was developed in a 27-mer protein lattice model, folded into a 3×3×3 cube. Five sequences were studied and distinct funnels were generated in an analysis restricted to conformations from the transition-state to the native configuration. Consistent with the expected results from the energy landscape theory, folding routes can be visualized to probe different regions of the phase space, as well as determine the difficulty in folding of the distinct sequences. Changes in the landscape due to mutations were visualized, with the comparison between wild and mutated local minima in a single map, which serves to identify different trapping regions. The extension of this approach to more realistic models and its use in combination with other approaches are discussed.
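
    The abstract does not define its metric, so the following is only a sketch of one possible structural distance between two lattice conformations: the number of non-bonded contacts present in one conformation but not the other. The contact-based definition and the names `contacts` and `contact_distance` are assumptions for illustration, not the metric introduced in the paper.

```python
def contacts(conformation):
    """Non-bonded residue pairs (i, j) occupying adjacent cubic-lattice sites.

    conformation: list of (x, y, z) integer sites, one per residue, in chain order.
    """
    index_of = {site: i for i, site in enumerate(conformation)}
    # Positive unit steps suffice: every adjacent pair is found once,
    # starting from the site with the smaller coordinate.
    steps = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    found = set()
    for i, (x, y, z) in enumerate(conformation):
        for dx, dy, dz in steps:
            j = index_of.get((x + dx, y + dy, z + dz))
            # Skip empty sites and chain neighbours (bonded residues).
            if j is not None and abs(i - j) > 1:
                found.add((min(i, j), max(i, j)))
    return found


def contact_distance(conf_a, conf_b):
    """Number of contacts that appear in exactly one of the two conformations."""
    return len(contacts(conf_a) ^ contacts(conf_b))


# A 4-residue chain: folded into a square versus fully extended.
folded = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
extended = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
print(contact_distance(folded, extended))
```

    A pairwise distance of this kind is the sort of input that a 2D or 3D projection method needs in order to lay out conformations so that structurally similar states sit close together.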

  11. Visualizing Protein Interactions and Dynamics: Evolving a Visual Language for Molecular Animation

    Science.gov (United States)

    Jenkinson, Jodie; McGill, Gaël

    2012-01-01

    Undergraduate biology education provides students with a number of learning challenges. Subject areas that are particularly difficult to understand include protein conformational change and stability, diffusion and random molecular motion, and molecular crowding. In this study, we examined the relative effectiveness of three-dimensional visualization techniques for learning about protein conformation and molecular motion in association with a ligand–receptor binding event. Increasingly complex versions of the same binding event were depicted in each of four animated treatments. Students (n = 131) were recruited from the undergraduate biology program at University of Toronto, Mississauga. Visualization media were developed in the Center for Molecular and Cellular Dynamics at Harvard Medical School. Stem cell factor ligand and cKit receptor tyrosine kinase were used as a classical example of a ligand-induced receptor dimerization and activation event. Each group completed a pretest, viewed one of four variants of the animation, and completed a posttest and, at 2 wk following the assessment, a delayed posttest. Overall, the most complex animation was the most effective at fostering students' understanding of the events depicted. These results suggest that, in select learning contexts, increasingly complex representations may be more desirable for conveying the dynamic nature of cell binding events. PMID:22383622

  12. ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval

    Directory of Open Access Journals (Sweden)

    Wang Jingyan

    2012-05-01

    Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. This kind of pairwise measure suffers from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy of the retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning the contextual dissimilarity/similarity measures can improve the retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database. Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and get the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity

  13. ProDis-ContSHC: Learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-05-08

    Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. This kind of pairwise measure suffers from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy of the retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning the contextual dissimilarity/similarity measures can improve the retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database. Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and get the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update
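
    The basic idea in this abstract (shrink dij when the contexts N(i) and N(j) overlap) can be sketched as follows. This is an unsupervised toy, not ProDis-ContSHC itself: the k-nearest-neighbour context that includes the protein itself, the Jaccard overlap, the fixed factor 1 / (1 + overlap) and the names `context`, `context_similarity` and `regularized_dissimilarity` are all assumptions. In the paper the factor is learned with an SVM from class labels and hierarchical sub-contexts.

```python
def context(d, i, k):
    """The k proteins closest to protein i under dissimilarity matrix d.

    Protein i itself (dissimilarity 0) is included; ties are broken by index.
    """
    ranked = sorted(range(len(d)), key=lambda j: d[i][j])
    return set(ranked[:k])


def context_similarity(d, i, j, k):
    """Jaccard overlap of the two contexts, in [0, 1]."""
    n_i, n_j = context(d, i, k), context(d, j, k)
    return len(n_i & n_j) / len(n_i | n_j)


def regularized_dissimilarity(d, i, j, k):
    """Shrink d[i][j] when the contexts of i and j agree.

    The divisor 1 + overlap is a hand-picked stand-in for the learned factor.
    """
    return d[i][j] / (1.0 + context_similarity(d, i, j, k))


# Toy symmetric dissimilarity matrix: proteins 0-2 are close, protein 3 is far.
d = [
    [0, 1, 2, 9],
    [1, 0, 2, 9],
    [2, 2, 0, 9],
    [9, 9, 9, 0],
]
print(regularized_dissimilarity(d, 0, 1, k=3))  # identical contexts, halved
print(regularized_dissimilarity(d, 0, 3, k=3))  # partial overlap, shrunk less
```

    Proteins 0 and 1 share the same three-member context, so their dissimilarity is halved, while the outlier pair (0, 3) keeps a larger share of its original value.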

  14. SPROUTS: a database for the evaluation of protein stability upon point mutation

    OpenAIRE

    Lonquety, Mathieu; Lacroix, Zoé; Papandreou, Nikolaos; Chomilier, Jacques

    2008-01-01

    SPROUTS (Structural Prediction for pRotein fOlding UTility System) is a new database that provides access to various structural data sets and integrated functionalities not yet available to the community. The originality of the SPROUTS database is the ability to gain access to a variety of structural analyses at one place and with a strong interaction between them. SPROUTS currently combines data pertaining to 429 structures that capture representative folds and results related to the predict...

  15. VaProS: a database-integration approach for protein/genome information retrieval

    KAUST Repository

    Gojobori, Takashi

    2016-12-24

    Life science research now relies heavily on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve experts’ knowledge stored in the databases and build new hypotheses about their research target. This web application, named VaProS, emphasizes the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in a quest for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  16. dbPSP: a curated database for protein phosphorylation sites in prokaryotes.

    Science.gov (United States)

    Pan, Zhicheng; Wang, Bangshan; Zhang, Ying; Wang, Yongbo; Ullah, Shahid; Jian, Ren; Liu, Zexian; Xue, Yu

    2015-01-01

    As one of the most important post-translational modifications, phosphorylation is involved in almost all biological processes through temporally and spatially modifying substrate proteins. Recently, phosphorylation in prokaryotes has attracted much attention for its critical roles in various cellular processes such as signal transduction. Thus, an integrative data resource of prokaryotic phosphorylation will be useful for further analysis. In this study, we presented a curated database of phosphorylation sites in prokaryotes (dbPSP, Database URL: http://dbpsp.biocuckoo.org) for 96 prokaryotic organisms, which belong to 11 phyla in two domains including bacteria and archaea. From the scientific literature, we manually collected experimentally identified phosphorylation sites on seven types of residues, including serine, threonine, tyrosine, aspartic acid, histidine, cysteine and arginine. In total, the dbPSP database contains 7391 phosphorylation sites in 3750 prokaryotic proteins. With the dataset, the sequence preferences of the phosphorylation sites and functional annotations of the phosphoproteins were analyzed, and the results show that there were obvious differences among phosphorylation in bacteria, archaea and eukaryotes. All the phosphorylation sites were annotated with original references and other descriptions in the database, which can be easily accessed through a user-friendly web interface including various search and browse options. Taken together, the dbPSP database provides a comprehensive data resource for further studies of protein phosphorylation in prokaryotes. Database URL: http://dbpsp.biocuckoo.org © The Author(s) 2015. Published by Oxford University Press.

  17. PhasePlot: A Software Program for Visualizing Phase Relations Computed Using Thermochemical Models and Databases

    Science.gov (United States)

    Ghiorso, M. S.

    2011-12-01

    A new software program has been developed for Macintosh computers that permits the visualization of phase relations calculated from thermodynamic data-model collections. The data-model collections of MELTS (Ghiorso and Sack, 1995, CMP 119, 197-212), pMELTS (Ghiorso et al., 2002, G-cubed 3, 10.1029/2001GC000217) and the deep mantle database of Stixrude and Lithgow-Bertelloni (2011, GJI 184, 1180-1213) are currently implemented. The software allows users to enter a system bulk composition and a range of reference conditions and then calculate a grid of phase relations. These relations may be visualized in a variety of ways including phase diagrams, phase proportion plots, and contour diagrams of phase compositions and abundances. Results may be exported into Excel or similar spreadsheet applications. Flexibility in stipulating reference conditions permits the construction of temperature-pressure, temperature-volume, entropy-pressure, or entropy-volume display grids. Calculations on the grid are performed for fixed bulk composition or in open systems governed by user-specified constraints on component chemical potentials (e.g., specified oxygen fugacity buffers). The calculation engine for the software is optimized for multi-core compute architectures and is very fast, allowing a typical grid of 64 points to be calculated in under 10 seconds on a dual-core laptop/iMac. The underlying computational thermodynamic algorithms have been optimized for speed and robust behavior. Taken together, both of these advances facilitate in-classroom demonstrations and permit novice users to work with the program effectively, focusing on problem specification and interpretation of results rather than on manipulation and mechanics of computation - a key feature of an effective instructional tool. The emphasis in this software package is graphical visualization, which aids in better comprehension of complex phase relations in multicomponent systems. Anecdotal experience in using Phase

  18. ProtNA-ASA: Protein-nucleic acid structural database with information on accessible surface area

    Science.gov (United States)

    Tkachenko, M. Y.; Boryskina, O. P.; Shestopalova, A. V.; Tolstorukov, M. Y.

    The article describes a new database (ProtNA-ASA), which combines data on conformational parameters of nucleic acids with calculations of the accessible surface area (ASA) of nucleic acid atoms in protein-DNA/RNA complexes. As of October 2008, the database contains 214 DNA-protein and 28 RNA-protein non-homologous complexes. The database provides structural parameters that describe the local geometry of base pairs and base-pair steps as well as backbone torsion angles. Additionally, the total ASA of DNA/RNA atoms and the accessible area of atoms in the minor and major grooves are calculated. The ProtNA-ASA database facilitates studying the relationship between the DNA/RNA conformation and the availability of atoms for contact with proteins in either the major or the minor groove for different nucleotides. Such an analysis is important for understanding the principles of molecular recognition, including indirect sequence readout. The database is publicly available for use at http://www.protna.bio-page.org.

  19. An update of the DEF database of protein fold class predictions

    DEFF Research Database (Denmark)

    Reczko, Martin; Karras, Dimitris; Bohr, Henrik

    1997-01-01

    An update is given on the Database of Expected Fold classes (DEF) that contains a collection of fold-class predictions made from protein sequences and a mail server that provides new predictions for new sequences. To any given sequence one of 49 fold-classes is chosen to classify the structure...... related to the sequence with high accuracy. The updated prediction system is developed using data from the new version of the 3D-ALI database of aligned protein structures and thus gives more reliable and more detailed predictions than the previous DEF system....

  20. dbDiarrhea: the database of pathogen proteins and vaccine antigens from diarrheal pathogens.

    Science.gov (United States)

    Ramana, Jayashree; Tamanna

    2012-12-01

    Diarrhea occurs world-wide and is most commonly caused by gastrointestinal infections which kill around 2.2 million people globally each year, mostly children in developing countries. We describe here dbDiarrhea, which is currently the most comprehensive catalog of proteins implicated in the pathogenesis of diarrhea caused by major bacterial, viral and parasitic species. The current release of the database houses 820 proteins gleaned through an extensive and critical survey of research articles from PubMed. The major contributors to this compendium of proteins are Escherichia coli and Salmonella enterica. These proteins are classified into different categories such as Type III secretion system effectors, Type III secretion system components, and Pathogen proteins. There is another complementary module called 'Host proteins'. dbDiarrhea also serves as a repository of the research articles describing (1) trials of subunit and whole organism vaccines, (2) high-throughput screening of Type III secretion system inhibitors and (3) diagnostic assays, for various diarrheal pathogens. The database is web accessible through an intuitive user interface that allows querying proteins and research articles by organism, keyword and accession number. Besides providing search through browsing, the database supports sequence similarity search with the BLAST tool. With the rapidly growing global burden of diarrhea, we anticipate that this database will serve as a source of useful information for furthering research on diarrhea. The database can be freely accessed at http://www.juit.ac.in/attachments/dbdiarrhea/diarrhea_home.html. Copyright © 2012 Elsevier B.V. All rights reserved.

  1. HIP2: An online database of human plasma proteins from healthy individuals

    Directory of Open Access Journals (Sweden)

    Shen Changyu

    2008-04-01

    Full Text Available Abstract Background: With the introduction of increasingly powerful mass spectrometry (MS) techniques for clinical research, several recent large-scale MS proteomics studies have sought to characterize the entire human plasma proteome with a general objective for identifying thousands of proteins leaked from tissues in the circulating blood. Understanding the basic constituents, diversity, and variability of the human plasma proteome is essential to the development of sensitive molecular diagnosis and treatment monitoring solutions for future biomedical applications. Biomedical researchers today, however, do not have an integrated online resource in which they can search for plasma proteins collected from different mass spectrometry platforms, experimental protocols, and search software for healthy individuals. The lack of such a resource for comparisons has made it difficult to interpret proteomics profile changes in patients' plasma and to design protein biomarker discovery experiments. Description: To aid future protein biomarker studies of disease and health from human plasma, we developed an online database, HIP2 (Healthy Human Individual's Integrated Plasma Proteome). The current version contains 12,787 protein entries linked to 86,831 peptide entries identified using different MS platforms. Conclusion: This web-based database will be useful to biomedical researchers involved in biomarker discovery research. This database has been developed to be the comprehensive collection of healthy human plasma proteins, and has protein data captured in a relational database schema built to contain mappings of supporting peptide evidence from several high-quality and high-throughput mass-spectrometry (MS) experimental data sets. Users can search for plasma protein/peptide annotations, peptide/protein alignments, and experimental/sample conditions with options for filter-based retrieval to achieve greater analytical power for discovery and validation.

  2. Visualization portal for genetic variation (VizGVar): a tool for interactive visualization of SNPs and somatic mutations in exons, genes and protein domains.

    Science.gov (United States)

    Román, Antonio Solano; Alfaro, Verónica; Cruz, Carlos; Solano, Allan Orozco

    2017-10-30

    VizGVar was designed to meet the growing need of the research community for improved genomic and proteomic data viewers that benefit from better information visualization. We implemented a new information architecture and applied user-centered design principles to provide a new, improved way of visualizing genetic information and protein data related to human disease. VizGVar connects the entire database of Ensembl protein motifs, domains, genes and exons with annotated SNPs and somatic variations from PharmGKB and COSMIC. VizGVar precisely represents genetic variations and their respective locations using colored curves to designate different types of variations. The structured hierarchy of biological data is reflected in aggregated patterns through different levels, integrating several layers of information at once. VizGVar provides a new interactive, web-based JavaScript visualization of somatic mutations and protein variation, enabling fast and easy discovery of clinically relevant variation patterns. VizGVar is accessible at http://vizport.io/vizgvar. Documentation: http://vizport.io/vizgvar/doc/. Contact: asolano@broadinstitute.org, allan.orozcosolano@ucr.ac.cr.

  3. Identification and Validation of Human Missing Proteins and Peptides in Public Proteome Databases: Data Mining Strategy.

    Science.gov (United States)

    Elguoshy, Amr; Hirao, Yoshitoshi; Xu, Bo; Saito, Suguru; Quadery, Ali F; Yamamoto, Keiko; Mitsui, Toshiaki; Yamamoto, Tadashi

    2017-12-01

    In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the investigation of missing proteins (MPs) in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of the RNA expression pattern for missing proteins in the Human Protein Atlas showed that 28% of them, such as Olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity to consider uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had an elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in the tissues where they are elevated. Additionally, the analysis of RNA expression level for missing proteins showed that 95% had no or low expression (0-10 transcripts per million), indicating that low abundance is one of the major obstacles facing the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than the identified proteins. Searching for these predicted unique tryptic peptides that correspond to missing and uncertain proteins in the experimental peptide list of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra for the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.
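
    The step of predicting unique tryptic peptides can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and uses made-up sequences and hypothetical function names; only the 9-residue minimum length is taken from the abstract.

```python
import re


def tryptic_peptides(sequence, min_length=9):
    """In-silico trypsin digest: cleave after K or R unless followed by P."""
    # Splitting on a zero-width match requires Python 3.7 or later.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def unique_peptides(proteins, min_length=9):
    """Map each protein ID to peptides found in no other protein's digest."""
    digests = {
        pid: set(tryptic_peptides(seq, min_length)) for pid, seq in proteins.items()
    }
    unique = {}
    for pid, peps in digests.items():
        others = set().union(*(d for other, d in digests.items() if other != pid))
        unique[pid] = peps - others
    return unique


# Made-up sequences for illustration only (not real proteins).
proteins = {
    "P1": "MASTEELLKAAGTWSPLVNDRPEPTIDEK",
    "P2": "MASTEELLKGGHHTTYYWWLLR",
}
result = unique_peptides(proteins)
```

    Here the shared N-terminal peptide is discarded for both entries, and each protein keeps only the peptide that distinguishes it; a real analysis would compare against the whole proteome rather than two sequences.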

  4. DOPA: GPU-based protein alignment using database and memory access optimizations

    Directory of Open Access Journals (Sweden)

    Hasan Laiq

    2011-07-01

    Full Text Available Abstract Background: The Smith-Waterman (S-W) algorithm is an optimal sequence alignment method for biological databases, but its computational complexity makes it too slow for practical purposes. Heuristics-based approximate methods like FASTA and BLAST provide faster solutions but at the cost of reduced accuracy. Also, the expanding volume and varying lengths of sequences necessitate performance-efficient restructuring of these databases. Thus, to come up with an accurate and fast solution, it is highly desirable to speed up the S-W algorithm. Findings: This paper presents a high performance protein sequence alignment implementation for Graphics Processing Units (GPUs). The new implementation improves performance by optimizing the database organization and reducing the number of memory accesses to eliminate bandwidth bottlenecks. The implementation is called Database Optimized Protein Alignment (DOPA), and it achieves a performance of 21.4 Giga Cell Updates Per Second (GCUPS), which is 1.13 times better than the fastest GPU implementation to date. Conclusions: In the new GPU-based implementation for protein sequence alignment (DOPA), the database is organized in equal length sequence sets. This equally distributes the workload among all the threads on the GPU's multiprocessors. The result is an improved performance which is better than the fastest available GPU implementation.
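
    For readers unfamiliar with the terms in this abstract, the following CPU-only sketch shows the Smith-Waterman recurrence, one simple way to group a database into equal-length sets, and how a GCUPS figure is computed. It is not the DOPA implementation: the match/mismatch/linear-gap scores, the exact-length bucketing and all function names are assumptions for illustration (protein aligners normally use a substitution matrix and affine gaps).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def bucket_by_length(database):
    """Group sequences so that each set holds sequences of equal length."""
    buckets = {}
    for seq in database:
        buckets.setdefault(len(seq), []).append(seq)
    return buckets


def gcups(query_len, database_residues, seconds):
    """Giga cell updates per second: DP cells filled / time / 1e9."""
    return query_len * database_residues / seconds / 1e9
```

    Equal-length sets matter on a GPU because threads that align sequences of the same length finish together, so no thread idles while a neighbor works through a longer sequence.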

  5. Medicago PhosphoProtein Database: a repository for Medicago truncatula phosphoprotein data

    OpenAIRE

    Rose, Christopher M.; Venkateshwaran, Muthusubramanian; Grimsrud, Paul A.; Westphall, Michael S.; Sussman, Michael R.; Coon, Joshua J.; Ané, Jean-Michel

    2012-01-01

    The ability of legume crops to fix atmospheric nitrogen via a symbiotic association with soil rhizobia makes them an essential component of many agricultural systems. Initiation of this symbiosis requires protein phosphorylation-mediated signaling in response to rhizobial signals named Nod factors. Medicago truncatula (Medicago) is the model system for studying legume biology, making the study of its phosphoproteome essential. Here, we describe the Medicago PhosphoProtein Database (MPPD; http...

  6. Visualization Tools and Techniques for Search and Validation of Large Earth Science Spatial-Temporal Metadata Databases

    Science.gov (United States)

    Baskin, W. E.; Herbert, A.; Kusterer, J.

    2014-12-01

    Spatial-temporal metadata databases are critical components of interactive data discovery services for ordering Earth Science datasets. The development staff at the Atmospheric Science Data Center (ASDC) works closely with satellite Earth Science mission teams such as CERES, CALIPSO, TES, MOPITT, and CATS to create and maintain metadata databases that are tailored to the data discovery needs of the Earth Science community. This presentation focuses on the visualization tools and techniques used by the ASDC software development team for data discovery and validation/optimization of spatial-temporal objects in large multi-mission spatial-temporal metadata databases. The following topics will be addressed: (1) optimizing the level of detail of spatial-temporal metadata to provide interactive spatial query performance over a multi-year Earth Science mission; (2) generating appropriately scaled sensor footprint gridded (raster) metadata from Level1 and Level2 Satellite and Aircraft time-series data granules; (3) performance comparison of raster vs vector spatial granule footprint mask queries in large metadata databases, and a description of the visualization tools used to assist with this analysis.

  7. Human Protein Reference Database and Human Proteinpedia as resources for phosphoproteome analysis.

    Science.gov (United States)

    Goel, Renu; Harsha, H C; Pandey, Akhilesh; Prasad, T S Keshava

    2012-02-01

    Human Protein Reference Database (HPRD) is a rich resource of experimentally proven features of human proteins. Protein information in HPRD includes protein-protein interactions, post-translational modifications, enzyme/substrate relationships, disease associations, tissue expression, and subcellular localization of human proteins. Although protein-protein interaction data from HPRD have been widely used by the scientific community, its phosphoproteome data have not been exploited to their full potential. HPRD is one of the largest documentations of human phosphoproteins in the public domain. Currently, phosphorylation data in HPRD comprise 95,016 phosphosites mapped onto 13,041 proteins. Additionally, enzyme-substrate reactions responsible for 5930 phosphorylation events were also documented. Significant improvements in technologies and high-throughput platforms in biomedical investigations led to an exponential increase of biological data and phosphoproteomic data in recent years. Human Proteinpedia, a community annotation portal developed by us, has also contributed to the significant increase in phosphoproteomic data in HPRD. A large number of phosphorylation events have been mapped onto reference sequences available in HPRD and Human Proteinpedia along with associated protein features. This will provide a platform for systems biology approaches to determine the role of protein phosphorylation in protein function, cell signaling, biological processes and their implication in human diseases. This review aims to provide a composite view of phosphoproteomic data pertaining to human proteins in HPRD and Human Proteinpedia.

  8. Usability Testing of a Large, Multidisciplinary Library Database: Basic Search and Visual Search

    Directory of Open Access Journals (Sweden)

    Jody Condit Fagan

    2006-09-01

    Full Text Available Visual search interfaces have been shown by researchers to assist users with information search and retrieval. Recently, several major library vendors have added visual search interfaces or functions to their products. For public service librarians, perhaps the most critical area of interest is the extent to which visual search interfaces and text-based search interfaces support research. This study presents the results of eight full-scale usability tests of both the EBSCOhost Basic Search and Visual Search in the context of a large liberal arts university.

  9. Interleukin-1beta induced changes in the protein expression of rat islets: a computerized database

    DEFF Research Database (Denmark)

    Andersen, H U; Fey, S J; Larsen, Peter Mose

    1997-01-01

    as well as the intracellular mechanisms of action of interleukin 1-mediated beta-cell cytotoxicity are unknown. However, previous studies have found an association of beta-cell destruction with alterations in protein synthesis. Thus, two-dimensional (2-D) gel electrophoresis of pancreatic islet proteins......% of %IOD was 45.7% in the NEPHGE gels. Addition of interleukin-1beta (IL-1beta) to the cultures resulted in statistically significant modulation or de novo synthesis of 105 proteins in the 10% gels. In conclusion, we present the first 10% and 15% acrylamide 2-D gel protein databases of neonatal rat islets...... may be an important tool facilitating studies of the molecular pathogenesis of insulin-dependent diabetes mellitus. 2-D gel electrophoresis of islet proteins may lead to (i) the determination of qualitative and quantitative changes in specific islet proteins induced by cytokines, (ii...

  10. Three-Dimensional Visualizations in Teaching Genomics and Bioinformatics: Mutations in HIV Envelope Proteins and Their Consequences for Vaccine Design

    Directory of Open Access Journals (Sweden)

    Kathy Takayama

    2009-11-01

    Full Text Available This project addresses the need to provide a visual context to teach the practical applications of genome sequencing and bioinformatics. Present-day research relies on indirect visualization techniques (e.g., fluorescence-labeling of DNA in sequencing reactions) and sophisticated computer analysis. Such methods are impractical and prohibitively expensive for laboratory classes. More importantly, there is a need for curriculum resources that visually demonstrate the application of genome sequence information rather than the DNA sequencing methodology itself. This project is a computer-based lesson plan that engages students in collaborative, problem-based learning. The specific example focuses on approaches to Human Immunodeficiency Virus-1 (HIV-1) vaccine design based on HIV-1 genome sequences using a case study. Students performed comparative alignments of variant HIV-1 sequences available from a public database. Students then examined the consequences of HIV-1 mutations by applying the alignments to three-dimensional images of the HIV-1 envelope protein structure, thus visualizing the implications for applications such as vaccine design. The lesson enhances problem solving through the application of one type of information (genomic or protein sequence) into concrete visual conceptualizations. Assessment of student comprehension and problem-solving ability revealed marked improvement after the computer tutorial. Furthermore, contextual presentation of these concepts within a case study resulted in student responses that demonstrated higher levels of cognitive ability than was expected by the instructor.

  11. PASS2: an automated database of protein alignments organised as structural superfamilies

    Directory of Open Access Journals (Sweden)

    Sowdhamini Ramanathan

    2004-04-01

    Full Text Available Abstract Background: The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins. Description: An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of

  12. Kinomer v. 1.0: a database of systematically classified eukaryotic protein kinases.

    Science.gov (United States)

    Martin, David M A; Miranda-Saavedra, Diego; Barton, Geoffrey J

    2009-01-01

    The regulation of protein function through reversible phosphorylation by protein kinases and phosphatases is a general mechanism controlling virtually every cellular activity. Eukaryotic protein kinases can be classified into distinct, well-characterized groups based on amino acid sequence similarity and function. We recently reported a highly sensitive and accurate hidden Markov model-based method for the automatic detection and classification of protein kinases into these specific groups. The Kinomer v. 1.0 database presented here contains annotated classifications for the protein kinase complements of 43 eukaryotic genomes. These span the taxonomic range and include fungi (16 species), plants (6), diatoms (1), amoebas (2), protists (1) and animals (17). The kinomes are stored in a relational database and are accessible through a web interface on the basis of species, kinase group or a combination of both. In addition, the Kinomer v. 1.0 HMM library is made available for users to perform classification on arbitrary sequences. The Kinomer v. 1.0 database is a continually updated resource where direct comparison of kinase sequences across kinase groups and across species can give insights into kinase function and evolution. Kinomer v. 1.0 is available at http://www.compbio.dundee.ac.uk/kinomer/.

  13. PTM-SD: a database of structurally resolved and annotated posttranslational modifications in proteins.

    Science.gov (United States)

    Craveur, Pierrick; Rebehmed, Joseph; de Brevern, Alexandre G

    2014-01-01

    Posttranslational modifications (PTMs) define covalent and chemical modifications of protein residues. They play important roles in modulating various biological functions. Current PTM databases contain important sequence annotations but do not provide an informative 3D structural resource about these modifications. Posttranslational modification structural database (PTM-SD) provides access to structurally solved modified residues, which are experimentally annotated as PTMs. It combines different PTM information and annotation gathered from other databases, e.g. Protein DataBank for the protein structures and dbPTM and PTMCuration for fine sequence annotation. PTM-SD gives an accurate detection of PTMs in structural data. PTM-SD can be browsed by PDB id, UniProt accession number, organism and classic PTM annotation. Advanced queries can also be performed, i.e. detailed PTM annotations, amino acid type, secondary structure, SCOP class classification, PDB chain length and number of PTMs by chain. Statistics and analyses can be computed on a selected dataset of PTMs. Each PTM entry is detailed in a dedicated page with information on the protein sequence, local conformation with secondary structure and Protein Blocks. PTM-SD gives valuable information on observed PTMs in protein 3D structure, which is of great interest for studying sequence-structure-function relationships in the light of PTMs, and could provide insights for comparative modeling and PTM prediction protocols. Database URL: PTM-SD can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/PTM-SD/. © The Author(s) 2014. Published by Oxford University Press.

  14. Open, Cross Platform Chemistry Application Unifying Structure Manipulation, External Tools, Databases and Visualization

    Science.gov (United States)

    2014-05-30

    ...project is the creation of the leading computational chemistry workbench, making the premier computational chemistry codes and databases easily accessible to chemistry practitioners. This has been

  15. Phi-square Lexical Competition Database (Phi-Lex): an online tool for quantifying auditory and visual lexical competition.

    Science.gov (United States)

    Strand, Julia F

    2014-03-01

    A widely agreed-upon feature of spoken word recognition is that multiple lexical candidates in memory are simultaneously activated in parallel when a listener hears a word, and that those candidates compete for recognition (Luce, Goldinger, Auer, & Vitevitch, Perception 62:615-625, 2000; Luce & Pisoni, Ear and Hearing 19:1-36, 1998; McClelland & Elman, Cognitive Psychology 18:1-86, 1986). Because the presence of those competitors influences word recognition, much research has sought to quantify the processes of lexical competition. Metrics that quantify lexical competition continuously are more effective predictors of auditory and visual (lipread) spoken word recognition than are the categorical metrics traditionally used (Feld & Sommers, Speech Communication 53:220-228, 2011; Strand & Sommers, Journal of the Acoustical Society of America 130:1663-1672, 2011). A limitation of the continuous metrics is that they are somewhat computationally cumbersome and require access to existing speech databases. This article describes the Phi-square Lexical Competition Database (Phi-Lex): an online, searchable database that provides access to multiple metrics of auditory and visual (lipread) lexical competition for English words, available at www.juliastrand.com/phi-lex .

  16. PATtyFams: Protein families for the microbial genomes in the PATRIC database

    Directory of Open Access Journals (Sweden)

    James J Davis

    2016-02-01

    Full Text Available The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.
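
    The Markov Cluster step mentioned in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the PATtyFams pipeline: the function names, the inflation value of 2.0, the fixed iteration count and the toy graph are all assumptions chosen for illustration. It only shows the generic MCL loop of expansion (matrix squaring) followed by inflation (element-wise power and column renormalisation).

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def expand(m):
    """Expansion: square the matrix, i.e. take two-step random walks."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise each entry to `power`, then renormalise the columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster a graph given as a symmetric, non-negative weight matrix.

    `inflation`, `iterations` and `eps` are illustrative defaults (assumptions).
    """
    n = len(adjacency)
    # Add self-loops, a common MCL convention that stabilises the iteration.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row that keeps non-zero mass defines a cluster: its non-zero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy similarity graph (hypothetical): a triangle 0-1-2 and a separate pair 3-4.
toy_graph = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
families = mcl(toy_graph)
```

    In this toy case the two disconnected groups come out as two separate clusters; on real similarity data the inflation parameter controls how finely groups are split.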

  17. ProTherm, version 4.0: thermodynamic database for proteins and mutants.

    Science.gov (United States)

    Bava, K Abdulla; Gromiha, M Michael; Uedaira, Hatsuho; Kitajima, Koji; Sarai, Akinori

    2004-01-01

    Release 4.0 of ProTherm, thermodynamic database for proteins and mutants, contains approximately 14,500 numerical data (approximately 450% of the first version) of several thermodynamic parameters along with experimental methods and conditions, and structural, functional and literature information. The sequence and structural information of proteins is connected with thermodynamic data through links between entries in Protein Data Bank, Protein Information Resource and SWISS-PROT and the data in ProTherm. We have separated the Gibbs free energy change obtained at extrapolated temperature from the data on denaturation temperature measured by the thermal denaturation method. We have added the statistics of amino acid replacements and links to homologous structures to each protein. Further, we have improved the search and display options to enhance search capability through the web interface. ProTherm is freely available at http://gibk26.bse.kyutech.ac.jp/jouhou/Protherm/protherm.html.

  18. RiceRBP: a database of experimentally identified RNA-binding proteins in Oryza sativa L.

    Science.gov (United States)

    Morris, Robert T; Doroshenk, Kelly A; Crofts, Andrew J; Lewis, Nicholas; Okita, Thomas W; Wyrick, John J

    2011-02-01

    RNA-binding proteins play critical roles at multiple steps during gene expression, including mRNA transport and translation. mRNA transport is particularly important in rice (Oryza sativa L.) in order to ensure the proper localization of the prolamine and glutelin seed storage proteins. However, relatively little information is available about RNA-binding proteins that have been isolated or characterized in plants. The RiceRBP database is a novel resource for the analysis of RNA-binding proteins in rice. RiceRBP contains 257 experimentally identified RNA-binding proteins, which are derived from at least 221 distinct rice genes. Many of the identified proteins catalogued in RiceRBP had not previously been annotated or predicted to bind RNA. RiceRBP provides tools to facilitate the analysis of the identified RNA-binding proteins, including information about predicted protein domains, phylogenetic relationships, and expression patterns of the identified genes. Importantly, RiceRBP also contains tools to search and analyze predicted RNA-binding protein orthologs in other plant species. We anticipate that the data and analysis tools provided by RiceRBP should facilitate the study of plant RNA-binding proteins. RiceRBP is available at http://www.bioinformatics2.wsu.edu/RiceRBP. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  19. Fly-DPI: database of protein interactomes for D. melanogaster in the approach of systems biology

    Directory of Open Access Journals (Sweden)

    Lin Chieh-Hua

    2006-12-01

    Full Text Available Abstract Background Proteins control and mediate many biological activities of cells by interacting with other protein partners. This work presents a statistical model to predict protein interaction networks of Drosophila melanogaster based on insight into domain interactions. Results Three high-throughput yeast two-hybrid experiments and the collection in FlyBase were used as our starting datasets. The co-occurrences of domains in these interactive events are converted into a probability score of domain-domain interaction. These scores are used to infer putative interactions among all available open reading frames (ORFs) of fruit fly. Additionally, the likelihood function is used to estimate all potential protein-protein interactions. All parameters are successfully iterated and the MLE is obtained for each pair of domains. Additionally, the maximized likelihood reaches its convergence criterion and the probability remains stable. The hybrid model achieves a high specificity with a loss of sensitivity, suggesting that the model may possess major features of protein-protein interactions. Several putative interactions predicted by the proposed hybrid model are supported by the literature, while experimental data with a low probability score indicate an uncertain reliability and require further proof of interaction. Fly-DPI is the online database used to present this work. It is an integrated proteomics tool with comprehensive protein annotation information from major databases as well as an effective means of predicting protein-protein interactions. As a novel search strategy, the ping-pong search is a naïve path map between two chosen proteins based on pre-computed shortest paths. Adopting effective filtering strategies will facilitate researchers in depicting the bird's eye view of the network of interest. Fly-DPI can be accessed at http://flydpi.nhri.org.tw. Conclusion This work provides two reference systems, statistical and biological, to evaluate
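
    The abstract above does not give the formula used to turn domain-domain scores into a protein-level prediction. One common choice in domain-based models, assumed here purely for illustration, is a noisy-OR combination: two proteins interact unless every one of their domain pairs fails to interact. The function name, the dictionary layout and the example scores below are hypothetical.

```python
from itertools import product


def protein_interaction_probability(domains_a, domains_b, domain_scores):
    """Noisy-OR combination of domain-domain interaction probabilities.

    domain_scores maps a sorted (domain, domain) tuple to a probability;
    pairs missing from the mapping are treated as probability 0 (assumption).
    """
    p_no_pair_interacts = 1.0
    for da, db in product(domains_a, domains_b):
        key = tuple(sorted((da, db)))
        p_no_pair_interacts *= 1.0 - domain_scores.get(key, 0.0)
    return 1.0 - p_no_pair_interacts


# Hypothetical domain-domain scores and domain compositions.
scores = {("A", "B"): 0.5, ("A", "C"): 0.2}
p = protein_interaction_probability(["A"], ["B", "C"], scores)
```

    With these made-up scores the result is 1 - (1 - 0.5)(1 - 0.2) = 0.6, so a protein pair with several weakly supported domain pairs can still receive a moderate score.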

  20. SIMAP--a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters.

    Science.gov (United States)

    Rattei, Thomas; Tischler, Patrick; Götz, Stefan; Jehl, Marc-André; Hoser, Jonathan; Arnold, Roland; Conesa, Ana; Mewes, Hans-Werner

    2010-01-01

    The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP helps biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).
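
    The quadratic growth mentioned in the abstract above follows from counting unordered sequence pairs: n sequences give n(n - 1)/2 pairs. The short sketch below works through that arithmetic; it assumes a plain all-against-all comparison with no prefiltering, which is a simplification and not a description of how SIMAP is actually computed.

```python
def pairwise_comparisons(n_sequences):
    """Number of unordered pairs in an all-against-all comparison."""
    return n_sequences * (n_sequences - 1) // 2


# Using 23 million as a lower bound for the non-redundant sequences cited above.
pairs_2009 = pairwise_comparisons(23_000_000)

# Doubling the number of sequences roughly quadruples the work.
growth_factor = pairwise_comparisons(46_000_000) / pairs_2009
```

    For 23 million sequences this gives about 2.6 x 10^14 pairs, which is why pre-calculating and storing the matrix is attractive.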

  1. ZifBASE: a database of zinc finger proteins and associated resources

    Directory of Open Access Journals (Sweden)

    Punetha Ankita

    2009-09-01

    databases like UniprotKB, PDB, ModBase and Protein Model Portal and PubMed for making it more informative. Conclusion A database is established to maintain the information of the sequence features, including the class, framework, number of fingers, residues, position, recognition site and physicochemical properties (molecular weight, isoelectric point) of both natural and engineered zinc finger proteins, and the dissociation constant of a few. ZifBASE can provide a more effective and efficient way of accessing the zinc finger protein sequences and their target binding sites with the links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface http://web.iitd.ac.in/~sundar/zifbase.

  2. DroID: the Drosophila Interactions Database, a comprehensive resource for annotated gene and protein interactions

    Directory of Open Access Journals (Sweden)

    Liu Guozhen

    2008-10-01

    Full Text Available Abstract Background Charting the interactions among genes and among their protein products is essential for understanding biological systems. A flood of interaction data is emerging from high throughput technologies, computational approaches, and literature mining methods. Quick and efficient access to this data has become a critical issue for biologists. Several excellent multi-organism databases for gene and protein interactions are available, yet most of these have understandable difficulty maintaining comprehensive information for any one organism. No single database, for example, includes all available interactions, integrated gene expression data, and comprehensive and searchable gene information for the important model organism, Drosophila melanogaster. Description DroID, the Drosophila Interactions Database, is a comprehensive interactions database designed specifically for Drosophila. DroID houses published physical protein interactions, genetic interactions, and computationally predicted interactions, including interologs based on data for other model organisms and humans. All interactions are annotated with original experimental data and source information. DroID can be searched and filtered based on interaction information or a comprehensive set of gene attributes from Flybase. DroID also contains gene expression and expression correlation data that can be searched and used to filter datasets, for example, to focus a study on sub-networks of co-expressed genes. To address the inherent noise in interaction data, DroID employs an updatable confidence scoring system that assigns a score to each physical interaction based on the likelihood that it represents a biologically significant link. Conclusion DroID is the most comprehensive interactions database available for Drosophila. To facilitate downstream analyses, interactions are annotated with original experimental information, gene expression data, and confidence scores. All data in

  3. The master two-dimensional gel database of human AMA cell proteins: towards linking protein and genome sequence and mapping information (update 1991)

    DEFF Research Database (Denmark)

    Celis, J E; Leffers, H; Rasmussen, H H

    1991-01-01

    autoantigens" and "cDNAs". For convenience we have included an alphabetical list of all known proteins recorded in this database. In the long run, the main goal of this database is to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture...

  4. Medicago PhosphoProtein Database: a repository for Medicago truncatula phosphoprotein data

    OpenAIRE

    Rose, Christopher M.; Muthusubramanian eVenkateshwaran; Grimsrud, Paul A.; Westphall, Michael S.; Sussman, Michael R.; Coon, Joshua J.; Jean-Michel eAné

    2012-01-01

    The ability of legume crops to fix atmospheric nitrogen via a symbiotic association with soil rhizobia makes them an essential component of many agricultural systems. Initiation of this symbiosis requires protein phosphorylation-mediated signaling in response to rhizobial signals named Nod factors. Medicago truncatula (Medicago) is the model system for studying legume biology, making the study of its phosphoproteome essential. Here, we describe the Medicago Phosphoprotein Database (http://pho...

  5. Negative example selection for protein function prediction: the NoGO database.

    Directory of Open Access Journals (Sweden)

    Noah Youngs

    2014-06-01

    Full Text Available Negative examples - genes that are known not to carry out a given protein function - are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text-classification, which are the current state of the art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).

  6. Negative example selection for protein function prediction: the NoGO database.

    Science.gov (United States)

    Youngs, Noah; Penfold-Brown, Duncan; Bonneau, Richard; Shasha, Dennis

    2014-06-01

    Negative examples - genes that are known not to carry out a given protein function - are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text-classification, which are the current state of the art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).
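
    The idea of selecting negatives from empirical conditional probabilities can be sketched loosely as follows. This is not the authors' algorithm: the scoring rule (the mean of P(target | term) over a gene's existing annotation terms), the function names and the toy annotations are all assumptions made for illustration. Genes whose existing annotations rarely co-occur with the target function are ranked as the more plausible negative examples.

```python
from collections import defaultdict


def conditional_probabilities(annotations, target):
    """Estimate P(target | term) from co-annotation counts across genes."""
    with_term = defaultdict(int)
    with_both = defaultdict(int)
    for terms in annotations.values():
        for term in terms:
            if term == target:
                continue
            with_term[term] += 1
            if target in terms:
                with_both[term] += 1
    return {t: with_both[t] / with_term[t] for t in with_term}


def rank_negative_candidates(annotations, target):
    """Order genes lacking `target` from most to least plausible negative.

    Unannotated genes are skipped: with no evidence at all, absence of the
    target annotation says little (Open World Assumption).
    """
    probs = conditional_probabilities(annotations, target)
    scores = {}
    for gene, terms in annotations.items():
        if target in terms or not terms:
            continue
        scores[gene] = sum(probs[t] for t in terms) / len(terms)
    return sorted(scores, key=lambda g: (scores[g], g))


# Hypothetical gene-to-term annotations.
toy_annotations = {
    "g1": {"kinase", "ATP binding"},
    "g2": {"ATP binding"},
    "g3": {"DNA binding"},
    "g4": {"DNA binding"},
    "g5": set(),
}
ranking = rank_negative_candidates(toy_annotations, "kinase")
```

    Here g3 and g4 rank ahead of g2, because "ATP binding" co-occurs with "kinase" in the toy data while "DNA binding" never does.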

  7. Deep Multimodal Pain Recognition: A Database and Comparison of Spatio-Temporal Visual Modalities

    DEFF Research Database (Denmark)

    Haque, Mohammad Ahsanul; Nasrollahi, Kamal; Moeslund, Thomas B.

    2018-01-01

    on shallow learning scenarios. However, employing deep learning techniques for spatio-temporal analysis considering Depth (D) and Thermal (T) along with RGB has high potential in this area. In this paper, we present the first state-of-the-art publicly available database, 'Multimodal Intensity Pain (MInt...

  8. CancerProView: a graphical image database of cancer-related genes and proteins.

    Science.gov (United States)

    Mitsuyama, Susumu; Shimizu, Nobuyoshi

    2012-08-01

    We have developed a graphical image database CancerProView (URL: http://cancerproview.dmb.med.keio.ac.jp/php/cpv.html) to assist the search for alterations of the motifs/domains in the cancer-related proteins that are caused by mutations in the corresponding genes. For the CancerProView, we have collected various kinds of data on 180 cancer-related proteins in terms of the motifs/domains, genomic structures of corresponding genes, and 109 charts of the protein interaction pathways. Moreover, we have collected the relevant data on 1041 reference genes including 197 non-cancer disease-associated genes, and the nucleotide sequences for 2011 full-length cDNA's and the alternatively spliced transcript variants. Thus, the CancerProView database system would provide valuable information to facilitate basic cancer research as well as for designing new molecular diagnosis and drug discovery for cancers. The CancerProView database can be operated via Internet with any Web browser, and the system is freely available to interested users without ID and password. Copyright © 2012 Elsevier Inc. All rights reserved.

  9. YPED: a web-accessible database system for protein expression analysis.

    Science.gov (United States)

    Shifman, Mark A; Li, Yuli; Colangelo, Christopher M; Stone, Kathryn L; Wu, Terence L; Cheung, Kei-Hoi; Miller, Perry L; Williams, Kenneth R

    2007-10-01

    We have developed an integrated web-accessible software system called the Yale Protein Expression Database (YPED) to address the need for storage, retrieval, and integrated analysis of large amounts of data from high throughput proteomic technologies. YPED is an open source system which integrates gel analysis results with protein identifications from DIGE experiments. The system associates the DIGE gel spots and image, analyzed with DeCyder, with mass spectrometric protein identifications from selected gel spots. Following in gel trypsin digestion, proteins in spots of interest are analyzed using MALDI-TOF/TOF on an AB 4700 or, more recently, on an AB 4800 with protein identifications performed by Mascot in conjunction with the AB GPS Explorer system. In addition to DIGE, YPED currently handles protein identifications from MudPIT, iTRAQ, and ICAT experiments. Sample descriptions are compatible with the evolving MIAPE standards. Tandem MS/MS results from MudPIT, and ICAT analyses are validated with the Trans-Proteomic Pipeline and then stored in the database for viewing and linking to the identified proteins. Researchers can view, subset, and download their data through a secure Web interface that includes a table containing proteins identified, a sample summary, the sample description, and a clickable gel image for DIGE samples. Tools are available to facilitate sample comparison and the viewing of phosphoproteins. A summary report with PANTHER Classification System annotations is also available to aid in biological interpretation of the results. The source code is open-source and is available from http://yped.med.yale.edu/yped_dist.

  10. DSFL database: A hub of target proteins of Leishmania sp. to combat leishmaniasis

    Directory of Open Access Journals (Sweden)

    Ameer Khusro

    2017-07-01

    Full Text Available Leishmaniasis is a vector-borne chronic infectious tropical dermal disease caused by protozoan parasites of the genus Leishmania that causes high mortality globally. Among three different clinical forms of leishmaniasis, visceral leishmaniasis (VL or kala-azar) is a systemic public health disease with high morbidity and mortality in developing countries, caused by Leishmania donovani, Leishmania infantum or Leishmania chagasi. Unfortunately, there is no vaccine available to date for the treatment of leishmaniasis. On the other hand, the therapeutics approved to treat this fatal disease are expensive, toxic, and associated with serious side effects. Furthermore, the emergence of drug-resistant Leishmania parasites in most endemic countries due to the incessant utilization of existing drugs is a major concern at present. Drug Search for Leishmaniasis (DSFL) is a unique database that involves 50 crystallized target proteins of varied Leishmania sp. in order to develop new drugs in future by interacting several antiparasitic compounds or molecules with specific proteins through computational tools. The structure of target proteins from different Leishmania sp. is available in this database. In this review, we spotlighted not only the current global status of leishmaniasis in brief but also detailed information about target proteins of various Leishmania sp. available in DSFL. DSFL has created a new expectation for mankind in order to combat leishmaniasis by targeting parasitic proteins and commence a new era to get rid of drug-resistant parasites. The database should prove to be a worthwhile project for further development of new, non-toxic, and cost-effective antileishmanial drugs as targeted therapies using in vitro/in vivo assays.

  11. Merging in-silico and in vitro salivary protein complex partners using the STRING database: A tutorial.

    Science.gov (United States)

    Crosara, Karla Tonelli Bicalho; Moffa, Eduardo Buozi; Xiao, Yizhi; Siqueira, Walter Luiz

    2018-01-16

    Protein-protein interaction is a common physiological mechanism for protection and actions of proteins in an organism. The identification and characterization of protein-protein interactions in different organisms is necessary to better understand their physiology and to determine their efficacy. In a previous in vitro study using mass spectrometry, we identified 43 proteins that interact with histatin 1. Six previously documented interactors were confirmed and 37 novel partners were identified. In this tutorial, we aimed to demonstrate the usefulness of the STRING database for studying protein-protein interactions. We used an in-silico approach along with the STRING database (http://string-db.org/) and successfully performed a fast simulation of a novel constructed histatin 1 protein-protein network, including both the previously known and the predicted interactors, along with our newly identified interactors. Our study highlights the advantages and importance of applying bioinformatics tools to merge in-silico tactics with experimental in vitro findings for rapid advancement of our knowledge about protein-protein interactions. Our findings also indicate that bioinformatics tools such as the STRING protein network database can help predict potential interactions between proteins and thus serve as a guide for future steps in our exploration of the Human Interactome. Our study highlights the usefulness of the STRING protein database for studying protein-protein interactions. The STRING database can collect and integrate data about known and predicted protein-protein associations from many organisms, including both direct (physical) and indirect (functional) interactions, in an easy-to-use interface. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

    Full Text Available Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS v.2) database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR) protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG) to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores

  13. UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

    Science.gov (United States)

    Robasky, Kimberly; Bulyk, Martha L

    2011-01-01

    The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
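
    To make the terms in the abstract above concrete, the sketch below enumerates every DNA "word" of length k and builds a simple position frequency matrix from a handful of k-mers. It does not reproduce how UniPROBE derives its PWMs from PBM data; the function names and the example k-mers are assumptions for illustration only.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate all possible nucleotide words of length k (4**k of them)."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


def position_frequency_matrix(kmers, alphabet="ACGT"):
    """Per-position base frequencies for a list of equal-length k-mers."""
    k = len(kmers[0])
    matrix = []
    for pos in range(k):
        counts = {base: 0 for base in alphabet}
        for kmer in kmers:
            counts[kmer[pos]] += 1
        matrix.append({base: counts[base] / len(kmers) for base in alphabet})
    return matrix


# Hypothetical high-scoring 3-mers for some DNA-binding protein.
preferred = ["ACG", "ACT"]
pfm = position_frequency_matrix(preferred)
n_words = len(all_kmers(8))
```

    The number of words grows as 4^k, so k = 8 already gives 65,536 sequence variants for which a binding preference has to be reported.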

  14. PDB-UF: database of predicted enzymatic functions for unannotated protein structures from structural genomics

    Directory of Open Access Journals (Sweden)

    Rychlewski Leszek

    2006-02-01

    Full Text Available Abstract Background The number of protein structures from structural genomics centers dramatically increases in the Protein Data Bank (PDB). Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. Results Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties using structure-function relationship. The assignments were conducted for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes: 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. Conclusion We suggest that the structure-based prediction of an EC number should be conducted using a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. Availability http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF

  15. sc-PDB: a database for identifying variations and multiplicity of 'druggable' binding sites in proteins.

    Science.gov (United States)

    Meslamani, Jamel; Rognan, Didier; Kellenberger, Esther

    2011-05-01

    The sc-PDB database is an annotated archive of druggable binding sites extracted from the Protein Data Bank. It contains all-atoms coordinates for 8166 protein-ligand complexes, chosen for their geometrical and physico-chemical properties. The sc-PDB provides a functional annotation for proteins, a chemical description for ligands and the detailed intermolecular interactions for complexes. The sc-PDB now includes a hierarchical classification of all the binding sites within a functional class. The sc-PDB entries were first clustered according to the protein name indifferent of the species. For each cluster, we identified dissimilar sites (e.g. catalytic and allosteric sites of an enzyme). SCOPE AND APPLICATIONS: The classification of sc-PDB targets by binding site diversity was intended to facilitate chemogenomics approaches to drug design. In ligand-based approaches, it avoids comparing ligands that do not share the same binding site. In structure-based approaches, it permits to quantitatively evaluate the diversity of the binding site definition (variations in size, sequence and/or structure). The sc-PDB database is freely available at: http://bioinfo-pharma.u-strasbg.fr/scPDB.

  16. SeqX: a tool to detect, analyze and visualize residue co-locations in protein and nucleic acid structures

    Directory of Open Access Journals (Sweden)

    Fördös Gergely

    2005-07-01

    Full Text Available Abstract Background The interacting residues of protein and nucleic acid sequences are close to each other: they are co-located. Structure databases (like Protein Data Bank, PDB and Nucleic Acid Data Bank, NDB) contain all information about these co-locations; however, it is not an easy task to penetrate this complex information. We developed a Java tool, called SeqX, for this purpose. Results The SeqX tool is useful to detect, analyze and visualize residue co-locations in protein and nucleic acid structures. The user (a) selects a structure from PDB; (b) chooses an atom that is commonly present in every residue of the nucleic acid and/or protein structure(s); (c) defines a distance from these atoms (3–15 Å). The SeqX tool detects every residue that is located within the defined distances from the defined "backbone" atom(s), provides a DotPlot-like visualization (Residues Contact Map), and calculates the frequency of every possible residue pair (Residue Contact Table) in the observed structure. It is possible to exclude +/- 1 to 10 neighbor residues in the same polymeric chain from detection, which greatly improves the specificity of detections (up to 60% when tested on dsDNA). Results obtained on protein structures showed highly significant correlations with results obtained from literature (p ...). Conclusion The tool is simple and easy to use and provides a quick and reliable visualization and analysis of residue co-locations in protein and nucleic acid structures. Availability and requirements SeqX is available from http://janbiro.com/Downloads.html; it requires Java J2SE Runtime Environment 5.0 (available from http://www.sun.com), at least a 1 GHz processor and a minimum of 256 Mb RAM. Source codes are available from the authors. Additional file 1: SeqX_1.041_05601.jar.
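
    The detection procedure described in the abstract above (one reference atom per residue, a distance cutoff, exclusion of near neighbours in the same chain, then a contact map and a pair-frequency table) can be sketched as follows. This is an illustrative Python sketch, not the SeqX Java implementation; the tuple layout, the function names, the 6 Å default cutoff and the toy coordinates are assumptions.

```python
import math
from collections import Counter


def contact_map(residues, cutoff=6.0, exclude_neighbors=1):
    """Return index pairs of residues whose reference atoms lie within `cutoff`.

    residues: list of (chain_id, sequence_index, residue_name, (x, y, z)),
    where the coordinates belong to one chosen reference atom per residue.
    Pairs in the same chain that are at most `exclude_neighbors` positions
    apart are skipped, since they are close simply because they are bonded.
    """
    contacts = []
    for i in range(len(residues)):
        chain_i, idx_i, _, xyz_i = residues[i]
        for j in range(i + 1, len(residues)):
            chain_j, idx_j, _, xyz_j = residues[j]
            if chain_i == chain_j and abs(idx_i - idx_j) <= exclude_neighbors:
                continue
            if math.dist(xyz_i, xyz_j) <= cutoff:
                contacts.append((i, j))
    return contacts


def contact_table(residues, contacts):
    """Count how often each unordered residue-type pair occurs among contacts."""
    table = Counter()
    for i, j in contacts:
        pair = tuple(sorted((residues[i][2], residues[j][2])))
        table[pair] += 1
    return table


# Toy structure (hypothetical coordinates in angstroms).
toy = [
    ("A", 1, "ALA", (0.0, 0.0, 0.0)),
    ("A", 2, "GLY", (3.8, 0.0, 0.0)),
    ("A", 3, "LEU", (5.0, 0.0, 0.0)),
    ("B", 1, "SER", (0.0, 4.0, 0.0)),
]
pairs = contact_map(toy)
table = contact_table(toy, pairs)
```

    The double loop is quadratic in the number of residues, which is adequate for a single structure; a spatial grid or k-d tree would be the usual refinement for larger inputs.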

  17. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

    Directory of Open Access Journals (Sweden)

    Bányai László

    2008-08-01

    Full Text Available Abstract Background Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Results Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON

  18. Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database.

    Science.gov (United States)

    Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi

    2017-06-23

    The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. It accounts for approximately 80% of all predicted proteins from

  19. Integration of gel-based and gel-free proteomic data for functional analysis of proteins through Soybean Proteome Database

    KAUST Repository

    Komatsu, Setsuko

    2017-05-10

    The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max ‘Enrei’). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. Biological significance: The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. It accounts for approximately 80% of all

  20. Alga-PrAS (Algal Protein Annotation Suite): A Database of Comprehensive Annotation in Algal Proteomes.

    Science.gov (United States)

    Kurotani, Atsushi; Yamada, Yutaka; Sakurai, Tetsuya

    2017-01-01

    Algae are smaller organisms than land plants and offer clear advantages in research over terrestrial species in terms of rapid production, short generation time and varied commercial applications. Thus, studies investigating the practical development of effective algal production are important and will improve our understanding of both aquatic and terrestrial plants. In this study we estimated multiple physicochemical and secondary structural properties of protein sequences, the predicted presence of post-translational modification (PTM) sites, and subcellular localization using a total of 510,123 protein sequences from the proteomes of 31 algal and three plant species. Algal species were broadly selected from green and red algae, glaucophytes, oomycetes, diatoms and other microalgal groups. The results were deposited in the Algal Protein Annotation Suite database (Alga-PrAS; http://alga-pras.riken.jp/), which can be freely accessed online. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  1. Analysis of residue conformations in peptides in Cambridge structural database and protein-peptide structural complexes.

    Science.gov (United States)

    Raghavender, Upadhyayula Surya

    2017-03-01

    A comprehensive statistical analysis of the geometric parameters of peptide chains in a reduced dataset of protein-peptide complexes in Protein Data Bank (PDB) is presented. The angular variables describing the backbone conformations of amino acid residues in peptide chains give insight into the conformational preferences of peptide residues interacting with protein partners. Nonparametric statistical approaches are employed to evaluate the interrelationships and associations in structural variables. Grouping of residues based on their structure into chemical classes reveals characteristic trends in parameter relationships. A comparison of canonical amino acid residues in free peptide structures in Cambridge structural database (CSD) with identical residues in PDB complexes suggests that the information from both structural repositories can be integrated, enabling efficient and accurate modeling of biologically active peptides. © 2016 John Wiley & Sons A/S.

  2. PDBj Mine: design and implementation of relational database interface for Protein Data Bank Japan

    Science.gov (United States)

    Kinjo, Akira R.; Yamashita, Reiko; Nakamura, Haruki

    2010-01-01

    This article is a tutorial for PDBj Mine, a new database and its interface for Protein Data Bank Japan (PDBj). In PDBj Mine, data are loaded from files in the PDBMLplus format (an extension of PDBML, PDB's canonical XML format, enriched with annotations), which are then served to the users of PDBj via the worldwide web (WWW). We describe the basic design of the relational database (RDB) and web interfaces of PDBj Mine. The contents of PDBMLplus files are first broken into XPath entities, and these paths and data are indexed in a way that reflects the hierarchical structure of the XML files. The data for each XPath type are saved into the corresponding relational table, which is named after the XPath itself. The generation of table definitions from the PDBMLplus XML schema is fully automated. For efficient search, frequently queried terms are compiled into a brief summary table. Casual users can perform a simple keyword search or use 'Advanced Search', which can specify various conditions on the entries. More experienced users can query the database using SQL statements, which can be constructed in a uniform manner. Thus, PDBj Mine achieves a combination of the flexibility of XML documents and the robustness of the RDB. Database URL: http://www.pdbj.org/ PMID:20798081
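
    As a rough illustration of the idea of breaking an XML file into XPath entities and storing the data for each XPath type in a table named after that path, consider the sketch below. It is not the PDBj Mine loader: the XML snippet is a made-up stand-in rather than real PDBMLplus, the function name is hypothetical, and plain Python lists of dictionaries stand in for relational tables.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A made-up document; real PDBMLplus files are far larger and use namespaces.
SAMPLE = (
    '<datablock><entry id="1ABC"><title>Example</title></entry>'
    "<atom_siteCategory>"
    '<atom_site id="1"><type_symbol>N</type_symbol></atom_site>'
    '<atom_site id="2"><type_symbol>C</type_symbol></atom_site>'
    "</atom_siteCategory></datablock>"
)


def rows_by_xpath(xml_text):
    """Group element data by the element's path from the root.

    Each key plays the role of a table name; each value is that table's rows.
    An attribute literally named "text" would collide with the text column,
    a case this sketch does not handle.
    """
    root = ET.fromstring(xml_text)
    tables = defaultdict(list)

    def walk(elem, path):
        text = (elem.text or "").strip()
        if text or elem.attrib:  # pure container elements carry no row data
            tables[path].append({**elem.attrib, "text": text})
        for child in elem:
            walk(child, f"{path}/{child.tag}")

    walk(root, "/" + root.tag)
    return dict(tables)


tables = rows_by_xpath(SAMPLE)
```

    Because every row for a given path has the same origin in the document hierarchy, a query against one "table" can be written the same way for any path, which is the kind of uniformity the abstract attributes to its SQL interface.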

  3. 'The surface management system' (SuMS) database: a surface-based database to aid cortical surface reconstruction, visualization and analysis

    Science.gov (United States)

    Dickson, J.; Drury, H.; Van Essen, D. C.

    2001-01-01

    Surface reconstructions of the cerebral cortex are increasingly widely used in the analysis and visualization of cortical structure, function and connectivity. From a neuroinformatics perspective, dealing with surface-related data poses a number of challenges. These include the multiplicity of configurations in which surfaces are routinely viewed (e.g. inflated maps, spheres and flat maps), plus the diversity of experimental data that can be represented on any given surface. To address these challenges, we have developed a surface management system (SuMS) that allows automated storage and retrieval of complex surface-related datasets. SuMS provides a systematic framework for the classification, storage and retrieval of many types of surface-related data and associated volume data. Within this classification framework, it serves as a version-control system capable of handling large numbers of surface and volume datasets. With built-in database management system support, SuMS provides rapid search and retrieval capabilities across all the datasets, while also incorporating multiple security levels to regulate access. SuMS is implemented in Java and can be accessed via a Web interface (WebSuMS) or using downloaded client software. Thus, SuMS is well positioned to act as a multiplatform, multi-user 'surface request broker' for the neuroscience community.

  4. Human protein reference database and human proteinpedia as discovery resources for molecular biotechnology.

    Science.gov (United States)

    Goel, Renu; Muthusamy, Babylakshmi; Pandey, Akhilesh; Prasad, T S Keshava

    2011-05-01

    In recent years, research in molecular biotechnology has transformed from small-scale studies targeted at a single molecule or a small set of molecules into a combination of high-throughput discovery platforms and extensive validations. Such a discovery platform provided an unbiased approach which resulted in the identification of several novel genetic and protein biomarkers. The high-throughput nature of these investigations, coupled with the higher sensitivity and specificity of Next Generation technologies, provided qualitatively and quantitatively richer biological data. These developments have also revolutionized biological research and the speed of data generation. However, it is becoming difficult for individual investigators to directly benefit from these data because they are not easily accessible. Data resources became necessary to assimilate, store and disseminate information that could allow future discoveries. We have developed two resources, Human Protein Reference Database (HPRD) and Human Proteinpedia, which integrate knowledge relevant to human proteins. A number of protein features, including protein-protein interactions, post-translational modifications, subcellular localization, and tissue expression, which have been studied using different strategies, were incorporated in these databases. Human Proteinpedia also provides a portal for community participation to annotate and share proteomic data and uses HPRD as the scaffold for data processing. Proteomic investigators can even share unpublished data in Human Proteinpedia, which provides a meaningful platform for data sharing. As proteomic information reflects a direct view of cellular systems, proteomics is expected to complement other areas of biology such as genomics, transcriptomics, molecular biology, cloning, and classical genetics in understanding the relationships among multiple facets of biological systems.

  5. Effect of cleavage enzyme, search algorithm and decoy database on mass spectrometric identification of wheat gluten proteins.

    Science.gov (United States)

    Vensel, William H; Dupont, Frances M; Sloane, Stacia; Altenbach, Susan B

    2011-07-01

    While tandem mass spectrometry (MS/MS) is routinely used to identify proteins from complex mixtures, certain types of proteins present unique challenges for MS/MS analyses. The major wheat gluten proteins, gliadins and glutenins, are particularly difficult to distinguish by MS/MS. Each of these groups contains many individual proteins with similar sequences that include repetitive motifs rich in proline and glutamine. These proteins have few cleavable tryptic sites, often resulting in only one or two tryptic peptides that may not provide sufficient information for identification. Additionally, there are fewer than 14,000 complete protein sequences from wheat in the current NCBInr release. In this paper, MS/MS methods were optimized for the identification of the wheat gluten proteins. Chymotrypsin and thermolysin as well as trypsin were used to digest the proteins, and the collision energy was adjusted to improve fragmentation of chymotryptic and thermolytic peptides. Specialized databases were constructed that included protein sequences derived from contigs from several assemblies of wheat expressed sequence tags (ESTs), including contigs assembled from ESTs of the cultivar under study. Two different search algorithms were used to interrogate the database, and the results were analyzed and displayed using a commercially available software package (Scaffold). We examined the effect of protein database content and size on the false discovery rate. We found that as database size increased above 30,000 sequences there was a decrease in the number of proteins identified. Also, the type of decoy database influenced the number of proteins identified. Using three enzymes, two search algorithms and a specialized database allowed us to greatly increase the number of detected peptides and distinguish proteins within each gluten protein group. Published by Elsevier Ltd.
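
    The abstract does not say how the false discovery rate was computed. One commonly used target-decoy estimate divides the number of accepted decoy hits by the number of accepted target hits at a given score threshold; the Python sketch below assumes that estimator, and the function name, scores and threshold values are invented for illustration.

```python
def estimate_fdr(matches, threshold):
    """Estimate the false discovery rate as accepted decoys / accepted targets.

    `matches` is a list of (score, is_decoy) pairs, a hypothetical layout.
    A match is accepted when its score is at or above `threshold`.
    """
    accepted = [m for m in matches if m[0] >= threshold]
    decoys = sum(1 for _, is_decoy in accepted if is_decoy)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


# Made-up search results: higher score means a more confident match.
matches = [
    (95, False),
    (90, False),
    (80, True),
    (75, False),
    (60, True),
    (55, False),
]
fdr_strict = estimate_fdr(matches, threshold=85)
fdr_medium = estimate_fdr(matches, threshold=70)
fdr_loose = estimate_fdr(matches, threshold=50)
```

    Under this estimator, both the size of the searched database and the way the decoy sequences are generated change how many decoy hits pass the threshold, which is consistent with the sensitivity to database size and decoy type that the abstract reports.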

  6. Visualizing virulence proteins and their translocation into the host during agrobacterium-mediated transformation

    NARCIS (Netherlands)

    Sakalis, Philippe Alexandre

    2013-01-01

    The project focuses on visualizing Agrobacterium Mediated Transformation (AMT) of host cells by real-time microscopy. With new visualization techniques, the function of several proteins that were recently discovered in our lab to play a role during AMT is studied.

  7. The Methods of Cognitive Visualization for the Astronomical Databases Analyzing Tools Development

    Science.gov (United States)

    Vitkovskiy, V.; Gorohov, V.

    2008-08-01

    There are two kinds of computer graphics: the illustrative and the cognitive. Appropriate cognitive pictures not only make the sense of complex and difficult scientific concepts evident and clear, but also promote, and not so rarely, the birth of new knowledge. On the basis of the cognitive graphics concept, we worked out a software system for visualization and analysis. It allows the researcher to train and sharpen intuition, raises interest in and motivation for creative scientific cognition, and at the same time supports a dialogue with the problem itself.

  8. [Establishment of protein fingerprint database of Salmonella paratyphi A using SELDI-TOF-MS].

    Science.gov (United States)

    Li, Xiao-qing; Huang, Wen-fang; Liu, Hua; Yang, Yong-chang; Xiao, Dai-wen; Yan, Hui; Luo, Chun-li

    2012-06-01

    To establish a protein fingerprint database of Salmonella paratyphi A by surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS). Thirty-six clinical bacterial isolates and 96 control bacterial isolates were collected and identified using 16S rDNA sequencing. Bacterial proteins were detected by SELDI-TOF-MS, and all protein fingerprints were analyzed by ProteinChip and Biomarker Wizard software. The analysis results were used to set up a classification tree model by means of BioMarker Patterns software. At the same time, the data were tested by a blinded validation. In the range of M(r) 3 000-20 000, we obtained 104 protein peaks, of which 90 were of statistical significance (P<0.01). A protein peak with mass-to-charge ratio (m/z) 10 061.7 was chosen to establish the classification tree model of Salmonella paratyphi A, and the sensitivity and specificity of Salmonella paratyphi A diagnosis was 100% as shown by the blinded validation. The classification tree model of Salmonella paratyphi A can not only be established using SELDI-TOF-MS technology, but also be used for the rapid identification of Salmonella paratyphi A.

  9. Technical report on implementation of reactor internal 3D modeling and visual database system

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Yeun Seung; Eom, Young Sam; Lee, Suk Hee; Ryu, Seung Hyun [Korea Atomic Energy Research Institute, Taejon (Korea, Republic of)

    1996-06-01

    This report describes a prototype of a reactor internal 3D modeling and visual database (VDB) system for NSSS design quality improvement. To improve NSSS design quality, several integrated computer-aided engineering systems from advanced nuclear nations, such as Mitsubishi's NUWINGS (Japan), AECL's CANDID (Canada) and Duke Power's PASCE (USA), were studied. On the basis of these studies, a strategy for the NSSS design improvement system was derived and the detailed work scope was implemented as follows: 3D modeling of the reactor internals was implemented using a parametric solid modeler, a prototype system for design document computerization and database was suggested, and a walk-through simulation integrated with 3D modeling and VDB was accomplished. The major effects of the NSSS design quality improvement system using 3D modeling and VDB are plant design optimization by simulation, improved reliability through a single design database system, and engineering cost reduction through improved productivity and efficiency. For applying the VDB to the full scope of NSSS system design, 3D models of the reactor coolant system, nuclear fuel assembly and fuel rod are attached as an appendix. 2 tabs., 31 figs., 7 refs. (Author)

  10. LoopX: A Graphical User Interface-Based Database for Comprehensive Analysis and Comparative Evaluation of Loops from Protein Structures.

    Science.gov (United States)

    Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna

    2017-10-01

    Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitating a thorough analysis and effectual search options for extracting and comparing loops for sequence and structural compatibility, we developed LoopX, a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database, equipped with a graphical user interface, is empowered with diverse query tools and search algorithms, with various rendering options to visualize the sequence- and structural-level information along with hydrogen bonding patterns and backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features, (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops, have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.

  11. Change Detection and Land Use / Land Cover Database Updating Using Image Segmentation, GIS Analysis and Visual Interpretation

    Science.gov (United States)

    Mas, J.-F.; González, R.

    2015-08-01

    This article presents a hybrid method that combines image segmentation, GIS analysis, and visual interpretation in order to detect discrepancies between an existing land use/cover map and satellite images, and assess land use/cover changes. It was applied to the elaboration of a multidate land use/cover database of the State of Michoacán, Mexico using SPOT and Landsat imagery. The method was first applied to improve the resolution of an existing 1:250,000 land use/cover map produced through the visual interpretation of 2007 SPOT images. A segmentation of the 2007 SPOT images was carried out to create spectrally homogeneous objects with a minimum area of two hectares. Through an overlay operation with the outdated map, each segment receives the "majority" category from the map. Furthermore, spectral indices of the SPOT image were calculated for each band and each segment; therefore, each segment was characterized from the images (spectral indices) and the map (class label). In order to detect uncertain areas which present discrepancy between spectral response and class label, a multivariate trimming, which consists in truncating a distribution from its least likely values, was applied. The segments that behave like outliers were detected and labeled as "uncertain" and a probable alternative category was determined by means of a digital classification using a decision tree classification algorithm. Then, the segments were visually inspected in the SPOT image and high resolution imagery to assign a final category. The same procedure was applied to update the map to 2014 using Landsat imagery. As a final step, an accuracy assessment was carried out using verification sites selected from a stratified random sampling and visually interpreted using high resolution imagery and ground truth.

  12. Neutron cross-sections database for amino acids and proteins analysis

    Energy Technology Data Exchange (ETDEWEB)

    Voi, Dante L.; Ferreira, Francisco de O.; Nunes, Rogerio Chaffin, E-mail: dante@ien.gov.br, E-mail: fferreira@ien.gov.br, E-mail: Chaffin@ien.gov.br [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Rocha, Helio F. da, E-mail: hrocha@gbl.com.br [Universidade Federal do Rio de Janeiro (IPPMG/UFRJ), Rio de Janeiro, RJ (Brazil). Instituto de Pediatria

    2015-07-01

    Biological materials may be studied using neutrons as an unconventional tool of analysis. Dynamic and structural data can be obtained for amino acids, proteins and other cellular components by neutron cross-section determinations, especially for applications in nuclear purity and conformation analysis. The instrument used for this is the crystal spectrometer of the Instituto de Engenharia Nuclear (IEN-CNEN-RJ), the only one in Latin America that uses neutrons for this type of analysis; it is installed in one of the irradiation channels of the Argonauta reactor. The experimentally obtained values are compared with values calculated from literature data, with a rigorous analysis of the chemical composition, conformation and molecular structure of the materials. A neutron cross-section database was constructed to assist in determining the molecular dynamics, structure and formulae of biological materials. The database contains neutron cross-section values of all amino acids, chemical elements, molecular groups and auxiliary radicals, as well as values of constants and parameters necessary for the analysis. An unprecedented analytical procedure was developed using the neutron cross-section parceling and grouping method for data manipulation. This database is the result of measurements on twenty amino acids that were provided by different manufacturers and are administered orally to hospital patients for nutritional purposes. A small data file of compounds with different molecular groups including carbon, nitrogen, sulfur and oxygen, all linked to hydrogen atoms, was also constructed. A review of the global and national scene in the acquisition of neutron cross-section data, the formation of libraries and the application of neutrons for analyzing biological materials is presented. This database has further application in protein analysis, and the neutron cross-section of insulin was estimated. (author)

  13. Exploring the Ligand-Protein Networks in Traditional Chinese Medicine: Current Databases, Methods, and Applications

    Directory of Open Access Journals (Sweden)

    Mingzhu Zhao

    2013-01-01

    Full Text Available Traditional Chinese medicine (TCM), which has thousands of years of clinical application in China and other Asian countries, is the pioneer of the “multicomponent-multitarget” and network pharmacology. Although there is no doubt of its efficacy, it is difficult to elucidate a convincing underlying mechanism of TCM due to its complex composition and unclear pharmacology. The use of ligand-protein networks has been gaining significant value in the history of drug discovery, while its application in TCM is still in its early stage. This paper first surveys TCM databases for virtual screening, which have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined based on the amount of network information available, on the sides of both ligand bioactivity and protein structures. Furthermore, applications of successful in silico target identification attempts are discussed in detail along with experiments in exploring the ligand-protein networks of TCM. Finally, it is concluded that ligand-protein networks can prospectively be used not only to predict protein targets of a small molecule, but also to explore the mode of action of TCM.

  14. FRET-FLIM for Visualizing and Quantifying Protein Interactions in Live Plant Cells.

    Science.gov (United States)

    Rios, Alejandra Freire; Radoeva, Tatyana; De Rybel, Bert; Weijers, Dolf; Borst, Jan Willem

    2017-01-01

    Proteins are the workhorses that control most biological processes in living cells. Although proteins can accomplish their functions independently, the vast majority of functions require proteins to interact with other proteins or biomacromolecules. Protein interactions can be investigated through biochemical assays such as co-immunoprecipitation (co-IP) or Western blot analysis, but such assays lack spatial information. Here we describe a well-developed imaging method, Förster resonance energy transfer (FRET) analyzed by fluorescence lifetime imaging microscopy (FLIM), that can be used to visualize protein interactions with both spatial and temporal resolution in live cells. We demonstrate its use in plant developmental research by visualizing in vivo dimerization of AUXIN RESPONSE FACTOR (ARF) proteins, mediating auxin responses.
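
    In FRET-FLIM, an interaction shows up as a shortening of the donor fluorescence lifetime when an acceptor is close by. A standard way to turn the two measured lifetimes into a FRET efficiency is E = 1 - τ_DA / τ_D, where τ_D is the donor-only lifetime and τ_DA the donor lifetime in the presence of the acceptor. The abstract does not spell out its analysis steps, so the Python sketch below only illustrates that relation; the function name and the lifetime values are made up.

```python
def fret_efficiency(tau_donor_alone, tau_donor_with_acceptor):
    """Return the FRET efficiency E = 1 - tau_DA / tau_D.

    Both lifetimes must be in the same unit (for example nanoseconds).
    """
    if tau_donor_alone <= 0:
        raise ValueError("donor-only lifetime must be positive")
    return 1.0 - tau_donor_with_acceptor / tau_donor_alone


# Made-up lifetimes in nanoseconds: the donor lifetime drops from 2.5 to 2.0.
efficiency = fret_efficiency(2.5, 2.0)
no_interaction = fret_efficiency(2.5, 2.5)
```

    An unchanged lifetime gives an efficiency of zero, while a larger drop in lifetime gives a higher efficiency; computing this per pixel is what gives the method its spatial resolution.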

  15. The evolution of a Web resource: The Galactosemia Proteins Database 2.0.

    Science.gov (United States)

    d'Acierno, Antonio; Scafuri, Bernardina; Facchiano, Angelo; Marabotti, Anna

    2017-09-29

    Galactosemia Proteins Database 2.0 is a Web-accessible resource collecting information about the structural and functional effects of the known variations associated with the three different enzymes of the Leloir pathway encoded by the genes GALT, GALE, and GALK1 and involved in the different forms of the genetic disease globally called "galactosemia." It represents an evolution of two available online resources we previously developed, with new data deriving from new structures, new analysis tools, and new interfaces and filters in order to improve the quality and quantity of information available for different categories of users. We propose this new resource both as a landmark for the entire world community of galactosemia and as a model for the development of similar tools for other proteins that are subject to variations and involved in human diseases. © 2017 Wiley Periodicals, Inc.

  16. Matching spatial with ontological brain regions using Java tools for visualization, database access, and integrated data analysis.

    Science.gov (United States)

    Bezgin, Gleb; Reid, Andrew T; Schubert, Dirk; Kötter, Rolf

    2009-01-01

    Brain atlases are widely used in experimental neuroscience as tools for locating and targeting specific brain structures. Delineated structures in a given atlas, however, are often difficult to interpret and to interface with database systems that supply additional information using hierarchically organized vocabularies (ontologies). Here we discuss the concept of volume-to-ontology mapping in the context of macroscopical brain structures. We present Java tools with which we have implemented this concept for retrieval of mapping and connectivity data on the macaque brain from the CoCoMac database in connection with an electronic version of "The Rhesus Monkey Brain in Stereotaxic Coordinates" authored by George Paxinos and colleagues. The software, including our manually drawn monkey brain template, can be downloaded freely under the GNU General Public License. It adds value to the printed atlas and has a wider (neuro-)informatics application since it can read appropriately annotated data from delineated sections of other species and organs, and turn them into 3D registered stacks. The tools provide additional features, including visualization and analysis of connectivity data, volume and centre-of-mass estimates, and graphical manipulation of entire structures, which are potentially useful for a range of research and teaching applications.

  17. Design and development of a linked open data-based health information representation and visualization system: potentials and preliminary evaluation.

    Science.gov (United States)

    Tilahun, Binyam; Kauppinen, Tomi; Keßler, Carsten; Fritz, Fleur

    2014-10-25

    Healthcare organizations around the world are challenged by pressures to reduce cost, improve coordination and outcome, and provide more with less. This requires effective planning and evidence-based practice by generating important information from available data. Thus, flexible and user-friendly ways to represent, query, and visualize health data become increasingly important. International organizations such as the World Health Organization (WHO) regularly publish vital data on priority health topics that can be utilized for public health policy and health service development. However, the data in most portals is displayed in either Excel or PDF formats, which makes information discovery and reuse difficult. Linked Open Data (LOD), a new Semantic Web set of best practices and standards for publishing and linking heterogeneous data, can be applied to the representation and management of public level health data to alleviate such challenges. However, the technologies behind building LOD systems and their effectiveness for health data are yet to be assessed. The objective of this study is to evaluate whether Linked Data technologies are potential options for health information representation, visualization, and retrieval systems development and to identify the available tools and methodologies to build Linked Data-based health information systems. We used the Resource Description Framework (RDF) for data representation, Fuseki triple store for data storage, and Sgvizler for information visualization. Additionally, we integrated a SPARQL query interface for interacting with the data. We primarily used the WHO health observatory dataset to test the system. All the data were represented using RDF and interlinked with other related datasets on the Web of Data using Silk, a link discovery framework for the Web of Data. A preliminary usability assessment was conducted following the System Usability Scale (SUS) method. We developed an LOD-based health information representation, querying
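    The abstract does not report how the SUS responses were processed, so the following is only a sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name is illustrative.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Compute a System Usability Scale score (0-100) from ten 1-5 ratings."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd items are positively worded: higher rating is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower rating is better.
            total += 5 - rating
    return total * 2.5
```

    A study score is usually reported as the mean of the per-participant scores.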

  18. Using Molecular Visualization to Explore Protein Structure and Function and Enhance Student Facility with Computational Tools

    Science.gov (United States)

    Terrell, Cassidy R.; Listenberger, Laura L.

    2017-01-01

    Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…

  19. SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthrospira) platensis C1.

    Science.gov (United States)

    Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee

    2015-07-29

    Spirulina (Arthrospira) platensis is the only cyanobacterium that, in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web

  20. An MRspec database query and visualization engine with applications as a clinical diagnostic and research tool.

    Science.gov (United States)

    Miscevic, Filip; Foong, Justin; Schmitt, Benjamin; Blaser, Susan; Brudno, Michael; Schulze, Andreas

    2016-12-01

    Proton magnetic resonance spectroscopy (MRspec), one of the very few techniques for in vivo assessment of neuro-metabolic profiles, is often complicated by lack of standard population norms and paucity of computational tools. 7035 scans and clinical information from 4430 pediatric patients were collected from 2008 to 2014. Scans were conducted using a 1.5T (n=3664) or 3T scanner (n=3371), and with either a long (144ms, n=5559) or short echo time (35ms, n=1476). 3055 of these scans were localized in the basal ganglia (BG), 1211 in parieto-occipital white matter (WM). 34 metabolites were quantified using LCModel. A web application using MySQL, Python and Flask was developed to facilitate the exploration of the data set. Piloting the application already revealed numerous insights. (1) N-acetylaspartate (NAA) increased throughout all ages. During early infancy, total choline was highly varied and myo-inositol demonstrated a downward trend. (2) Total creatine (tCr) and creatine increased throughout childhood and adolescence, though phosphocreatine (PCr) remained constant beyond 200 days. (3) tCr was higher in BG than WM. (4) No obvious gender-related differences were observed. (5) Field strength affects quantification using LCModel for some metabolites, most prominently for tCr and total NAA. (6) Outlier analysis identified patients treated with vigabatrin through elevated γ-aminobutyrate, and patients with Klippel-Feil syndrome, Leigh disease and L2-hydroxyglutaric aciduria through low choline in BG. We have established the largest MRspec database and developed a robust and flexible computational tool for facilitating the exploration of vast metabolite datasets that proved its value for discovering neurochemical trends for clinical diagnosis, treatment monitoring, and research. Open access will lead to its widespread use, improving the diagnostic yield and contributing to better understanding of metabolic processes and conditions in the brain. Copyright © 2016

  1. Using the SUBcellular database for Arabidopsis proteins to localize the Deg protease family

    Science.gov (United States)

    Tanz, Sandra K.; Castleden, Ian; Hooper, Cornelia M.; Small, Ian; Millar, A. Harvey

    2014-01-01

    Sub-functionalization during the expansion of gene families in eukaryotes has occurred in part through specific subcellular localization of different family members. To better understand this process in plants, compiled records of large-scale proteomic and fluorescent protein localization datasets can be explored and bioinformatic predictions for protein localization can be used to predict the gaps in experimental data. This process can be followed by targeted experiments to test predictions. The SUBA3 database is a free web-service at http://suba.plantenergy.uwa.edu.au that helps users to explore reported experimental data and predictions concerning proteins encoded by gene families and to define the experiments required to locate these homologous sets of proteins. Here we show how SUBA3 can be used to explore the subcellular location of the Deg protease family of ATP-independent serine endopeptidases (Deg1–Deg16). Combined data integration and new experiments refined location information for Deg1 and Deg9, confirmed Deg2, Deg5, and Deg8 in plastids and Deg15 in peroxisomes and provided substantial experimental evidence for mitochondrially localized Deg proteases. Two of these, Deg3 and Deg10, additionally localized to the plastid, revealing novel dual-targeted Deg proteases in the plastid and the mitochondrion. SUBA3 is continually updated to ensure that researchers can use the latest published data when planning the experimental steps remaining to localize gene family functions. PMID:25161662

  2. Cotton QTLdb: a cotton QTL database for QTL analysis, visualization, and comparison between Gossypium hirsutum and G. hirsutum × G. barbadense populations.

    Science.gov (United States)

    Said, Joseph I; Knapka, Joseph A; Song, Mingzhou; Zhang, Jinfa

    2015-08-01

    A specialized database currently containing more than 2200 QTL is established, which allows graphic presentation, visualization and submission of QTL. In cotton, quantitative trait loci (QTL) studies are focused on intraspecific Gossypium hirsutum and interspecific G. hirsutum × G. barbadense populations. These two populations are commercially important for the textile industry and are evaluated for fiber quality, yield, seed quality, resistance, physiological, and morphological trait QTL. With meta-analysis data based on the vast amount of QTL studies in cotton, it will be beneficial to organize the data into a functional database for the cotton community. Here we provide a tool for cotton researchers to visualize previously identified QTL and submit their own QTL to the Cotton QTLdb database. The database provides the user with the option of selecting various QTL trait types from either the G. hirsutum or G. hirsutum × G. barbadense populations. Based on the user's QTL trait selection, graphical representations of chromosomes of the population selected are displayed in publication ready images. The database also provides users with trait information on QTL, LOD scores, and explained phenotypic variances for all QTL selected. The Cotton QTLdb database provides cotton geneticists and breeders with statistical data on cotton QTL previously identified and provides a visualization tool to view QTL positions on chromosomes. Currently the database (Release 1) contains 2274 QTLs, and succeeding QTL studies will be updated regularly by the curators and members of the cotton community that contribute their data to keep the database current. The database is accessible from http://www.cottonqtldb.org.

  3. RECOORD: a recalculated coordinate database of 500+ proteins from the PDB using restraints from the BioMagResBank

    NARCIS (Netherlands)

    Nederveen, Aart J.; Doreleijers, Jurgen F.; Vranken, Wim; Miller, Zachary; Spronk, Chris A. E. M.; Nabuurs, Sander B.; Güntert, Peter; Livny, Miron; Markley, John L.; Nilges, Michael; Ulrich, Eldon L.; Kaptein, Robert; Bonvin, Alexandre M. J. J.

    2005-01-01

    State-of-the-art methods based on CNS and CYANA were used to recalculate the nuclear magnetic resonance (NMR) solution structures of 500+ proteins for which coordinates and NMR restraints are available from the Protein Data Bank. Curated restraints were obtained from the BioMagResBank FRED database.

  4. Creation of a federated database of blood proteins: a powerful new tool for finding and characterizing biomarkers in serum

    Science.gov (United States)

    2014-01-01

    Protein biomarkers offer major benefits for diagnosis and monitoring of disease processes. Recent advances in protein mass spectrometry make it feasible to use this very sensitive technology to detect and quantify proteins in blood. To explore the potential of blood biomarkers, we conducted a thorough review to evaluate the reliability of data in the literature and to determine the spectrum of proteins reported to exist in blood with a goal of creating a Federated Database of Blood Proteins (FDBP). A unique feature of our approach is the use of a SQL database for all of the peptide data; the power of the SQL database combined with standard informatic algorithms such as BLAST and the statistical analysis system (SAS) allowed the rapid annotation and analysis of the database without the need to create special programs to manage the data. Our mathematical analysis and review shows that in addition to the usual secreted proteins found in blood, there are many reports of intracellular proteins and good agreement on transcription factors, DNA remodelling factors in addition to cellular receptors and their signal transduction enzymes. Overall, we have catalogued about 12,130 proteins identified by at least one unique peptide, and of these 3858 have 3 or more peptide correlations. The FDBP with annotations should facilitate testing blood for specific disease biomarkers. PMID:24476026
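    The abstract's use of a SQL database to count proteins by peptide support can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the table name `peptide_hits`, its columns, and the reading of "peptide correlations" as distinct peptide sequences per protein are all assumptions, not details of the FDBP schema.

```python
import sqlite3
from typing import Iterable, Tuple


def count_proteins_by_peptide_support(
    rows: Iterable[Tuple[str, str]], min_peptides: int = 3
) -> Tuple[int, int]:
    """Return (proteins with >= 1 peptide, proteins with >= min_peptides distinct peptides).

    `rows` are (protein_id, peptide_seq) pairs loaded into a hypothetical table.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE peptide_hits (protein_id TEXT, peptide_seq TEXT)")
        conn.executemany("INSERT INTO peptide_hits VALUES (?, ?)", rows)
        # Every protein present in the table has at least one peptide.
        total = conn.execute(
            "SELECT COUNT(DISTINCT protein_id) FROM peptide_hits"
        ).fetchone()[0]
        # Proteins supported by at least `min_peptides` distinct peptide sequences.
        supported = conn.execute(
            "SELECT COUNT(*) FROM ("
            "  SELECT protein_id FROM peptide_hits"
            "  GROUP BY protein_id"
            "  HAVING COUNT(DISTINCT peptide_seq) >= ?"
            ")",
            (min_peptides,),
        ).fetchone()[0]
    finally:
        conn.close()
    return total, supported
```

    Grouping in SQL keeps the per-protein counting inside the database, which is the kind of analysis the abstract says needed no special-purpose programs.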

  5. Dynamic Proteomics: a database for dynamics and localizations of endogenous fluorescently-tagged proteins in living human cells.

    Science.gov (United States)

    Frenkel-Morgenstern, Milana; Cohen, Ariel A; Geva-Zatorsky, Naama; Eden, Eran; Prilusky, Jaime; Issaeva, Irina; Sigal, Alex; Cohen-Saidon, Cellina; Liron, Yuvalal; Cohen, Lydia; Danon, Tamar; Perzov, Natalie; Alon, Uri

    2010-01-01

    Recent advances allow tracking the levels and locations of a thousand proteins in individual living human cells over time using a library of annotated reporter cell clones (LARC). This library was created by Cohen et al. to study the proteome dynamics of a human lung carcinoma cell-line treated with an anti-cancer drug. Here, we report the Dynamic Proteomics database for the proteins studied by Cohen et al. Each cell-line clone in LARC has a protein tagged with yellow fluorescent protein, expressed from its endogenous chromosomal location, under its natural regulation. The Dynamic Proteomics interface facilitates searches for genes of interest, downloads of protein fluorescent movies and alignments of dynamics following drug addition. Each protein in the database is displayed with its annotation, cDNA sequence, fluorescent images and movies obtained by the time-lapse microscopy. The protein dynamics in the database represents a quantitative trace of the protein fluorescence levels in nucleus and cytoplasm produced by image analysis of movies over time. Furthermore, a sequence analysis provides a search and comparison of up to 50 input DNA sequences with all cDNAs in the library. The raw movies may be useful as a benchmark for developing image analysis tools for individual-cell dynamic-proteomics. The database is available at http://www.dynamicproteomics.net/.

  6. Expressions of visual pigments and synaptic proteins in neonatal ...

    Indian Academy of Sciences (India)

    2016-09-28

    Sep 28, 2016 ... decreased expressions of opsins and synaptic proteins, compared to those seen in 12L:12D and 18L:6D conditions. Also, there were ... used in houses and work places where we are continually http://www.ias.ac.in/jbiosci ..... zation and morphological and functional well-being of cells lying in the INL.

  7. Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies

    Science.gov (United States)

    Yang, Tsun-Po; Beazley, Claude; Montgomery, Stephen B.; Dimas, Antigone S.; Gutierrez-Arcelus, Maria; Stranger, Barbara E.; Deloukas, Panos; Dermitzakis, Emmanouil T.

    2010-01-01

    Summary: Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols. Availability: http://www.sanger.ac.uk/resources/software/genevar Contact: emmanouil.dermitzakis@unige.ch PMID:20702402

  8. Exploring the ligand-protein networks in traditional Chinese medicine: current databases, methods and applications.

    Science.gov (United States)

    Zhao, Mingzhu; Wei, Dongqing

    2015-01-01

    While the concept of "single component-single target" in drug discovery seems to have come to an end, "Multi-component-multi-target" is considered to be another promising way out in this field. The Traditional Chinese Medicine (TCM), which has thousands of years' clinical application in China and other Asian countries, is the pioneer of the "Multi-component-multi-target" and network pharmacology. Hundreds of different components in a TCM prescription can cure the diseases or relieve the patients by modulating the network of potential therapeutic targets. Although there is no doubt of the efficacy, it is difficult to elucidate a convincing underlying mechanism of TCM due to its complex composition and unclear pharmacology. Without thorough investigation of its potential targets and side effects, TCM is not able to generate large-scale medicinal benefits, especially in the days when scientific reductionism and quantification are dominant. The use of ligand-protein networks has been gaining significant value in the history of drug discovery while its application in TCM is still in its early stage. This article first surveys TCM databases for virtual screening that have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined based on the amount of network information available, on the sides of both ligand bioactivity and protein structure. Furthermore, applications of successful in silico target identification attempts are discussed in detail along with experiments in exploring the ligand-protein networks of TCM. Finally, it will be concluded that the prospective application of ligand-protein networks can be used not only to predict protein targets of a small molecule, but also to explore the mode of action of TCM.

  9. Visualization of differential gene expression by improved cyan fluorescent protein and yellow fluorescent protein production in Bacillus subtilis

    NARCIS (Netherlands)

    Veening, JW; Smits, WK; Hamoen, LW; Jongbloed, JDH; Kuipers, OP; Smits, Wiep Klaas

    2004-01-01

    The distinguishable cyan and yellow fluorescent proteins (CFP and YFP) enable the simultaneous in vivo visualization of different promoter activities. Here, we report new cloning vectors for the construction of cfp and yfp fusions in Bacillus subtilis. By extending the N-terminal portions of

  10. Global catalogue of microorganisms (gcm): a comprehensive database and information retrieval, analysis, and visualization system for microbial resources.

    Science.gov (United States)

    Wu, Linhuan; Sun, Qinglan; Sugawara, Hideaki; Yang, Song; Zhou, Yuguang; McCluskey, Kevin; Vasilenko, Alexander; Suzuki, Ken-Ichiro; Ohkuma, Moriya; Lee, Yeonhee; Robert, Vincent; Ingsriswang, Supawadee; Guissart, François; Philippe, Desmeth; Ma, Juncai

    2013-12-30

    Throughout the long history of industrial and academic research, many microbes have been isolated, characterized and preserved (whenever possible) in culture collections. With the steady accumulation in observational data of biodiversity as well as microbial sequencing data, bio-resource centers have to function as data and information repositories to serve academia, industry, and regulators on behalf of and for the general public. Hence, the World Data Centre for Microorganisms (WDCM) started to take its responsibility for constructing an effective information environment that would promote and sustain microbial research data activities, and bridge the gaps currently present within and outside the microbiology communities. Strain catalogue information was collected from collections by online submission. We developed tools for automatic extraction of strain numbers and species names from various sources, including GenBank, PubMed, and SwissProt. These new tools connect strain catalogue information with the corresponding nucleotide and protein sequences, as well as to genome sequence and references citing a particular strain. All information has been processed and compiled in order to create a comprehensive database of microbial resources, and was named Global Catalogue of Microorganisms (GCM). The current version of GCM contains information of over 273,933 strains, which includes 43,436 bacterial, fungal and archaea species from 52 collections in 25 countries and regions. A number of online analysis and statistical tools have been integrated, together with advanced search functions, which should greatly facilitate the exploration of the content of GCM. A comprehensive dynamic database of microbial resources has been created, which unveils the resources preserved in culture collections especially for those whose informatics infrastructures are still under development, which should foster cumulative research, facilitating the activities of microbiologists world

  11. Global catalogue of microorganisms (gcm): a comprehensive database and information retrieval, analysis, and visualization system for microbial resources

    Science.gov (United States)

    2013-01-01

    Background Throughout the long history of industrial and academic research, many microbes have been isolated, characterized and preserved (whenever possible) in culture collections. With the steady accumulation in observational data of biodiversity as well as microbial sequencing data, bio-resource centers have to function as data and information repositories to serve academia, industry, and regulators on behalf of and for the general public. Hence, the World Data Centre for Microorganisms (WDCM) started to take its responsibility for constructing an effective information environment that would promote and sustain microbial research data activities, and bridge the gaps currently present within and outside the microbiology communities. Description Strain catalogue information was collected from collections by online submission. We developed tools for automatic extraction of strain numbers and species names from various sources, including GenBank, PubMed, and SwissProt. These new tools connect strain catalogue information with the corresponding nucleotide and protein sequences, as well as to genome sequence and references citing a particular strain. All information has been processed and compiled in order to create a comprehensive database of microbial resources, and was named Global Catalogue of Microorganisms (GCM). The current version of GCM contains information of over 273,933 strains, which includes 43,436 bacterial, fungal and archaea species from 52 collections in 25 countries and regions. A number of online analysis and statistical tools have been integrated, together with advanced search functions, which should greatly facilitate the exploration of the content of GCM. Conclusion A comprehensive dynamic database of microbial resources has been created, which unveils the resources preserved in culture collections especially for those whose informatics infrastructures are still under development, which should foster cumulative research, facilitating the

  12. Integrated visual analysis of protein structures, sequences, and feature data.

    Science.gov (United States)

    Stolte, Christian; Sabir, Kenneth S; Heinrich, Julian; Hammang, Christopher J; Schafferhans, Andrea; O'Donoghue, Seán I

    2015-01-01

    To understand the molecular mechanisms that give rise to a protein's function, biologists often need to (i) find and access all related atomic-resolution 3D structures, and (ii) map sequence-based features (e.g., domains, single-nucleotide polymorphisms, post-translational modifications) onto these structures. To streamline these processes we recently developed Aquaria, a resource offering unprecedented access to protein structure information based on an all-against-all comparison of SwissProt and PDB sequences. In this work, we provide a requirements analysis for several frequently occurring tasks in molecular biology and describe how design choices in Aquaria meet these requirements. Finally, we show how the interface can be used to explore features of a protein and gain biologically meaningful insights in two case studies conducted by domain experts. The user interface design of Aquaria enables biologists to gain unprecedented access to molecular structures and simplifies the generation of insight. The tasks involved in mapping sequence features onto structures can be conducted more easily and quickly using Aquaria.

  13. Visualizing Protein Interactions and Dynamics: Evolving a Visual Language for Molecular Animation

    Science.gov (United States)

    Jenkinson, Jodie; McGill, Gael

    2012-01-01

    Undergraduate biology education provides students with a number of learning challenges. Subject areas that are particularly difficult to understand include protein conformational change and stability, diffusion and random molecular motion, and molecular crowding. In this study, we examined the relative effectiveness of three-dimensional…

  14. Knitting Relational Documentary Networks: The Database Meta-Documentary Filming Revolution as a paradigm of bringing interactive audio-visual archives alive

    NARCIS (Netherlands)

    Wiehl, Anna

    2016-01-01

    One phenomenon in the emerging field of digital documentary are experiments with rhizomatic interfaces and database-logics to bring audio-visual archives 'alive'. A paradigm hereof is Filming Revolution (2015), an interactive platform which gathers and interlinks films of the uprisings in

  15. Visualization of recombinant DNA and protein complexes using atomic force microscopy.

    Science.gov (United States)

    Murphy, Patrick J M; Shannon, Morgan; Goertz, John

    2011-07-18

    Atomic force microscopy (AFM) allows for the visualizing of individual proteins, DNA molecules, protein-protein complexes, and DNA-protein complexes. On the end of the microscope's cantilever is a nano-scale probe, which traverses image areas ranging from nanometers to micrometers, measuring the elevation of macromolecules resting on the substrate surface at any given point. Electrostatic forces cause proteins, lipids, and nucleic acids to loosely attach to the substrate in random orientations and permit imaging. The generated data resemble a topographical map, where the macromolecules resolve as three-dimensional particles of discrete sizes (Figure 1). Tapping mode AFM involves the repeated oscillation of the cantilever, which permits imaging of relatively soft biomaterials such as DNA and proteins. One of the notable benefits of AFM over other nanoscale microscopy techniques is its relative adaptability to visualize individual proteins and macromolecular complexes in aqueous buffers, including near-physiologic buffered conditions, in real-time, and without staining or coating the sample to be imaged. The method presented here describes the imaging of DNA and an immunoadsorbed transcription factor (i.e. the glucocorticoid receptor, GR) in buffered solution (Figure 2). Immunoadsorbed proteins and protein complexes can be separated from the immunoadsorbing antibody-bead pellet by competition with the antibody epitope and then imaged (Figure 2A). This allows for biochemical manipulation of the biomolecules of interest prior to imaging. Once purified, DNA and proteins can be mixed and the resultant interacting complex can be imaged as well. Binding of DNA to mica requires a divalent cation, such as Ni(2+) or Mg(2+), which can be added to sample buffers yet maintain protein activity. Using a similar approach, AFM has been utilized to visualize individual enzymes, including RNA polymerase and a repair enzyme, bound to individual DNA strands. These experiments provide

  16. High-throughput clone screening followed by protein expression cross-check: A visual assay platform.

    Science.gov (United States)

    Bose, Partha Pratim; Kumar, Prakash

    2017-01-01

    In high-throughput biotechnology and structural biology, molecular cloning is an essential prerequisite for attaining high yields of recombinant protein. However, a rapid, cost-effective, easy clone screening protocol is still required to identify colonies with desired insert along with a cross check method to certify the expression of the desired protein as the end product. We report an easy, fast, sensitive and cheap visual clone screening and protein expression cross check protocol employing gold nanoparticle based plasmonic detection phenomenon. This is a non-gel, non-PCR based visual detection technique, which can be used as simultaneous high throughput clone screening followed by the determination of expression of desired protein. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Distribution of calcium-binding proteins in the chick visual system

    Directory of Open Access Journals (Sweden)

    C.P. Pfeiffer

    1997-11-01

    The calcium-binding proteins calbindin (CB), calretinin (CR), and parvalbumin (PV) have been extensively studied over the last decade since they appear to be important as buffers of intracellular calcium. In the present study we investigated the distribution of these proteins in the chick visual system by means of conventional immunocytochemistry. The results indicated that CB, CR, and PV are widely distributed in retinorecipient areas of the chick brain. In some regions, all three calcium-binding proteins were present at different intensities and often in different neurons such as in the dorsolateral thalamic complex. In other areas, such as the nucleus geniculatus lateralis ventralis, only CB and CR were detected, whereas PV was absent. These results show that these three calcium-binding proteins are differentially distributed in the visual system of the chick, with varying degrees of co-localization

  18. A comprehensive assessment of long intrinsic protein disorder from the DisProt database.

    Science.gov (United States)

    Necci, Marco; Piovesan, Damiano; Dosztányi, Zsuzsanna; Tompa, Peter; Tosatto, Silvio C E

    2018-02-01

    Intrinsic disorder (ID), i.e. the lack of a unique folded conformation at physiological conditions, is a common feature for many proteins, which requires specialized biochemical experiments that are not high-throughput. Missing X-ray residues from the PDB have been widely used as a proxy for ID when developing computational methods. This may lead to a systematic bias, where predictors deviate from biologically relevant ID. Large benchmarking sets on experimentally validated ID are scarce. Recently, the DisProt database has been renewed and expanded to include manually curated ID annotations for several hundred new proteins. This provides a large benchmark set which has not yet been used for training ID predictors. Here, we describe the first systematic benchmarking of ID predictors on the new DisProt dataset. In contrast to previous assessments based on missing X-ray data, this dataset contains mostly long ID regions and a significant amount of fully ID proteins. The benchmarking shows that ID predictors work quite well on the new dataset, especially for long ID segments. However, a large fraction of ID still goes virtually undetected and the ranking of methods is different than for PDB data. In particular, many predictors appear to confound ID and regions outside X-ray structures. This suggests that the ID prediction methods capture different flavors of disorder and can benefit from highly accurate curated examples. The raw data used for the evaluation are available from URL: http://www.disprot.org/assessment/. Contact: silvio.tosatto@unipd.it. Supplementary data are available at Bioinformatics online.
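    Benchmarking a disorder predictor against curated annotations comes down to comparing per-residue labels. The sketch below assumes both the reference and the prediction are strings of '1' (disordered) and '0' (ordered) of equal length, and reports balanced accuracy and the Matthews correlation coefficient. These are common choices for this kind of comparison; the abstract does not say which metrics the assessment used, and the function name is illustrative.

```python
import math
from typing import Dict


def evaluate_disorder_prediction(reference: str, predicted: str) -> Dict[str, float]:
    """Compare per-residue disorder labels ('1' = disordered, '0' = ordered)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and prediction must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == "1" and pred == "1":
            tp += 1
        elif ref == "0" and pred == "1":
            fp += 1
        elif ref == "0" and pred == "0":
            tn += 1
        else:
            fn += 1
    # Sensitivity and specificity fall back to 0.0 when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": mcc,
    }
```

    Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which matters when a dataset mixes fully disordered proteins with mostly ordered ones.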

  19. Effects of Fluoxetine and Visual Experience on Glutamatergic and GABAergic Synaptic Proteins in Adult Rat Visual Cortex

    Science.gov (United States)

    Beshara, Simon; Beston, Brett R.; Pinto, Joshua G. A.

    2015-01-01

    Fluoxetine has emerged as a novel treatment for persistent amblyopia because in adult animals it reinstates critical period-like ocular dominance plasticity and promotes recovery of visual acuity. Translation of these results from animal models to the clinic, however, has been challenging because of the lack of understanding of how this selective serotonin reuptake inhibitor affects glutamatergic and GABAergic synaptic mechanisms that are essential for experience-dependent plasticity. An appealing hypothesis is that fluoxetine recreates a critical period (CP)-like state by shifting synaptic mechanisms to be more juvenile. To test this we studied the effect of fluoxetine treatment in adult rats, alone or in combination with visual deprivation [monocular deprivation (MD)], on a set of highly conserved presynaptic and postsynaptic proteins (synapsin, synaptophysin, VGLUT1, VGAT, PSD-95, gephyrin, GluN1, GluA2, GluN2B, GluN2A, GABAAα1, GABAAα3). We did not find evidence that fluoxetine shifted the protein amounts or balances to a CP-like state. Instead, it drove the balances in favor of the more mature subunits (GluN2A, GABAAα1). In addition, when fluoxetine was paired with MD it created a neuroprotective-like environment by normalizing the glutamatergic gain found in adult MDs. Together, our results suggest that fluoxetine treatment creates a novel synaptic environment dominated by GluN2A- and GABAAα1-dependent plasticity. PMID:26730408

  20. Highly efficient visual detection of trace copper(II) and protein by the quantum photoelectric effect.

    Science.gov (United States)

    Wang, Peng; Lei, Jianping; Su, Mengqi; Liu, Yueting; Hao, Qing; Ju, Huangxian

    2013-09-17

    This work presented a photocurrent response mechanism of quantum dots (QDs) under illumination with the concept of a quantum photoelectric effect. Upon irradiation, the photoelectron could directly escape from QDs. By using nitro blue tetrazolium (NBT) to capture the photoelectron, a new visual system was proposed due to the formation of an insoluble reduction product, purple formazan, which could be used to visualize the quantum photoelectric effect. The interaction of copper(II) with QDs could form trapping sites to interfere with the quantum confinement and thus blocked the escape of photoelectron, leading to a "signal off" visual method for sensitive copper(II) detection. Meanwhile, by using QDs as a signal tag to label antibody, a "signal on" visual method was also proposed for immunoassay of corresponding protein. With meso-2,3-dimercaptosuccinic-capped CdTe QDs and carcino-embryonic antigen as models, the proposed visual detection methods showed high sensitivity, low detection limit, and wide detectable concentration ranges. The visualization of quantum photoelectric effect could be simply extended for the detection of other targets. This work opens a new visual detection way and provides a highly efficient tool for bioanalysis.

  1. An Interactive Geospatial Database and Visualization Approach to Early Warning Systems and Monitoring of Active Volcanoes: GEOWARN

    Science.gov (United States)

    Gogu, R. C.; Schwandner, F. M.; Hurni, L.; Dietrich, V. J.

    2002-12-01

    Large parts of southern and central Europe and the Pacific rim are situated in tectonically, seismically and volcanologically extremely active zones. With the growth of population and tourism, vulnerability and risk towards natural hazards have expanded over large areas. Socio-economic aspects, land use, tourist and industrial planning as well as environmental protection increasingly require natural hazard assessment. The availability of powerful and reliable satellite, geophysical and geochemical information and warning systems is therefore increasingly vital. Besides, once such systems have proven to be effective, they can be applied for similar purposes in other European areas and worldwide. Technologies today have proven that early warning of volcanic activity can be achieved by monitoring measurable changes in geophysical and geochemical parameters. Correlation between different monitored data sets, which would improve any prediction, is very scarce or missing. Visualisation of all spatial information and integration into an "intelligent cartographic concept" is of paramount interest in order to develop 2-, 3- and 4-dimensional models to approach the risk and emergency assessment as well as environmental and socio-economic planning. In the framework of the GEOWARN project, a database prototype for an Early Warning System (EWS) and monitoring of volcanic activity in case of hydrothermal-explosive and volcanic reactivation has been designed. The platform-independent, web-based, JAVA-programmed, interactive multidisciplinary multiparameter visualization software being developed at ETH allows expansion and utilization to other volcanoes, world-wide databases of volcanic unrest, or other types of natural hazard assessment. Within the project consortium, scientific data have been acquired on two pilot sites: Campi Flegrei (Italy) and Nisyros (Greece), including 2D and 3D Topography and Bathymetry, Elevation (DEM) and Landscape models (DLM) derived from conventional

  2. O-GLYCBASE: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Nielsen, Jens O.

    1996-01-01

    O-GLYCBASE is a comprehensive database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the SWISS-PROT and PIR databases as well as directly from recently published reports. Nineteen percent of the entries extracted from the databases...

  3. The human keratinocyte two-dimensional gel protein database (update 1995): mapping components of signal transduction pathways

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Gromov, P

    1995-01-01

    chemoluminescence (ECL) detection. Identified proteins are listed both in alphabetical order and with increasing SSP number, together with their M(r), pI, cellular localization and credit to the investigator(s) that aided in the identification. Ultimately, the aim of the comprehensive database is to gather...

  4. Completion of HLA protein sequences by automated homology-based nearest-neighbor extrapolation of HLA database sequences

    NARCIS (Netherlands)

    Geneugelijk, K; Niemann, M; de Hoop, T; Spierings, E

    The IMGT/HLA database contains every publicly available HLA sequence. However, most of these HLA protein sequences are restricted to the alpha-1/alpha-2 domain for HLA class-I and alpha-1/beta-1 domain for HLA class-II. Nevertheless, also polymorphism outside these domains may play a role in

  5. Identifying Gel-Separated Proteins Using In-Gel Digestion, Mass Spectrometry, and Database Searching: Consider the Chemistry

    Science.gov (United States)

    Albright, Jessica C.; Dassenko, David J.; Mohamed, Essa A.; Beussman, Douglas J.

    2009-01-01

    Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry is an important bioanalytical technique in drug discovery, proteomics, and research at the biology-chemistry interface. This is an especially powerful tool when combined with gel separation of proteins and database mining using the mass spectral data. Currently, few hands-on…

  6. Completion of HLA protein sequences by automated homology-based nearest-neighbor extrapolation of HLA database sequences

    NARCIS (Netherlands)

    Geneugelijk, K; Niemann, M; de Hoop, T; Spierings, E

    2016-01-01

    The IMGT/HLA database contains every publicly available HLA sequence. However, most of these HLA protein sequences are restricted to the alpha-1/alpha-2 domain for HLA class-I and alpha-1/beta-1 domain for HLA class-II. Nevertheless, also polymorphism outside these domains may play a role in

  7. The Candida Genome Database (CGD): incorporation of Assembly 22, systematic identifiers and visualization of high throughput sequencing data.

    Science.gov (United States)

    Skrzypek, Marek S; Binkley, Jonathan; Binkley, Gail; Miyasato, Stuart R; Simison, Matt; Sherlock, Gavin

    2017-01-04

    The Candida Genome Database (CGD, http://www.candidagenome.org/) is a freely available online resource that provides gene, protein and sequence information for multiple Candida species, along with web-based tools for accessing, analyzing and exploring these data. The mission of CGD is to facilitate and accelerate research into Candida pathogenesis and biology, by curating the scientific literature in real time, and connecting literature-derived annotations to the latest version of the genomic sequence and its annotations. Here, we report the incorporation into CGD of Assembly 22, the first chromosome-level, phased diploid assembly of the C. albicans genome, coupled with improvements that we have made to the assembly using additional available sequence data. We also report the creation of systematic identifiers for C. albicans genes and sequence features using a system similar to that adopted by the yeast community over two decades ago. Finally, we describe the incorporation of JBrowse into CGD, which allows online browsing of mapped high throughput sequencing data, and its implementation for several RNA-Seq data sets, as well as the whole genome sequencing data that was used in the construction of Assembly 22. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. A visual detection of protein content based on titration of moving reaction boundary electrophoresis.

    Science.gov (United States)

    Wang, Hou-Yu; Guo, Cheng-Ye; Guo, Chen-Gang; Fan, Liu-Yin; Zhang, Lei; Cao, Cheng-Xi

    2013-04-24

    A visual electrophoretic titration method was first developed from the concept of moving reaction boundary (MRB) for protein content analysis. In the developed method, when the voltage was applied, the hydroxide ions in the cathodic vessel moved towards the anode, and neutralized the carboxyl groups of protein immobilized via highly cross-linked polyacrylamide gel (PAG), generating an MRB between the alkali and the immobilized protein. The boundary moving velocity (V(MRB)) was a function of protein content, and an acid-base indicator was used to denote the boundary displacement. As a proof of concept, standard model proteins and biological samples were chosen for the experiments to study the feasibility of the developed method. The experiments revealed good linear calibration functions between V(MRB) and protein content (correlation coefficients R>0.98). The experiments further demonstrated the following merits of the developed method: (1) weak influence of non-protein nitrogen additives (e.g., melamine) adulterated in protein samples, (2) good agreement with the classic Kjeldahl method (R=0.9945), (3) fast measuring speed in total protein analysis of large samples from the same source, and (4) low limit of detection (0.02-0.15 mg mL(-1) for protein content), good precision (R.S.D. of intra-day less than 1.7% and inter-day less than 2.7%), and high recoveries (105-107%). Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.

  9. Administrative Information System Design Services (Costumer Service) on Two Wheels Motor Vehicles Automotive Services Using Visual Basic 6.0 and Database Mysql

    OpenAIRE

    Muhammad Oetomo Rizky Irwanto; Aqwam Rosadi Kardian, SKom, MM

    2010-01-01

    With rapid development, people face more and more problems. The need to solve a problem quickly, accurately and efficiently is therefore pressing, and science and information technology are continually developed to meet it. In this study, the authors build a computerized system to process service administration for motorcycles in automotive services using the Visual Basic 6.0 programming language and a MySQL database. The programs in this application are comprised of th...

  10. Single Molecule Visualization of Protein-DNA Complexes: Watching Machines at Work

    Science.gov (United States)

    Kowalczykowski, Stephen

    2013-03-01

    We can now watch individual proteins acting on single molecules of DNA. Such imaging provides unprecedented interrogation of fundamental biophysical processes. Visualization is achieved through the application of two complementary procedures. In one, single DNA molecules are attached to a polystyrene bead and are then captured by an optical trap. The DNA, a worm-like coil, is extended either by the force of solution flow in a micro-fabricated channel, or by capturing the opposite DNA end in a second optical trap. In the second procedure, DNA is attached by one end to a glass surface. The coiled DNA is elongated either by continuous solution flow or by subsequently tethering the opposite end to the surface. Protein action is visualized by fluorescent reporters: fluorescent dyes that bind double-stranded DNA (dsDNA), fluorescent biosensors for single-stranded DNA (ssDNA), or fluorescently-tagged proteins. Individual molecules are imaged using either epifluorescence microscopy or total internal reflection fluorescence (TIRF) microscopy. Using these approaches, we imaged the search for DNA sequence homology conducted by the RecA-ssDNA filament. The manner by which RecA protein finds a single homologous sequence in the genome had remained undefined for almost 30 years. Single-molecule imaging revealed that the search occurs through a mechanism termed "intersegmental contact sampling," in which the randomly coiled structure of DNA is essential for reiterative sampling of DNA sequence identity: an example of parallel processing. In addition, the assembly of RecA filaments on single molecules of single-stranded DNA was visualized. Filament assembly requires nucleation of a protein dimer on DNA, and subsequent growth occurs via monomer addition. Furthermore, we discovered a class of proteins that catalyzed both nucleation and growth of filaments, revealing how the cell controls assembly of this protein-DNA complex.

  11. Global Identification of Protein Post-translational Modifications in a Single-Pass Database Search.

    Science.gov (United States)

    Shortreed, Michael R; Wenger, Craig D; Frey, Brian L; Sheynkman, Gloria M; Scalf, Mark; Keller, Mark P; Attie, Alan D; Smith, Lloyd M

    2015-11-06

    Bottom-up proteomics database search algorithms used for peptide identification cannot comprehensively identify post-translational modifications (PTMs) in a single-pass because of high false discovery rates (FDRs). A new approach to database searching enables global PTM (G-PTM) identification by exclusively looking for curated PTMs, thereby avoiding the FDR penalty experienced during conventional variable modification searches. We identified over 2200 unique, high-confidence modified peptides comprising 26 different PTM types in a single-pass database search.

  12. O-GLYCBASE version 3.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Nilsson, Jette

    1998-01-01

    O-GLYCBASE is a revised database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the literature, and from the sequence databases. Entries include information about species, sequence, glycosylation sites and glycan type, and are fully cross-referenced. Compared to version 2.0 the number of entries has increased by 20%. Sequence logos displaying the acceptor specificity patterns for the GalNAc, mannose and GlcNAc transferases are shown. The O-GLYCBASE database is available through the WWW at http://www.cbs.dtu.dk/databases/OGLYCBASE/...

  13. Visualization of the protein corona: towards a biomolecular understanding of nanoparticle-cell-interactions.

    Science.gov (United States)

    Kokkinopoulou, Maria; Simon, Johanna; Landfester, Katharina; Mailänder, Volker; Lieberwirth, Ingo

    2017-06-29

    The use of nanocarriers in biology and medicine is complicated by the current need to understand how nanoparticles interact in complex biological surroundings. When nanocarriers come into contact with serum, proteins immediately adsorb onto their surface, forming a protein corona which defines their biological identity. Although the composition of the protein corona has been widely determined by proteomics, its morphology still remains unclear. In this study we show for the first time the morphology of the protein corona using transmission electron microscopy. We are able to demonstrate that the protein corona is not, as commonly supposed, a dense, layered shell coating the nanoparticle, but an undefined, loose network of proteins. Additionally, we are now able to visualize and discriminate between the soft and hard corona using centrifugation-based separation techniques together with proteomic characterization. The protein composition of the ∼15 nm hard corona strongly depends on the surface chemistry of the respective nanomaterial, thus further affecting cellular uptake and intracellular trafficking. Large diameter protein corona resulting from pre-incubation with soft corona or Apo-A1 inhibits cellular uptake, confirming the stealth-effect mechanism. In summary, knowledge of protein corona formation, composition and morphology is essential to design therapeutically effective nanoparticle systems.

  14. SCANPS: a web server for iterative protein sequence database searching by dynamic programing, with display in a hierarchical SCOP browser.

    Science.gov (United States)

    Walsh, Thomas P; Webber, Caleb; Searle, Stephen; Sturrock, Shane S; Barton, Geoffrey J

    2008-07-01

    SCANPS performs iterative profile searching similar to PSI-BLAST but with full dynamic programing on each cycle and on-the-fly estimation of significance. This combination gives good sensitivity and selectivity that outperforms PSI-BLAST in domain-searching benchmarks. Although computationally expensive, SCANPS exploits on-chip parallelism (MMX and SSE2 instructions on Intel chips) as well as MPI parallelism to give acceptable turnaround times even for large databases. A web server developed to run SCANPS searches is now available at http://www.compbio.dundee.ac.uk/www-scanps. The server interface allows a range of different protein sequence databases to be searched including the SCOP database of protein domains. The server provides the user with regularly updated versions of the main protein sequence databases and is backed up by significant computing resources which ensure that searches are performed rapidly. For SCOP searches, the results may be viewed in a new tree-based representation that reflects the structure of the SCOP hierarchy; this aids the user in placing each hit in the context of its SCOP classification and understanding its relationship to other domains in SCOP.
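
    As an illustration of what a full dynamic-programming comparison involves, the sketch below computes a Smith-Waterman local alignment score for two sequences. It is not SCANPS code: the function name, the simple match/mismatch scores and the linear gap penalty are assumptions chosen for brevity, and the abstract does not describe the profile scoring or gap model that SCANPS itself uses.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (illustrative scoring)."""
    # Only the previous row of the score matrix is needed at each step.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "ACGT"))
```

    Every cell of the score matrix is filled, so the cost grows with the product of the two sequence lengths; this is the kind of workload that benefits from the SIMD and MPI parallelism mentioned in the abstract.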

  15. Visualization of the Expression of HMGN Nucleosomal Binding Proteins in the Developing Mouse Embryo and in Adult Mouse Tissues

    OpenAIRE

    Furusawa, Takashi; Bustin, Michael

    2009-01-01

    Visualization of the expression pattern of specific proteins during development and in adult tissues provides important clues as to their possible role in various cellular processes. Mouse is the organism of choice for obtaining information on gene expression patterns in higher eukaryotes. This chapter describes the protocols we utilized to visualize Hmgn transcripts and HMGN proteins in mouse tissues. HMGN are chromatin-binding proteins that affect chromatin structure and function and play a...

  16. Utilizing Biotinylated Proteins Expressed in Yeast to Visualize DNA–Protein Interactions at the Single-Molecule Level

    Directory of Open Access Journals (Sweden)

    Huijun Xue

    2017-10-01

    Full Text Available Much of our knowledge in conventional biochemistry has derived from bulk assays. However, many stochastic processes and transient intermediates are hidden when averaged over the ensemble. The powerful technique of single-molecule fluorescence microscopy has made great contributions to the understanding of life processes that are inaccessible when using traditional approaches. In single-molecule studies, quantum dots (Qdots) have several unique advantages over other fluorescent probes, such as high brightness, extremely high photostability, and large Stokes shift, thus allowing long-time observation and improved signal-to-noise ratios. So far, however, there is no convenient way to label proteins purified from budding yeast with Qdots. Based on BirA–Avi and biotin–streptavidin systems, we have established a simple method to acquire a Qdot-labeled protein and visualize its interaction with DNA using total internal reflection fluorescence microscopy. For proof-of-concept, we chose replication protein A (RPA) and origin recognition complex (ORC) as the proteins of interest. Proteins were purified from budding yeast with high biotinylation efficiency and rapidly labeled with streptavidin-coated Qdots. Interactions between proteins and DNA were observed successfully at the single-molecule level.

  17. A visual screen for diet-regulated proteins in the Drosophila ovary using GFP protein trap lines.

    Science.gov (United States)

    Hsu, Hwei-Jan; Drummond-Barbosa, Daniela

    2017-01-01

    The effect of diet on reproduction is well documented in a large number of organisms; however, much remains to be learned about the molecular mechanisms underlying this connection. The Drosophila ovary has a well described, fast and largely reversible response to diet. Ovarian stem cells and their progeny proliferate and grow faster on a yeast-rich diet than on a yeast-free (poor) diet, and death of early germline cysts, degeneration of early vitellogenic follicles and partial block in ovulation further contribute to the ∼60-fold decrease in egg laying observed on a poor diet. Multiple diet-dependent factors, including insulin-like peptides, the steroid ecdysone, the nutrient sensor Target of Rapamycin, AMP-dependent kinase, and adipocyte factors mediate this complex response. Here, we describe the results of a visual screen using a collection of green fluorescent protein (GFP) protein trap lines to identify additional factors potentially involved in this response. In each GFP protein trap line, an artificial GFP exon is fused in frame to an endogenous protein, such that the GFP fusion pattern parallels the levels and subcellular localization of the corresponding native protein. We identified 53 GFP-tagged proteins that exhibit changes in levels and/or subcellular localization in the ovary at 12-16 hours after switching females from rich to poor diets, suggesting them as potential candidates for future functional studies. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Construction and analysis of a plant non-specific lipid transfer protein database (nsLTPDB)

    Directory of Open Access Journals (Sweden)

    Wang Nai-Jyuan

    2012-01-01

    Full Text Available Abstract Background Plant non-specific lipid transfer proteins (nsLTPs) are small and basic proteins. Recently, nsLTPs have been reported to be involved in many physiological functions such as mediating phospholipid transfer, participating in plant defence activity against bacterial and fungal pathogens, and enhancing cell wall extension in tobacco. However, the lipid transfer mechanism of nsLTPs is still unclear, and comprehensive information on nsLTPs is difficult to obtain. Methods In this study, we identified 595 nsLTPs from 121 different species and constructed an nsLTPs database, nsLTPDB, which comprises the sequence information, structures, relevant literature, and biological data of all plant nsLTPs (http://nsltpdb.life.nthu.edu.tw/). Results Bioinformatics and statistics methods were implemented to develop a classification method for nsLTPs based on the patterns of the eight highly-conserved cysteine residues, and to suggest strict Prosite-styled patterns for Type I and Type II nsLTPs. The pattern of Type I is C X2 V X5-7 C [V, L, I] X Y [L, A, V] X8-13 CC X G X12 D X [Q, K, R] X2 CXC X16-21 P X2 C X13-15C, and that of Type II is C X4 L X2 C X9-11 P [S, T] X2 CC X5 Q X2-4 C[L, F]C X2 [A, L, I] X [D, N] P X10-12 [K, R] X4-5 C X3-4 P X0-2 C. Moreover, we compared the Prosite-styled patterns with the experimental mutagenesis data previously established by our group, and found that the residues with higher conservation played an important role in the structural stability or lipid binding ability of nsLTPs. Conclusions Taken together, this research has suggested potential residues that might be essential to modulate the structural and functional properties of plant nsLTPs. Finally, we proposed some biologically important sites of the nsLTPs, which are described by using a new Prosite-styled pattern that we defined.
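
    Patterns written in this Prosite-like style can be turned into ordinary regular expressions for scanning sequences. The sketch below is one possible reading of the notation, not a tool from nsLTPDB: it assumes that a bare X is a single wildcard residue, that Xn and Xn-m are wildcard runs of fixed or bounded length, and that a bracketed list gives the residues allowed at one position. The function name is hypothetical, and the Type II pattern is transcribed with every wildcard written as X.

```python
import re

# One token is a bracketed residue class, a counted wildcard (X5 or X5-7),
# a bare wildcard X, or a single residue letter. Spaces are skipped.
TOKEN = re.compile(r"\[([A-Z,\s]+)\]|X(\d+)(?:-(\d+))?|X|([A-Z])")


def prosite_style_to_regex(pattern):
    """Translate a Prosite-styled pattern string into a Python regex string."""
    parts = []
    for m in TOKEN.finditer(pattern):
        residues, low, high, letter = m.groups()
        if residues is not None:
            # [S, T] becomes the character class [ST].
            parts.append("[" + re.sub(r"[,\s]", "", residues) + "]")
        elif low is not None:
            # X5 becomes .{5}; X5-7 becomes .{5,7}.
            parts.append(".{%s,%s}" % (low, high) if high else ".{%s}" % low)
        elif letter is not None:
            parts.append(letter)
        else:
            # Bare X: exactly one residue of any kind.
            parts.append(".")
    return "".join(parts)


TYPE_II = ("C X4 L X2 C X9-11 P [S, T] X2 CC X5 Q X2-4 C[L, F]C X2 "
           "[A, L, I] X [D, N] P X10-12 [K, R] X4-5 C X3-4 P X0-2 C")

if __name__ == "__main__":
    print(prosite_style_to_regex(TYPE_II))
```

    The resulting string can be passed to `re.search` to test whether a protein sequence contains a segment matching the pattern.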

  19. Reef-coral proteins as visual, non-destructive reporters for plant transformation.

    Science.gov (United States)

    Wenck, A; Pugieux, C; Turner, M; Dunn, M; Stacy, C; Tiozzo, A; Dunder, E; van Grinsven, E; Khan, R; Sigareva, M; Wang, W C; Reed, J; Drayton, P; Oliver, D; Trafford, H; Legris, G; Rushton, H; Tayab, S; Launis, K; Chang, Y-F; Chen, D-F; Melchers, L

    2003-11-01

    Recently, five novel fluorescent proteins have been isolated from non-bioluminescent species of reef-coral organisms and have been made available through ClonTech. They are AmCyan, AsRed, DsRed, ZsGreen and ZsYellow. These proteins are valuable as reporters for transformation because they do not require a substrate or external co-factor to emit fluorescence and can be tested in vivo without destruction of the tissue under study. We have evaluated them in a large range of plants, both monocots and dicots, and our results indicate that they are valuable reporting tools for transformation in a wide variety of crops. We report here their successful expression in wheat, maize, barley, rice, banana, onion, soybean, cotton, tobacco, potato and tomato. Transient expression could be observed as early as 24 h after DNA delivery in some cases, allowing for very clear visualization of individually transformed cells. Stable transgenic events were generated, using mannose, kanamycin or hygromycin selection. Transgenic plants were phenotypically normal, showing a wide range of fluorescence levels, and were fertile. Expression of AmCyan, ZsGreen and AsRed was visible in maize T1 seeds, allowing visual segregation to more than 99% accuracy. The excitation and emission wavelengths of some of these proteins are significantly different; the difference is enough for the simultaneous visualization of cells transformed with more than one of the fluorescent proteins. These proteins will become useful tools for transformation optimization and other studies. The wide variety of plants successfully tested demonstrates that these proteins will potentially find broad use in plant biology.

  20. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.

    2013-07-12

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about experimentally and computationally detected human PPIs as well as their corresponding annotation data. However, these databases contain many false positive interactions, are partial and only a few of them incorporate data from various sources. To overcome these limitations, we have developed HINT-KB (http://biotools.ceid.upatras.gr/hint-kb/), a knowledge base that integrates data from various sources, provides a user-friendly interface for their retrieval, calculates a set of features of interest and computes a confidence score for every candidate protein interaction. This confidence score is essential for filtering the false positive interactions which are present in existing databases, predicting new protein interactions and measuring the frequency of each true protein interaction. For this reason, a novel machine learning hybrid methodology, called Evolutionary Kalman Mathematical Modelling (EvoKalMaModel), was used to achieve an accurate and interpretable scoring methodology. The experimental results indicated that the proposed scoring scheme outperforms existing computational methods for the prediction of PPIs.

  1. The alpha/beta-Hydrolase Fold 3DM Database (ABHDB) as a Tool for Protein Engineering

    NARCIS (Netherlands)

    Kourist, R.; Jochens, H.; Bartsch, S.; Kuipers, R.K.P.; Padhi, S.K.; Gall, M.; Bottcher, D.; Joosten, H.J.; Bornscheuer, U.T.

    2010-01-01

    Aligning the haystack to expose the needle: The 3DM method was used to generate a comprehensive database of the α/β-hydrolase fold enzyme superfamily. This database facilitates the analysis of structure–function relationships and enables novel insights into this superfamily to be made. In addition

  2. Magnetized carbon nanotubes for visual detection of proteins directly in whole blood.

    Science.gov (United States)

    Huang, Yan; Wen, Yongqiang; Baryeh, Kwaku; Takalkar, Sunitha; Lund, Michelle; Zhang, Xueji; Liu, Guodong

    2017-11-15

    The authors describe a magnetized carbon nanotube (MCNT)-based lateral flow strip biosensor for visual detection of proteins directly in whole blood, avoiding complex purification and sample pre-treatments. MCNTs were synthesized by coating Fe3O4 nanoparticles on the shortened multiwalled carbon nanotube (CNT) surface via co-precipitation of ferric and ferrous ions within a dispersion of shortened multiwalled CNTs. The antibody-modified MCNTs were used to capture target protein in whole blood; the formed MCNT-antibody-target protein complexes were applied to the lateral flow strip biosensor, in which a capture antibody was immobilized on the test zone of the biosensor. The captured MCNTs on the test zone and control zone produced characteristic brown/black bands, and this enabled target protein to be visually detected. Quantification was accomplished by reading the intensities of the bands with a portable strip reader. Rabbit IgG was used as a model target to demonstrate the proof-of-concept. After systematic optimizations of assay parameters, the detection limit of the assay in whole blood was determined to be 10 ng mL-1 (S/N = 3) with a linear dynamic range of 10-200 ng mL-1. This study provides a rapid and low-cost approach for detecting proteins in blood, showing great promise for clinical application and biomedical diagnosis, particularly in limited resource settings. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. 3PFDB - A database of Best Representative PSSM Profiles (BRPs) of Protein Families generated using a novel data mining approach

    Directory of Open Access Journals (Sweden)

    Shameer Khader

    2009-12-01

    Full Text Available Abstract Background: Protein families could be related to each other at broad levels that group them as superfamilies. These relationships are harder to detect at the sequence level due to high evolutionary divergence. Sequence searches are strongly directed and influenced by the best representatives of families that are viewed as starting points. PSSMs are useful approximations and mathematical representations of protein alignments, with a wide array of applications in bioinformatics approaches like remote homology detection, protein family analysis, detection of new members and evolutionary modelling. Computationally intensive searches have been performed using the neural network based sensitive sequence search method called FASSM to identify the Best Representative PSSMs for families reported in Pfam database version 22. Results: We designed a novel data mining approach for the assessment of individual sequences from a protein family to identify a single Best Representative PSSM profile (BRP) per protein family. Using the approach, a database of protein family-specific best representative PSSM profiles called 3PFDB has been developed. PSSM profiles in 3PFDB are curated using the performance of individual sequences as a reference in a rigorous scoring and coverage analysis approach using FASSM. We have assessed the suitability of 1,085,588 sequences derived from seed or full alignments reported in Pfam database (Version 22). Coverage analysis using the FASSM method is used as the filtering step to identify the best representative sequence, starting from full length or domain sequences to generate the final profile for a given family. 3PFDB is a collection of best representative PSSM profiles of 8,524 protein families from Pfam database. Conclusion: Availability of an approach to identify BRPs and a curated database of best representative PSI-BLAST derived PSSMs for 91.4% of current Pfam families will be a useful resource for the community to perform detailed and
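
    As an illustration of what a PSSM is, the following minimal Python sketch builds a log-odds position-specific scoring matrix from a toy ungapped alignment. It is not the FASSM method or the 3PFDB pipeline: the uniform background frequency, the pseudocount value, the function names and the toy alignment are all assumptions made for the example, and PSI-BLAST derived profiles additionally use sequence weighting and empirical background frequencies.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(length):
        # Count standard residues only; gaps and unknown symbols are skipped.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, sequence):
    """Sum the per-position scores; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, sequence))


# Hypothetical toy alignment, not taken from Pfam.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKVM"]
pssm = build_pssm(toy_alignment)
```

    Residues that dominate a column receive positive scores and rare ones negative scores, so a sequence resembling the alignment scores higher than an unrelated one.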

  4. Visualization of protein sequence features using JavaScript and SVG with pViz.js.

    Science.gov (United States)

    Mukhyala, Kiran; Masselot, Alexandre

    2014-12-01

    pViz.js is a visualization library for displaying protein sequence features in a Web browser. By simply providing a sequence and the locations of its features, this lightweight, yet versatile, JavaScript library renders an interactive view of the protein features. Interactive exploration of protein sequence features over the Web is a common need in Bioinformatics. Although many Web sites have developed viewers to display these features, their implementations are usually focused on data from a specific source or use case. Some of these viewers can be adapted to fit other use cases but are not designed to be reusable. pViz makes it easy to display features as boxes aligned to a protein sequence with zooming functionality but also includes predefined renderings for secondary structure and post-translational modifications. The library is designed to further customize this view. We demonstrate such applications of pViz using two examples: a proteomic data visualization tool with an embedded viewer for displaying features on protein structure, and a tool to visualize the results of the variant_effect_predictor tool from Ensembl. pViz.js is a JavaScript library, available on github at https://github.com/Genentech/pviz. This site includes examples and functional applications, installation instructions and usage documentation. A Readme file, which explains how to use pViz with examples, is available as Supplementary Material A. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
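
    The idea of drawing features as boxes aligned to a sequence can be sketched in a few lines. The Python example below is not pViz.js and does not use its API; it only emits a static SVG string, and the function name, the pixel scale and the 1-based inclusive feature coordinates are assumptions chosen for illustration.

```python
from xml.sax.saxutils import escape


def render_features_svg(sequence, features, px_per_residue=10, track_height=20):
    """Render features as SVG boxes aligned to a sequence axis.

    features: iterable of (start, end, label), 1-based and inclusive
    (an assumption of this sketch).
    """
    features = list(features)
    width = len(sequence) * px_per_residue
    height = track_height * (len(features) + 1)
    axis_y = track_height // 2
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        # A horizontal line stands in for the sequence itself.
        f'<line x1="0" y1="{axis_y}" x2="{width}" y2="{axis_y}" stroke="black"/>',
    ]
    for row, (start, end, label) in enumerate(features, start=1):
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"feature {label!r} is outside the sequence")
        x = (start - 1) * px_per_residue
        w = (end - start + 1) * px_per_residue
        y = row * track_height
        # One box per feature, each on its own row; <title> gives a tooltip.
        parts.append(
            f'<rect x="{x}" y="{y}" width="{w}" height="{track_height - 4}" '
            f'fill="steelblue"><title>{escape(label)}</title></rect>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical sequence and feature, for illustration only.
example_svg = render_features_svg("MKVLAAGIVG", [(2, 5, "helix")])
```

    Writing `example_svg` to a file with an `.svg` extension should give a picture that opens in a browser; zooming and interactivity, which pViz provides, are outside the scope of this sketch.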

  5. Sting_RDB: a relational database of structural parameters for protein analysis with support for data warehousing and data mining.

    Science.gov (United States)

    Oliveira, S R M; Almeida, G V; Souza, K R R; Rodrigues, D N; Kuser-Falcão, P R; Yamagishi, M E B; Santos, E H; Vieira, F D; Jardine, J G; Neshich, G

    2007-10-05

    An effective strategy for managing protein databases is to provide mechanisms to transform raw data into consistent, accurate and reliable information. Such mechanisms will greatly reduce operational inefficiencies and improve one's ability to better handle scientific objectives and interpret the research results. To achieve this challenging goal for the STING project, we introduce Sting_RDB, a relational database of structural parameters for protein analysis with support for data warehousing and data mining. In this article, we highlight the main features of Sting_RDB and show how a user can explore it for efficient and biologically relevant queries. Considering its importance for molecular biologists, effort has been made to advance Sting_RDB toward data quality assessment. To the best of our knowledge, Sting_RDB is one of the most comprehensive data repositories for protein analysis, now also capable of providing its users with a data quality indicator. This paper differs from our previous study in many aspects. First, we introduce Sting_RDB, a relational database with mechanisms for efficient and relevant queries using SQL. Sting_RDB evolved from the earlier, text (flat file)-based database, in which data consistency and integrity were not guaranteed. Second, we provide support for data warehousing and mining. Third, the data quality indicator was introduced. Finally, and probably most importantly, complex queries that could not be posed on a text-based database are now easily implemented. Further details are accessible at the Sting_RDB demo web page: http://www.cbi.cnptia.embrapa.br/StingRDB.
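
    To illustrate the kind of query that is awkward on flat files but straightforward in SQL, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, identifiers and values are hypothetical and are not the Sting_RDB schema; the example only shows an aggregate query over per-residue structural parameters and a primary key that enforces integrity.

```python
import sqlite3

# In-memory database with a hypothetical per-residue table (not the Sting_RDB schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE residue (
        pdb_id        TEXT    NOT NULL,
        chain         TEXT    NOT NULL,
        res_num       INTEGER NOT NULL,
        res_name      TEXT    NOT NULL,
        accessibility REAL,
        PRIMARY KEY (pdb_id, chain, res_num)  -- rejects duplicate residues
    );
    """
)

# Made-up rows for illustration only.
rows = [
    ("toyA", "A", 1, "LEU", 2.0),
    ("toyA", "A", 2, "VAL", 6.0),
    ("toyA", "A", 3, "LYS", 80.0),
    ("toyB", "A", 1, "ILE", 4.0),
    ("toyB", "A", 2, "ASP", 95.0),
]
conn.executemany("INSERT INTO residue VALUES (?, ?, ?, ?, ?)", rows)

# Structures with at least `min_buried` residues below an accessibility cutoff.
query = """
    SELECT pdb_id, COUNT(*) AS n_buried, AVG(accessibility) AS mean_acc
    FROM residue
    WHERE accessibility < ?
    GROUP BY pdb_id
    HAVING COUNT(*) >= ?
    ORDER BY pdb_id
"""
result = conn.execute(query, (10.0, 2)).fetchall()
```

    With a flat file, the same question would need a custom parser plus hand-written grouping logic, and nothing would stop the same residue from being recorded twice.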

  6. BFluenza: A Proteomic Database on Bird Flu.

    Science.gov (United States)

    Salahuddin, Parveen; Khan, Asad U

    2011-01-01

    Influenza A virus subtype H5N1, also known as "bird flu", has been documented to cause an outbreak of respiratory diseases in humans. The unprecedented spread of highly pathogenic avian influenza type A is a threat to veterinary and human health. BFluenza is a relational database which is solely devoted to proteomic information on the H5N1 subtype. BFluenza has novel features including computed physico-chemical property data for H5N1 viral proteins, modeled structures of viral proteins, protein coordinate data, experimental details, molecular descriptions and bibliographic references. The database also contains nucleotide sequences and their decoded protein sequences. The database can be searched in various modes by setting search options. The structure of a viral protein can be visualized with the JMol viewer or with Discovery Studio. The database is available for free at http://www.bfluenza.info.

  7. Comparing the Precision of Information Retrieval of MeSH-Controlled Vocabulary Search Method and a Visual Method in the Medline Medical Database

    Science.gov (United States)

    Hariri, Nadjla; Ravandi, Somayyeh Nadi

    2014-01-01

    Background: Medline is one of the most important databases in the biomedical field. One of the most important hosts for Medline is Elton B. Stephens CO. (EBSCO), which has presented different search methods that can be used based on the needs of the users. Visual search and MeSH-controlled search methods are among the most common methods. The goal of this research was to compare the precision of the retrieved sources in the EBSCO Medline base using MeSH-controlled and visual search methods. Methods: This research was a semi-empirical study. By holding training workshops, 70 students of higher education in different educational departments of Kashan University of Medical Sciences were taught MeSH-Controlled and visual search methods in 2012. Then, the precision of 300 searches made by these students was calculated based on Best Precision, Useful Precision, and Objective Precision formulas and analyzed in SPSS software using the independent sample T Test, and three precisions obtained with the three precision formulas were studied for the two search methods. Results: The mean precision of the visual method was greater than that of the MeSH-Controlled search for all three types of precision, i.e. Best Precision, Useful Precision, and Objective Precision, and their mean precisions were significantly different (P EBSCO Medline host than to use the controlled method, which requires users to use special keywords. The potential reason for their preference was that the visual method allowed them more freedom of action. PMID:25763155

  8. Comparing the Precision of Information Retrieval of MeSH-Controlled Vocabulary Search Method and a Visual Method in the Medline Medical Database.

    Science.gov (United States)

    Hariri, Nadjla; Ravandi, Somayyeh Nadi

    2014-01-01

    Medline is one of the most important databases in the biomedical field. One of the most important hosts for Medline is Elton B. Stephens CO. (EBSCO), which has presented different search methods that can be used based on the needs of the users. Visual search and MeSH-controlled search methods are among the most common methods. The goal of this research was to compare the precision of the retrieved sources in the EBSCO Medline base using MeSH-controlled and visual search methods. This research was a semi-empirical study. By holding training workshops, 70 students of higher education in different educational departments of Kashan University of Medical Sciences were taught MeSH-Controlled and visual search methods in 2012. Then, the precision of 300 searches made by these students was calculated based on Best Precision, Useful Precision, and Objective Precision formulas and analyzed in SPSS software using the independent sample T Test, and three precisions obtained with the three precision formulas were studied for the two search methods. The mean precision of the visual method was greater than that of the MeSH-Controlled search for all three types of precision, i.e. Best Precision, Useful Precision, and Objective Precision, and their mean precisions were significantly different (P EBSCO Medline host than to use the controlled method, which requires users to use special keywords. The potential reason for their preference was that the visual method allowed them more freedom of action.

  9. O-GLYCBASE version 2.0: a revised database of O-glycosylated proteins

    DEFF Research Database (Denmark)

    Hansen, Jan; Lund, Ole; Rapacki, Kristoffer

    1997-01-01

    O-GLYCBASE is an updated database of information on glycoproteins and their O-linked glycosylation sites. Entries are compiled and revised from the literature, and from the SWISS-PROT database. Entries include information about species, sequence, glycosylation sites and glycan type. O......-GLYCBASE is now fully cross-referenced to the SWISS-PROT, PIR, PROSITE, PDB, EMBL, HSSP, LISTA and MIM databases. Compared with version 1.0 the number of entries has increased by 34%. Revision of the O-glycan assignment was performed on 20% of the entries. Sequence logos displaying the acceptor specificity...... patterns for the GalNAc, mannose and GlcNAc transferases are shown. The O-GLYCBASE database is available through WWW or by anonymous FTP....

  10. Visualization and Dissemination of Multidimensional Proteomics Data Comparing Protein Abundance During Caenorhabditis elegans Development

    Science.gov (United States)

    Riffle, Michael; Merrihew, Gennifer E.; Jaschob, Daniel; Sharma, Vagisha; Davis, Trisha N.; Noble, William S.; MacCoss, Michael J.

    2015-11-01

    Regulation of protein abundance is a critical aspect of cellular function, organism development, and aging. Alternative splicing may give rise to multiple possible proteoforms of gene products where the abundance of each proteoform is independently regulated. Understanding how the abundances of these distinct gene products change is essential to understanding the underlying mechanisms of many biological processes. Bottom-up proteomics mass spectrometry techniques may be used to estimate protein abundance indirectly by sequencing and quantifying peptides that are later mapped to proteins based on sequence. However, quantifying the abundance of distinct gene products is routinely confounded by peptides that map to multiple possible proteoforms. In this work, we describe a technique that may be used to help mitigate the effects of confounding ambiguous peptides and multiple proteoforms when quantifying proteins. We have applied this technique to visualize the distribution of distinct gene products for the whole proteome across 11 developmental stages of the model organism Caenorhabditis elegans. The result is a large multidimensional dataset for which web-based tools were developed for visualizing how translated gene products change during development and identifying possible proteoforms. The underlying instrument raw files and tandem mass spectra may also be downloaded. The data resource is freely available on the web at http://www.yeastrc.org/wormpes/.
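
    The ambiguity problem described above can be made concrete with a small sketch. The Python code below is not the technique developed by the authors; it only shows how peptides can be mapped to proteoforms by exact substring matching and then separated into those that identify a single proteoform and those shared between several. The proteoform names and sequences are invented for the example.

```python
def map_peptides(peptides, proteoforms):
    """Map each peptide to the sorted names of proteoforms containing it."""
    return {
        pep: sorted(name for name, seq in proteoforms.items() if pep in seq)
        for pep in peptides
    }


def split_unique_shared(mapping):
    """Separate peptides that hit exactly one proteoform from ambiguous ones."""
    unique = {pep: hits[0] for pep, hits in mapping.items() if len(hits) == 1}
    shared = {pep: hits for pep, hits in mapping.items() if len(hits) > 1}
    return unique, shared


# Hypothetical proteoforms of one gene that differ only at the C-terminus.
proteoforms = {
    "geneX.a": "MKTAYIAKQRQISFVK",
    "geneX.b": "MKTAYIAKQRGGSLLK",
}
peptides = ["TAYIAK", "QISFVK", "GGSLLK", "WWWW"]

mapping = map_peptides(peptides, proteoforms)
unique, shared = split_unique_shared(mapping)
```

    In this toy case only the C-terminal peptides distinguish the two proteoforms, while the shared peptide cannot be assigned to either one, which is the situation that confounds abundance estimates. Peptides with no match, such as the last one, end up in neither group.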

  11. Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for rapid search of protein structure database

    Science.gov (United States)

    2007-01-01

    We present a novel protein structure database search tool, 3D-BLAST, that is useful for analyzing novel structures and can return a ranked list of alignments. This tool has the features of BLAST (for example, robust statistical basis, and effective and reliable search capabilities) and employs a kappa-alpha (κ, α) plot derived structural alphabet and a new substitution matrix. 3D-BLAST searches more than 12,000 protein structures in 1.2 s and yields good results in zones with low sequence similarity. PMID:17335583

  12. CHANGE DETECTION AND LAND USE / LAND COVER DATABASE UPDATING USING IMAGE SEGMENTATION, GIS ANALYSIS AND VISUAL INTERPRETATION

    National Research Council Canada - National Science Library

    J-F Mas; R González

    2015-01-01

    This article presents a hybrid method that combines image segmentation, GIS analysis, and visual interpretation in order to detect discrepancies between an existing land use/cover map and satellite...

  13. Chemical reporter for visualizing metabolic cross-talk between carbohydrate metabolism and protein modification.

    Science.gov (United States)

    Zaro, Balyn W; Chuh, Kelly N; Pratt, Matthew R

    2014-09-19

    Metabolic chemical reporters have been largely used to study posttranslational modifications. Generally, it was assumed that these reporters entered one biosynthetic pathway, resulting in labeling of one type of modification. However, because they are metabolized by cells before their addition onto proteins, metabolic chemical reporters potentially provide a unique opportunity to read out both modifications of interest and cellular metabolism. We report here the development of a metabolic chemical reporter, 1-deoxy-N-pentynyl glucosamine (1-deoxy-GlcNAlk). This small molecule cannot be incorporated into glycans; however, treatment of mammalian cells results in labeling of a variety of proteins and enables their visualization and identification. Competition of this labeling with sodium acetate and an acetyltransferase inhibitor suggests that 1-deoxy-GlcNAlk can enter the protein acetylation pathway. These results demonstrate that metabolic chemical reporters have the potential to isolate and potentially discover cross-talk between metabolic pathways in living cells.

  14. Classic and Golli Myelin Basic Protein have distinct developmental trajectories in human visual cortex

    Directory of Open Access Journals (Sweden)

    Caitlin R Siu

    2015-04-01

    Full Text Available Traditionally, myelin is viewed as insulation around axons; however, more recent studies have shown it plays an important role in plasticity, axonal metabolism and neuroimmune signalling. Myelin is a complex multi-protein structure composed of hundreds of proteins, with Myelin Basic Protein (MBP) being the most studied. MBP has two families: Classic-MBP, which is necessary for activity-driven compaction of myelin around axons, and Golli-MBP, which is found in neurons, oligodendrocytes, and T cells, and has been called a 'molecular link' between the nervous and immune systems. In visual cortex, myelin proteins interact with immune processes to affect experience-dependent plasticity. We studied myelin in human visual cortex using Western blotting to quantify Classic- and Golli-MBP expression in post-mortem tissue samples ranging in age from 20 days to 80 years. We found that Classic- and Golli-MBP have different patterns of change across the lifespan: Classic-MBP gradually increases to 42 years and then declines into aging; Golli-MBP has changes that are coincident with milestones in the visual system sensitive period, before gradually increasing into aging. There are 3 stages in the balance between Classic- and Golli-MBP expression, with Golli-MBP dominating early, then shifting to Classic-MBP, and back to Golli-MBP in aging. Also, Golli-MBP has a wave of high inter-individual variability during childhood. These results about cortical MBP expression are timely because they complement recent advances in MRI techniques that produce high resolution maps of cortical myelin in normal and diseased brain. In addition, the unique pattern of Golli-MBP expression across the lifespan suggests that it supports high levels of neuroimmune interaction in cortical development and in aging.

  15. Protein-Coupled Fluorescent Probe To Visualize Potassium Ion Transition on Cellular Membranes.

    Science.gov (United States)

    Hirata, Tomoya; Terai, Takuya; Yamamura, Hisao; Shimonishi, Manabu; Komatsu, Toru; Hanaoka, Kenjiro; Ueno, Tasuku; Imaizumi, Yuji; Nagano, Tetsuo; Urano, Yasuteru

    2016-03-01

    K(+) is the most abundant metal ion in cells, and changes of [K(+)] around cell membranes play important roles in physiological events. However, there is no practical method to selectively visualize [K(+)] at the surface of cells. To address this issue, we have developed a protein-coupled fluorescent probe for K(+), TLSHalo. TLSHalo is responsive to [K(+)] in the physiological range, with good selectivity over Na(+) and retains its K(+)-sensing properties after covalent conjugation with HaloTag protein. By using cells expressing HaloTag on the plasma membrane, we successfully directed TLSHalo specifically to the outer surface of target cells. This enabled us to visualize localized extracellular [K(+)] change with TLSHalo under a fluorescence microscope in real time. To confirm the experimental value of this system, we used TLSHalo to monitor extracellular [K(+)] change induced by K(+) ionophores or by activation of a native Ca(2+)-dependent K(+) channel (BK channel). Further, we show that K(+) efflux via BK channel induced by electrical stimulation at the bottom surface of the cells can be visualized with TLSHalo by means of total internal reflection fluorescence microscope (TIRFM) imaging. Our methodology should be useful to analyze physiological K(+) dynamics with high spatiotemporal resolution.

  16. CLMSVault: A Software Suite for Protein Cross-Linking Mass-Spectrometry Data Analysis and Visualization.

    Science.gov (United States)

    Courcelles, Mathieu; Coulombe-Huntington, Jasmin; Cossette, Émilie; Gingras, Anne-Claude; Thibault, Pierre; Tyers, Mike

    2017-07-07

    Protein cross-linking mass spectrometry (CL-MS) enables the sensitive detection of protein interactions and the inference of protein complex topology. The detection of chemical cross-links between protein residues can identify intra- and interprotein contact sites or provide physical constraints for molecular modeling of protein structure. Recent innovations in cross-linker design, sample preparation, mass spectrometry, and software tools have significantly improved CL-MS approaches. Although a number of algorithms now exist for the identification of cross-linked peptides from mass spectral data, a dearth of user-friendly analysis tools represents a practical bottleneck to the broad adoption of the approach. To facilitate the analysis of CL-MS data, we developed CLMSVault, a software suite designed to leverage existing CL-MS algorithms and provide intuitive and flexible tools for cross-platform data interpretation. CLMSVault stores and combines complementary information obtained from different cross-linkers and search algorithms. CLMSVault provides filtering, comparison, and visualization tools to support CL-MS analyses and includes a workflow for label-free quantification of cross-linked peptides. An embedded 3D viewer enables the visualization of quantitative data and the mapping of cross-linked sites onto PDB structural models. We demonstrate the application of CLMSVault for the analysis of a noncovalent Cdc34-ubiquitin protein complex cross-linked under different conditions. CLMSVault is open-source software (available at https://gitlab.com/courcelm/clmsvault.git), and a live demo is available at http://democlmsvault.tyerslab.com/.

  17. Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins

    DEFF Research Database (Denmark)

    Diella, F.; Cameron, S.; Gemund, C.

    2004-01-01

    Background: Post-translational phosphorylation is one of the most common protein modifications. Phosphoserine, threonine and tyrosine residues play critical roles in the regulation of many cellular processes. The fast growing number of research reports on protein phosphorylation points to a general...... instances for 556 phosphorylated proteins. Conclusion: Phospho.ELM will be a valuable tool both for molecular biologists working on protein phosphorylation sites and for bioinformaticians developing computational predictions on the specificity of phosphorylation reactions....

  18. Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Proteins in similarity relationship with the cluster Data detail Data name Proteins in similarity relat...s Proteins in similarity relationship with the cluster - Gclust Server | LSDB Archive ... ...ionship with the cluster DOI 10.18908/lsdba.nbdc00464-003 Description of data conte

  19. DMPD: Post-transcriptional regulation of proinflammatory proteins. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 15075353 Post-transcriptional regulation of proinflammatory proteins. Anderson P, P...l) (.csml) Show Post-transcriptional regulation of proinflammatory proteins. PubmedID 15075353 Title Post-tr...anscriptional regulation of proinflammatory proteins. Authors Anderson P, Phillip

  20. DMPD: LPS-binding proteins and receptors. [Dynamic Macrophage Pathway CSML Database

    Lifescience Database Archive (English)

    Full Text Available 9665271 LPS-binding proteins and receptors. Fenton MJ, Golenbock DT. J Leukoc Biol.... 1998 Jul;64(1):25-32. (.png) (.svg) (.html) (.csml) Show LPS-binding proteins and receptors. PubmedID 9665271 Title LPS-binding prot...eins and receptors. Authors Fenton MJ, Golenbock DT. Publication J Leukoc Biol. 199