WorldWideScience

Sample records for protein-protein interaction prediction

  1. Prediction of Protein-Protein Interactions Related to Protein Complexes Based on Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Peng Liu

    2015-01-01

    Full Text Available A method for predicting protein-protein interactions based on detected protein complexes is proposed to repair deficient interactions derived from high-throughput biological experiments. Protein complexes are pruned and decomposed into small parts based on the adaptive k-cores method to predict protein-protein interactions associated with the complexes. The proposed method is adaptive to protein complexes with different structure, number, and size of nodes in a protein-protein interaction network. Based on different complex sets detected by various algorithms, we can obtain different prediction sets of protein-protein interactions. The reliability of the predicted interaction sets is proved by using estimations with statistical tests and direct confirmation of the biological data. In comparison with the approaches which predict the interactions based on the cliques, the overlap of the predictions is small. Similarly, the overlaps among the predicted sets of interactions derived from various complex sets are also small. Thus, every predicted set of interactions may complement and improve the quality of the original network data. Meanwhile, the predictions from the proposed method replenish protein-protein interactions associated with protein complexes using only the network topology.
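
    As a rough illustration of the k-core idea this abstract builds on, the Python sketch below repeatedly peels nodes of degree less than k from a small graph. It does not reproduce the paper's adaptive pruning or its prediction step; the function name `k_core` and the toy complex are assumptions made for illustration only.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.

    adj maps each node to the set of its neighbours (hypothetical toy input).
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(neigh) for node, neigh in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(graph):
            if len(graph[node]) < k:
                # Remove the node and detach it from its neighbours.
                for other in graph[node]:
                    graph[other].discard(node)
                del graph[node]
                changed = True
    return set(graph)


# Toy complex (assumed data): A, B, C, D form a dense part, E hangs off D.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}
core = k_core(toy, 2)
```

    In this toy graph the 2-core keeps A, B, C and D and drops the pendant node E; a prediction method could then look for missing edges (here A-D) inside the retained part.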

  2. Information assessment on predicting protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Gerstein Mark

    2004-10-01

    Full Text Available Background: Identifying protein-protein interactions is fundamental for understanding the molecular machinery of the cell. Proteome-wide studies of protein-protein interactions are of significant value, but the high-throughput experimental technologies suffer from high rates of both false positive and false negative predictions. In addition to high-throughput experimental data, many diverse types of genomic data can help predict protein-protein interactions, such as mRNA expression, localization, essentiality, and functional annotation. Evaluating the information contributed by different types of evidence helps to establish more parsimonious models with comparable or better prediction accuracy, and to obtain biological insights into the relationships between protein-protein interactions and other genomic information. Results: Our assessment is based on the genomic features used in a Bayesian network approach to predict protein-protein interactions genome-wide in yeast. In the special case in which no information is missing for any of the features, our analysis shows that there is a larger information contribution from the functional classification than from expression correlations or essentiality. We also show that in this case alternative models, such as logistic regression and random forest, may be more effective than Bayesian networks for predicting interactions. Conclusions: In the restricted problem posed by the complete-information subset, we identified the MIPS and Gene Ontology (GO) functional similarity datasets as the dominating information contributors for predicting protein-protein interactions under the framework proposed by Jansen et al. Random forests based on the MIPS and GO information alone can give highly accurate classifications. In this particular subset of complete information, adding other genomic data does little to improve predictions. 
We also found that the data discretizations used in the

  3. Computational prediction of protein-protein interactions in Leishmania predicted proteomes.

    Directory of Open Access Journals (Sweden)

    Antonio M Rezende

    Full Text Available The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and the availability of their genomes, no effective vaccine has been developed yet, and the chemotherapy is highly toxic. Therefore, it is clear that only interdisciplinary, integrated studies will succeed in finding new targets for the development of vaccines and drugs. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (Interolog Mapping), and developed a dedicated combined system score to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum respectively. For each predicted network the top 20 proteins were ranked by the MCC topological index. In addition, information related to immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing the PPIs for drug or vaccine purposes, along with multiple alignments of the predicted PPIs, was performed, revealing patterns associated with protein turnover. In addition, around 50% of hypothetical protein present in the networks
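
    The interolog mapping step mentioned in this abstract can be sketched as follows: an interaction known in a reference organism is transferred to the target proteome when both partners have orthologs there. This is a minimal Python sketch under stated assumptions; the function name, the dictionary-based ortholog table and the protein identifiers are hypothetical, and the paper's combined confidence score is not modelled.

```python
def map_interologs(known_interactions, orthologs):
    """Transfer known interactions to a target proteome via orthology.

    known_interactions: iterable of (ref_a, ref_b) pairs from a reference database.
    orthologs: dict mapping a reference protein to a list of target proteins.
    Both inputs are hypothetical toy structures, not the paper's data format.
    """
    predicted = set()
    for ref_a, ref_b in known_interactions:
        for tgt_a in orthologs.get(ref_a, []):
            for tgt_b in orthologs.get(ref_b, []):
                # Store each pair in sorted order so (x, y) and (y, x) collapse.
                predicted.add(tuple(sorted((tgt_a, tgt_b))))
    return predicted


# Made-up identifiers for illustration only.
reference_ppis = [("refP1", "refP2"), ("refP2", "refP3")]
ortholog_table = {
    "refP1": ["tgtA"],
    "refP2": ["tgtB", "tgtC"],
    # refP3 has no ortholog in the target, so its interaction is not transferred.
}
predictions = map_interologs(reference_ppis, ortholog_table)
```

    In practice the ortholog table would come from sequence comparison, and each transferred pair would carry a score reflecting how strong the underlying evidence is.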

  4. Protein function prediction using neighbor relativity in protein-protein interaction network.

    Science.gov (United States)

    Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir

    2013-04-01

    There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict function for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features, including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that use the interaction network. Copyright © 2012 Elsevier Ltd. All rights reserved.

  5. Protein-protein interaction site predictions with three-dimensional probability distributions of interacting atoms on protein surfaces.

    Directory of Open Access Journals (Sweden)

    Ching-Tai Chen

    Full Text Available Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. 
The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted

  6. Protein-Protein Interaction Site Predictions with Three-Dimensional Probability Distributions of Interacting Atoms on Protein Surfaces

    Science.gov (United States)

    Chen, Ching-Tai; Peng, Hung-Pin; Jian, Jhih-Wei; Tsai, Keng-Chang; Chang, Jeng-Yih; Yang, Ei-Wen; Chen, Jun-Bo; Ho, Shinn-Ying; Hsu, Wen-Lian; Yang, An-Suei

    2012-01-01

    Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. 
The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with
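
    The evaluation measures quoted in this abstract (accuracy, precision, sensitivity, specificity and the Matthews correlation coefficient) all derive from the four cells of a confusion matrix. The Python sketch below shows the standard definitions; the counts passed in are invented for illustration and are not the paper's data, and the function name is an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion counts.

    No guard against zero denominators is included; this is a sketch only.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical residue-level counts, chosen only to give round numbers.
m = binary_metrics(tp=60, fp=40, tn=160, fn=40)
```

    Because interface residues are usually a minority of surface residues, MCC is often reported alongside accuracy: it stays low when a predictor simply favours the majority class.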

  7. Protein complex prediction based on k-connected subgraphs in protein interaction network

    OpenAIRE

    Habibi, Mahnaz; Eslahchi, Changiz; Wong, Limsoon

    2010-01-01

    Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on ...

  8. Protein complex prediction based on k-connected subgraphs in protein interaction network

    Directory of Open Access Journals (Sweden)

    Habibi Mahnaz

    2010-09-01

    Full Text Available Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on connectivity number on subgraphs. We evaluate CFA using several protein interaction networks on reference protein complexes in two benchmark data sets (MIPS and Aloy), containing 1142 and 61 known complexes respectively. We compare CFA to some existing protein complex prediction methods (CMC, MCL, PCP and RNSC) in terms of recall and precision. We show that CFA predicts more complexes correctly at a competitive level of precision. Conclusions: Many real complexes with different connectivity levels in a protein interaction network can be predicted based on connectivity number. Our CFA program and results are freely available from http://www.bioinf.cs.ipm.ir/softwares/cfa/CFA.rar.

  9. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP) encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  10. Protein-Protein Interactions Prediction Based on Iterative Clique Extension with Gene Ontology Filtering

    Directory of Open Access Journals (Sweden)

    Lei Yang

    2014-01-01

    Full Text Available Cliques (maximal complete subnets) in a protein-protein interaction (PPI) network are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPIs compensate for deficiencies in the data from biological experiments. However, clique-based prediction methods depend only on the topology of the network. The false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method combining clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.
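
    One simple way to picture clique-based prediction with an annotation filter is the following Python sketch: a protein that interacts with every member of a clique except one is proposed as a partner of the remaining member, provided the two share at least one annotation term. Both the "all but one" rule and the shared-term filter are simplifying assumptions chosen for illustration; the abstract does not spell out the paper's GO correcting rules, and all names and data here are hypothetical.

```python
def predict_from_clique(adj, clique, annotations):
    """Propose missing edges that would extend a clique by one protein.

    adj: dict mapping each protein to the set of its interaction partners.
    clique: iterable of proteins that are all pairwise connected.
    annotations: dict mapping proteins to sets of annotation terms (toy stand-in for GO).
    """
    members = set(clique)
    predictions = set()
    for candidate in adj:
        if candidate in members:
            continue
        missing = members - adj[candidate]
        if len(missing) == 1:
            partner = next(iter(missing))
            # Assumed filter: keep the pair only if the annotations overlap.
            if annotations.get(candidate, set()) & annotations.get(partner, set()):
                predictions.add(tuple(sorted((candidate, partner))))
    return predictions


# Toy network: A, B, C form a clique; D and F each touch two of its members.
network = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C", "D", "F"},
    "C": {"A", "B", "F"},
    "D": {"A", "B"},
    "E": {"A"},
    "F": {"B", "C"},
}
terms = {"A": {"term1"}, "C": {"term2"}, "D": {"term2"}, "F": {"term3"}}
new_edges = predict_from_clique(network, ["A", "B", "C"], terms)
```

    Here D-C is proposed because D already interacts with A and B and shares an annotation with C, whereas F-A is rejected by the annotation filter and E is too weakly attached to the clique to be considered.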

  11. PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs

    Directory of Open Access Journals (Sweden)

    Greenblatt Jack

    2006-07-01

    Full Text Available Background: Identification of protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions are all time and labour intensive. Consequently there is a growing need for the development of computational tools that are capable of effectively identifying such interactions. Results: Here we explain the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target pair of the yeast Saccharomyces cerevisiae proteins from their primary structure and without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction with 89% specificity and an overall accuracy of 75%. This rate of success is comparable to those associated with the most commonly used biochemical techniques. Using PIPE, we identified a novel interaction between YGL227W (vid30) and YMR135C (gid8) yeast proteins. This led us to the identification of a novel yeast complex that here we term vid30 complex (vid30c). The observed interaction was confirmed by tandem affinity purification (TAP tag), verifying the ability of PIPE to predict novel protein-protein interactions. We then used PIPE analysis to investigate the internal architecture of vid30c. It appeared from PIPE analysis that vid30c may consist of a core and a secondary component. Generation of yeast gene deletion strains combined with TAP tagging analysis indicated that the deletion of a member of the core component interfered with the formation of vid30c; however, deletion of a member of the secondary component had little effect (if any) on the formation of vid30c. Also, PIPE can be used to analyse yeast proteins for which TAP tagging fails, thereby allowing us to predict protein interactions that are not
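
    The three figures reported for PIPE (61% sensitivity, 89% specificity, 75% accuracy) are linked by a simple identity: overall accuracy is the average of sensitivity and specificity weighted by the share of positive and negative test pairs. The Python sketch below works through that arithmetic. The assumption that the test set was balanced (half interacting, half non-interacting pairs) is ours and is not stated in the abstract; it is simply the split under which the quoted numbers agree.

```python
def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Overall accuracy as a class-weighted mix of sensitivity and specificity."""
    return sensitivity * positive_fraction + specificity * (1 - positive_fraction)


# Assumed balanced test set: 50% positive pairs, 50% negative pairs.
balanced_accuracy = accuracy_from_rates(0.61, 0.89, 0.5)   # about 0.75

# With a different (hypothetical) split the same rates give another accuracy.
skewed_accuracy = accuracy_from_rates(0.61, 0.89, 0.1)     # about 0.862
```

    The second call illustrates why accuracy alone is hard to compare across studies: the same predictor looks more accurate when negatives dominate the test set.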

  12. Prediction of protein–protein interactions: unifying evolution and structure at protein interfaces

    International Nuclear Information System (INIS)

    Tuncbag, Nurcan; Gursoy, Attila; Keskin, Ozlem

    2011-01-01

    The vast majority of the chores in the living cell involve protein–protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein–protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations

  13. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.

  14. False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Directory of Open Access Journals (Sweden)

    Lin Yen-Han

    2007-07-01

    Full Text Available Background: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for this lack is poor agreement between experimental findings and computational sets, which in turn stems from the large number of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results have not been adequately investigated. Results: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over the four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

  15. Prediction of protein-protein interactions between viruses and human by an SVM model

    Directory of Open Access Journals (Sweden)

    Cui Guangyu

    2012-05-01

    Full Text Available Background: Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for the interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results: We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM) model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV) and hepatitis C virus (HCV), our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO) annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions: Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1) it enables a prediction model to achieve a better performance than other representations, (2) it generates feature vectors of fixed length regardless of the sequence length, and (3) the same representation is applicable to different types of proteins.
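
    The fixed-length representation described in this abstract can be sketched in a few lines of Python: count every run of three consecutive amino acids and divide by the number of counted triplets. The details below are assumptions made for illustration (overlapping windows, the 20 standard one-letter codes giving 20^3 = 8000 features, and windows containing other characters being skipped); the paper's exact encoding may differ.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIPLETS = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
TRIPLET_INDEX = {triplet: i for i, triplet in enumerate(TRIPLETS)}


def triplet_frequency_vector(sequence):
    """Encode a protein sequence as relative frequencies of amino acid triplets."""
    counts = [0] * len(TRIPLETS)
    total = 0
    # Slide a window of width 3 along the sequence (overlapping triplets).
    for i in range(len(sequence) - 2):
        idx = TRIPLET_INDEX.get(sequence[i:i + 3])
        if idx is not None:          # skip windows with non-standard residues
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [c / total for c in counts]


# Made-up peptide: its 7 windows are MKV, KVL, VLA, LAA, AAK, AKV, KVL.
vec = triplet_frequency_vector("MKVLAAKVL")
```

    Whatever the sequence length, the resulting vector has 8000 entries, which is what allows sequences of different lengths to be fed to a classifier such as an SVM.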

  16. A domain-based approach to predict protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Resat Haluk

    2007-06-01

    Full Text Available Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for the understanding of biological processes at the whole cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses the information about the domain-domain interactions to predict the interactions between proteins. Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. Obtained domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion: We envision DomainGA as a first step of a multiple-tier approach to constructing organism-specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs, and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed

  17. Prediction and characterization of protein-protein interaction networks in swine

    Directory of Open Access Journals (Sweden)

    Wang Fen

    2012-01-01

    Full Text Available Background: Studying the large-scale protein-protein interaction (PPI) network is important in understanding biological processes. The current research presents the first PPI map of swine, which aims to give new insights into understanding their biological processes. Results: We used three methods, Interolog-based prediction of porcine PPI network, domain-motif interactions from structural topology-based prediction of porcine PPI network and motif-motif interactions from structural topology-based prediction of porcine PPI network, to predict porcine protein interactions among 25,767 porcine proteins. We predicted 20,213, 331,484, and 218,705 porcine PPIs respectively, merged the three results into 567,441 PPIs, constructed four PPI networks, and analyzed the topological properties of the porcine PPI networks. Our predictions were validated with Pfam domain annotations and GO annotations. Averages of 70, 10,495, and 863 interactions were related to the Pfam domain-interacting pairs in the iPfam database. For comparison, randomized networks were generated, and averages of only 4.24, 66.79, and 44.26 interactions were associated with Pfam domain-interacting pairs in the iPfam database. In GO annotations, we found that 52.68%, 75.54%, and 27.20% of the predicted PPIs shared GO terms, respectively. However, none of the 10,000 randomized networks reached 52.68%, 75.54%, or 27.20% of PPI pairs sharing GO terms. Finally, we determined the accuracy and precision of the methods. The methods yielded accuracies of 0.92, 0.53, and 0.50 at precisions of about 0.93, 0.74, and 0.75, respectively. Conclusion: The results reveal that the predicted PPI networks are considerably reliable. The present research is an important pioneering work on protein function research. The porcine PPI data set, the confidence score of each interaction and a list of related data are available at http://pppid.biositemap.com/.

  18. Predicting and validating protein interactions using network structure.

    Directory of Open Access Journals (Sweden)

    Pao-Yang Chen

    2008-07-01

    Full Text Available Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.

  19. Application of Machine Learning Approaches for Protein-protein Interactions Prediction.

    Science.gov (United States)

    Zhang, Mengying; Su, Qiang; Lu, Yi; Zhao, Manman; Niu, Bing

    2017-01-01

    Proteomics endeavors to study the structures, functions and interactions of proteins. Information of the protein-protein interactions (PPIs) helps to improve our knowledge of the functions and the 3D structures of proteins. Thus determining the PPIs is essential for the study of the proteomics. In this review, in order to study the application of machine learning in predicting PPI, some machine learning approaches such as support vector machine (SVM), artificial neural networks (ANNs) and random forest (RF) were selected, and the examples of its applications in PPIs were listed. SVM and RF are two commonly used methods. Nowadays, more researchers predict PPIs by combining more than two methods. This review presents the application of machine learning approaches in predicting PPI. Many examples of success in identification and prediction in the area of PPI prediction have been discussed, and the PPIs research is still in progress. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  20. Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization.

    Science.gov (United States)

    Wang, Hua; Huang, Heng; Ding, Chris; Nie, Feiping

    2013-04-01

    Protein interactions are central to all the biological processes and structural scaffolds in living organisms, because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Several high-throughput methods, for example, yeast two-hybrid system and mass spectrometry method, can help determine protein interactions, which, however, suffer from high false-positive rates. Moreover, many protein interactions predicted by one method are not supported by another. Therefore, computational methods are necessary and crucial to complete the interactome expeditiously. In this work, we formulate the problem of predicting protein interactions from a new mathematical perspective--sparse matrix completion, and propose a novel nonnegative matrix factorization (NMF)-based matrix completion approach to predict new protein interactions from existing protein interaction networks. Through using manifold regularization, we further develop our method to integrate different biological data sources, such as protein sequences, gene expressions, protein structure information, etc. Extensive experimental results on four species, Saccharomyces cerevisiae, Drosophila melanogaster, Homo sapiens, and Caenorhabditis elegans, have shown that our new methods outperform related state-of-the-art protein interaction prediction methods.

  1. Protein-protein interaction site predictions with minimum covariance determinant and Mahalanobis distance.

    Science.gov (United States)

    Qiu, Zhijun; Zhou, Bo; Yuan, Jiangfeng

    2017-11-21

    Protein-protein interaction site (PPIS) prediction must deal with the diversity of interaction sites that limits their prediction accuracy. Use of proteins with unknown or unidentified interactions can also lead to missing interfaces. Such data errors are often brought into the training dataset. In response to these two problems, we used the minimum covariance determinant (MCD) method to refine the training data to build a predictor with better performance, utilizing its ability to remove outliers. In order to predict test data in practice, a method based on Mahalanobis distance was devised to select proper test data as input for the predictor. With leave-one-out validation and an independent test, after the Mahalanobis distance screening, our method achieved higher performance according to the Matthews correlation coefficient (MCC), although only a part of the test data could be predicted. These results indicate that data refinement is an efficient approach to improve protein-protein interaction site prediction. By further optimizing our method, we hope to develop predictors with better performance and a wider range of application. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. From nonspecific DNA-protein encounter complexes to the prediction of DNA-protein interactions.

    Directory of Open Access Journals (Sweden)

    Mu Gao

    2009-03-01

    Full Text Available DNA-protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA-protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA-protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA-protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA-protein interaction modes exhibit some similarity to specific DNA-protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Cα deviation from their native structures is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA-protein interaction mode and nonspecific interaction modes may reflect an important sampling step in search of its specific DNA targets by a DNA-binding protein.

  3. HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Xiaomin Wang

    2011-01-01

    Full Text Available With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interests gradually shift to systematic analysis of these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has relatively high F-measure and exhibits better performance compared with some other methods.
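
    The k-core concept that this record relies on can be sketched in a few lines: nodes with fewer than k neighbours are peeled away repeatedly until every remaining node has at least k neighbours inside the remaining set, and the highest k-core is the last non-empty result as k grows. The code below illustrates only that definition; it is not the HKC algorithm, whose cohesion measure and handling of overlapping clusters are not specified in the abstract. The function names and the toy network are assumptions.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            # Degree is counted only among nodes that are still alive.
            if len(adj[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


def highest_k_core(edges):
    """Return (k, nodes) for the largest k whose k-core is non-empty."""
    k, core = 0, k_core(edges, 0)
    while True:
        candidate = k_core(edges, k + 1)
        if not candidate:
            return k, core
        k, core = k + 1, candidate


# Hypothetical toy network: a 4-clique (A, B, C, D) with a short tail D-E-F.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
             ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
print(highest_k_core(toy_edges))
```

    The repeated-scan peeling used here is quadratic in the worst case; it is kept for readability, and a bucket-based ordering would be the usual choice for genome-scale networks.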

  4. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking

    DEFF Research Database (Denmark)

    Feliu, Elisenda; Aloy, Patrick; Oliva, Baldo

    2011-01-01

    Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-art benchmark dataset and on a decoy dataset of permanent interactions. Our... and with independence of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.

  5. Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

    Directory of Open Access Journals (Sweden)

    Vandepoele Klaas

    2009-06-01

    Full Text Available Abstract Background Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.

  6. Topology and weights in a protein domain interaction network--a novel way to predict protein interactions.

    Science.gov (United States)

    Wuchty, Stefan

    2006-05-23

    While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins if we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event on domain level. As an immediate application, we show a simple way to predict potential protein interactions

  7. Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions.

    Directory of Open Access Journals (Sweden)

    Peiying Ruan

    Full Text Available Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.

  8. Exploration of the dynamic properties of protein complexes predicted from spatially constrained protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Eric A Yen

    2014-05-01

    Full Text Available Protein complexes are not static, but rather highly dynamic with subunits that undergo 1-dimensional diffusion with respect to each other. Interactions within protein complexes are modulated through regulatory inputs that alter interactions and introduce new components and deplete existing components through exchange. While it is clear that the structure and function of any given protein complex is coupled to its dynamical properties, it remains a challenge to predict the possible conformations that complexes can adopt. Protein-fragment Complementation Assays detect physical interactions between protein pairs constrained to ≤8 nm from each other in living cells. This method has been used to build networks composed of 1000s of pair-wise interactions. Significantly, these networks contain a wealth of dynamic information, as the assay is fully reversible and the proteins are expressed in their natural context. In this study, we describe a method that extracts this valuable information in the form of predicted conformations, allowing the user to explore the conformational landscape, to search for structures that correlate with an activity state, and estimate the abundance of conformations in the living cell. The generator is based on a Markov Chain Monte Carlo simulation that uses the interaction dataset as input and is constrained by the physical resolution of the assay. We applied this method to an 18-member protein complex composed of the seven core proteins of the budding yeast Arp2/3 complex and 11 associated regulators and effector proteins. We generated 20,480 output structures and identified conformational states using principal component analysis. We interrogated the conformation landscape and found evidence of symmetry breaking, a mixture of likely active and inactive conformational states and dynamic exchange of the core protein Arc15 between core and regulatory components. Our method provides a novel tool for prediction and

  9. Protein docking prediction using predicted protein-protein interface

    Directory of Open Access Journals (Sweden)

    Li Bin

    2012-01-01

    Full Text Available Abstract Background Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  10. Protein docking prediction using predicted protein-protein interface.

    Science.gov (United States)

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  11. Prediction and Dissection of Protein-RNA Interactions by Molecular Descriptors.

    Science.gov (United States)

    Liu, Zhi-Ping; Chen, Luonan

    2016-01-01

    Protein-RNA interactions play crucial roles in numerous biological processes. However, detecting the interactions and binding sites between protein and RNA by traditional experiments is still time consuming and labor costing. Thus, it is of importance to develop bioinformatics methods for predicting protein-RNA interactions and binding sites. Accurate prediction of protein-RNA interactions and recognitions will highly benefit to decipher the interaction mechanisms between protein and RNA, as well as to improve the RNA-related protein engineering and drug design. In this work, we summarize the current bioinformatics strategies of predicting protein-RNA interactions and dissecting protein-RNA interaction mechanisms from local structure binding motifs. In particular, we focus on the feature-based machine learning methods, in which the molecular descriptors of protein and RNA are extracted and integrated as feature vectors of representing the interaction events and recognition residues. In addition, the available methods are classified and compared comprehensively. The molecular descriptors are expected to elucidate the binding mechanisms of protein-RNA interaction and reveal the functional implications from structural complementary perspective.

  12. Topology and weights in a protein domain interaction network – a novel way to predict protein interactions

    Directory of Open Access Journals (Sweden)

    Wuchty Stefan

    2006-05-01

    Full Text Available Abstract Background While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. Results We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins if we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Conclusion Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event on domain level. As an immediate application, we

  13. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Full Text Available Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
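
    The F-measures quoted in this record can be checked from the reported precision and recall, assuming the usual balanced F-measure (the harmonic mean of the two). The short sketch below does the arithmetic; the function name is chosen here for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract above.
sequence_only = f_measure(0.84, 0.26)
with_structure = f_measure(0.76, 0.38)
print(f"{sequence_only:.2f} {with_structure:.2f}")  # prints "0.40 0.51"
```

    Both values agree with the 40% and 51% given in the abstract, and they show how the harmonic mean rewards the more balanced predictor even though its precision is lower.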

  14. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

    Science.gov (United States)

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.

  15. Boosting compound-protein interaction prediction by deep learning.

    Science.gov (United States)

    Tian, Kai; Shao, Mingyu; Wang, Yang; Guan, Jihong; Zhou, Shuigeng

    2016-11-01

    The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, experimentally identifying compound-protein interactions (CPIs) is generally expensive and time-consuming, so computational approaches have been introduced. Among these, machine-learning based methods have achieved considerable success. However, due to the nonlinear and imbalanced nature of biological data, many machine learning approaches have their own limitations. Recently, deep learning techniques have shown advantages over many state-of-the-art machine learning methods in some applications. In this study, we aim at improving the performance of CPI prediction based on deep learning, and propose a method called DL-CPI (the abbreviation of Deep Learning for Compound-Protein Interactions prediction), which employs a deep neural network (DNN) to effectively learn the representations of compound-protein pairs. Extensive experiments show that DL-CPI can learn useful features of compound-protein pairs by a layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces.

    Science.gov (United States)

    Xia, Zheng; Wu, Ling-Yun; Zhou, Xiaobo; Wong, Stephen T C

    2010-09-13

    Predicting drug-protein interactions from heterogeneous biological data sources is a key step for in silico drug discovery. The difficulty of this prediction task lies in the rarity of known drug-protein interactions and myriad unknown interactions to be predicted. To meet this challenge, a manifold regularization semi-supervised learning method is presented to tackle this issue by using labeled and unlabeled information which often generates better results than using the labeled data alone. Furthermore, our semi-supervised learning method integrates known drug-protein interaction network information as well as chemical structure and genomic sequence data. Using the proposed method, we predicted certain drug-protein interactions on the enzyme, ion channel, GPCRs, and nuclear receptor data sets. Some of them are confirmed by the latest publicly available drug targets databases such as KEGG. We report encouraging results of using our method for drug-protein interaction network reconstruction which may shed light on the molecular interaction inference and new uses of marketed drugs.

  17. GRIP: A web-based system for constructing Gold Standard datasets for protein-protein interaction prediction

    Directory of Open Access Journals (Sweden)

    Zheng Huiru

    2009-01-01

    Full Text Available Abstract Background Information about protein interaction networks is fundamental to understanding protein function and cellular processes. Interaction patterns among proteins can suggest new drug targets and aid in the design of new therapeutic interventions. Efforts have been made to map interactions on a proteomic-wide scale using both experimental and computational techniques. Reference datasets that contain known interacting proteins (positive cases) and non-interacting proteins (negative cases) are essential to support computational prediction and validation of protein-protein interactions. Information on known interacting and non-interacting proteins is usually stored within databases. Extraction of these data can be both complex and time consuming. Although the automatic construction of reference datasets for classification would be a useful resource for researchers, no public resource currently exists to perform this task. Results GRIP (Gold Reference dataset constructor from Information on Protein complexes) is a web-based system that provides researchers with the functionality to create reference datasets for protein-protein interaction prediction in Saccharomyces cerevisiae. Both positive and negative cases for a reference dataset can be extracted, organised and downloaded by the user. GRIP also provides an upload facility whereby users can submit proteins to determine protein complex membership. A search facility is provided where a user can search for protein complex information in Saccharomyces cerevisiae. Conclusion GRIP is developed to retrieve information on protein complex, cellular localisation, and physical and genetic interactions in Saccharomyces cerevisiae. Manual construction of reference datasets can be a time consuming process requiring programming knowledge. GRIP simplifies and speeds up this process by allowing users to automatically construct reference datasets. GRIP is free to access at http://rosalind.infj.ulst.ac.uk/GRIP/.

  18. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    Science.gov (United States)

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on
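    The two figures quoted in this record can be cross-checked with a short calculation. The sketch below assumes that "all possible combinations" means unordered pairs of distinct chains (no self-pairs); the function name is illustrative and is not part of MEGADOCK-Web.

```python
from math import comb


def unordered_pairs(n_chains: int) -> int:
    """Number of distinct unordered pairs that can be formed from n_chains items."""
    return comb(n_chains, 2)


# 7528 protein chains, as stated in the abstract.
print(unordered_pairs(7528))  # 28331628
```

    Under that assumption the result equals the 28,331,628 predicted PPIs reported in the abstract.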

  19. Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

    Directory of Open Access Journals (Sweden)

    Wang Yong

    2011-10-01

    Full Text Available Abstract Background With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Predicting protein-protein interactions from the sequences is therefore in great demand, because identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for organisms that are barely characterized. Although a few methods have been proposed, the converse problem, namely whether the features used extract sufficient and unbiased information from protein sequences, is almost untouched. Results In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods sufficiently exploit the information in protein sequences and reduce artificial bias and computational cost. They significantly outperform the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation, and reach ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings hold important implications for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology. Conclusions By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions.

  20. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

    Science.gov (United States)

    Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

    2014-01-21

    Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the as yet unknown hot-spot residues for many proteins for which the structures of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  1. Prediction of localization and interactions of apoptotic proteins

    Directory of Open Access Journals (Sweden)

    Matula Pavel

    2009-07-01

    Full Text Available Abstract During apoptosis several mitochondrial proteins are released. Some of them participate in caspase-independent nuclear DNA degradation, especially apoptosis-inducing factor (AIF) and endonuclease G (endoG). Another interesting protein, which was expected to act similarly to AIF owing to its high sequence homology with AIF, is AIF-homologous mitochondrion-associated inducer of death (AMID). We studied the structure, cellular localization, and interactions of several proteins in silico and also in cells using fluorescent microscopy. We found the AMID protein to be cytoplasmic, most probably incorporated into the cytoplasmic side of the lipid membranes. Bioinformatic predictions were conducted to analyze the interactions of the studied proteins with each other and with other possible partners. We conducted molecular modeling of proteins with unknown 3D structures. These models were then refined by the MolProbity server and employed in molecular docking simulations of interactions. Our results show data acquired using a combination of modern in silico methods and image analysis to understand the localization, interactions and functions of the proteins AMID, AIF, endonuclease G, and other apoptosis-related proteins.

  2. Prediction of thermodynamic instabilities of protein solutions from simple protein–protein interactions

    International Nuclear Information System (INIS)

    D’Agostino, Tommaso; Solana, José Ramón; Emanuele, Antonio

    2013-01-01

    Highlights: ► We propose a model of effective protein–protein interaction embedding solvent effects. ► A previous square-well model is enhanced by giving the interaction a free energy character. ► The temperature dependence of the interaction is due to entropic effects of the solvent. ► The validity of the original SW model is extended to entropy driven phase transitions. ► We get good fits for lysozyme and haemoglobin spinodal data taken from literature. - Abstract: Statistical thermodynamics of protein solutions is often studied in terms of simple, microscopic models of particles interacting via pairwise potentials. Such modelling can reproduce the short range structure of protein solutions at equilibrium and predict thermodynamic instabilities of these systems. We introduce a square well model of effective protein–protein interaction that embeds the solvent’s action. We modify an existing model [45] by considering a well depth having an explicit dependence on temperature, i.e. an explicit free energy character, thus encompassing the statistically relevant configurations of solvent molecules around proteins. We choose protein solutions exhibiting demixing upon temperature decrease (lysozyme, enthalpy driven) and upon temperature increase (haemoglobin, entropy driven). We obtain satisfactory fits of spinodal curves for both proteins without adding any mean field term, thus extending the validity of the original model. Our results underline the solvent role in modulating or stretching the interaction potential

  3. Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

    Science.gov (United States)

    Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

    2017-01-01

    Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with the co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely also improve interface prediction for complexes with large and good-quality MSAs.

  4. Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

    DEFF Research Database (Denmark)

    Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

    2012-01-01

    Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting...... phylogenetic profiles often require very different RT sets to support high prediction accuracy....

  5. Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.

    Science.gov (United States)

    Jelínek, Jan; Škoda, Petr; Hoksza, David

    2017-12-06

    Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
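    The labelling rule described in this record (a residue is called interface or non-interface according to how often its encoded structural neighborhood appears with each label in the knowledge base) can be sketched as a frequency vote. This is a minimal illustration under stated assumptions, not the INSPiRE implementation: the bit-string encoding, the function names and the 0.5 threshold are all hypothetical.

```python
from collections import Counter


def build_knowledge_base(examples):
    """Count interface / non-interface occurrences per neighborhood encoding.

    examples: iterable of (bit_string, is_interface) pairs (hypothetical format).
    """
    kb = {}
    for bits, is_interface in examples:
        kb.setdefault(bits, Counter())[bool(is_interface)] += 1
    return kb


def predict_interface(kb, bits, threshold=0.5):
    """Return True/False by majority frequency, or None for an unseen encoding."""
    counts = kb.get(bits)
    if counts is None:
        return None  # the knowledge base has no evidence for this neighborhood
    fraction = counts[True] / (counts[True] + counts[False])
    return fraction >= threshold


# Toy knowledge base: "1010" is mostly seen at interfaces, "0001" never is.
kb = build_knowledge_base([
    ("1010", True), ("1010", True), ("1010", False),
    ("0001", False), ("0001", False),
])
print(predict_interface(kb, "1010"), predict_interface(kb, "0001"))
```

    Returning None for unseen encodings keeps "no evidence" separate from "predicted non-interface"; how the actual method treats such cases is not stated in the abstract.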

  6. DomPep--a general method for predicting modular domain-mediated protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Lei Li

    Full Text Available Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domains. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter Ligand-Binding Similarity which, in turn, is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, two parameters that are used in constructing prediction models. Moreover, DomPep can be used to predict PPIs for both domains with experimental binding data and those without. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to existing methods. To evaluate DomPep as a discovery tool, we deployed DomPep to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.

  7. Specificity and affinity quantification of protein-protein interactions.

    Science.gov (United States)

    Yan, Zhiqiang; Guo, Liyong; Hu, Liang; Wang, Jin

    2013-05-01

    Most biological processes are mediated by protein-protein interactions. Determining protein-protein structures and gaining insight into their interactions are vital to understanding the mechanisms of protein function. Currently, compared with isolated protein structures, only a small fraction of protein-protein structures have been experimentally solved. Therefore, computational docking methods play an increasing role in predicting the structures and interactions of protein-protein complexes. The scoring function of protein-protein interactions is the key determinant of the accuracy of computational docking. Previous scoring functions were mostly developed by optimizing the binding affinity, which determines the stability of the protein-protein complex, but they often lack consideration of specificity, which determines the discrimination of the native protein-protein complex against competitive ones. We developed a scoring function (named SPA-PP, specificity and affinity of the protein-protein interactions) by incorporating both specificity and affinity into the optimization strategy. The testing results and comparisons with other scoring functions show that SPA-PP performs remarkably well on predictions of both binding pose and binding affinity. Thus, SPA-PP is a promising quantification of protein-protein interactions, which can be implemented in protein docking tools and applied to the prediction of protein-protein structure and affinity. The algorithm is implemented in C, and the code can be downloaded from http://dl.dropbox.com/u/1865642/Optimization.cpp.

  8. Comparing human-Salmonella with plant-Salmonella protein-protein interaction predictions

    Directory of Open Access Journals (Sweden)

    Sylvia Schleker

    2015-01-01

    Full Text Available Salmonellosis is the most frequent food-borne disease world-wide and can be transmitted to humans by a variety of routes, especially via animal and plant products. Salmonella bacteria are believed to use not only animal and human but also plant hosts despite their evolutionary distance. This raises the question of whether Salmonella employs similar mechanisms in infection of these diverse hosts. Given that most of our understanding comes from its interaction with human hosts, we investigate here to what degree knowledge of Salmonella-human interactions can be transferred to the Salmonella-plant system. Reviewed are recent publications on analysis and prediction of Salmonella-host interactomes. Putative protein-protein interactions (PPIs) between Salmonella and its human and Arabidopsis hosts were retrieved utilizing purely interolog-based approaches, in which predictions were inferred based on available sequence and domain information of known PPIs, and machine learning approaches that integrate a larger set of useful information from different sources. Transfer learning is an especially suitable machine learning technique to predict plant host targets from the knowledge of human host targets. A comparison of the prediction results with transcriptomic data shows a clear overlap between the host proteins predicted to be targeted by PPIs and their gene ontology enrichment in both host species and regulation of gene expression. In particular, the cellular processes Salmonella interferes with in plants and humans are catabolic processes. The details of how these processes are targeted, however, are quite different between the two organisms, as expected based on their evolutionary and habitat differences. Possible implications of this observation on evolution of host-pathogen communication are discussed.

  9. BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference.

    Science.gov (United States)

    Garcia-Garcia, Javier; Schleker, Sylvia; Klein-Seetharaman, Judith; Oliva, Baldo

    2012-07-01

    Protein-protein interactions (PPIs) play a crucial role in biology, and high-throughput experiments have greatly increased the coverage of known interactions. Still, identification of complete inter- and intraspecies interactomes is far from being complete. Experimental data can be complemented by the prediction of PPIs within an organism or between two organisms based on the known interactions of the orthologous genes of other organisms (interologs). Here, we present the BIANA (Biologic Interactions and Network Analysis) Interolog Prediction Server (BIPS), which offers a web-based interface to facilitate PPI predictions based on interolog information. BIPS benefits from the capabilities of the framework BIANA to integrate the several PPI-related databases. Additional metadata can be used to improve the reliability of the predicted interactions. Sensitivity and specificity of the server have been calculated using known PPIs from different interactomes using a leave-one-out approach. The specificity is between 72 and 98%, whereas sensitivity varies between 1 and 59%, depending on the sequence identity cut-off used to calculate similarities between sequences. BIPS is freely accessible at http://sbi.imim.es/BIPS.php.
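    The interolog idea described in this record (an interaction known in one organism is transferred to the orthologs of both partners in another organism) can be sketched in a few lines. This is a simplified illustration, not the BIPS implementation: the data structures and names are hypothetical, and the sequence-identity cut-off mentioned in the abstract is assumed to have been applied already when building the ortholog table. The two metric helpers use the standard definitions of sensitivity and specificity.

```python
def predict_interologs(known_ppis, orthologs):
    """Transfer known interactions to a target species through orthology.

    known_ppis: iterable of (a, b) pairs known to interact in a source species.
    orthologs:  dict mapping a source protein to a set of target-species orthologs.
    """
    predicted = set()
    for a, b in known_ppis:
        for x in orthologs.get(a, ()):
            for y in orthologs.get(b, ()):
                predicted.add(frozenset((x, y)))  # unordered protein pair
    return predicted


def sensitivity(tp, fn):
    """Fraction of true interactions that were predicted."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-interactions that were correctly not predicted."""
    return tn / (tn + fp)


# Toy example: yC has no ortholog, so the yB-yC interaction cannot be transferred.
known = [("yA", "yB"), ("yB", "yC")]
orthologs = {"yA": {"hA"}, "yB": {"hB1", "hB2"}}
for pair in sorted(sorted(p) for p in predict_interologs(known, orthologs)):
    print(pair)
```

    A looser identity cut-off would enlarge the ortholog table and so transfer more interactions, which is one plausible reading of why the reported sensitivity varies so widely with the cut-off.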

  10. Efficient prediction of human protein-protein interactions at a global scale.

    Science.gov (United States)

    Schoenrock, Andrew; Samanfar, Bahram; Pitre, Sylvain; Hooshyar, Mohsen; Jin, Ke; Phillips, Charles A; Wang, Hui; Phanse, Sadhna; Omidi, Katayoun; Gui, Yuan; Alamgir, Md; Wong, Alex; Barrenäs, Fredrik; Babu, Mohan; Benson, Mikael; Langston, Michael A; Green, James R; Dehne, Frank; Golshani, Ashkan

    2014-12-10

    Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments. The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.
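    The precision and recall figures in this record can be turned into rough counts. The sketch below uses only the standard definitions (precision = TP / predicted, recall = TP / all true positives) and assumes, which the abstract does not state, that the benchmark precision and recall carry over unchanged to the full set of 172,132 predictions. Function names are illustrative.

```python
def expected_true_positives(n_predicted, precision):
    """Expected number of correct predictions at the given precision."""
    return n_predicted * precision


def implied_total_positives(n_predicted, precision, recall):
    """Total number of true interactions implied by the precision/recall pair."""
    return expected_true_positives(n_predicted, precision) / recall


n_predicted = 172_132  # putative PPIs reported in the abstract
print(round(expected_true_positives(n_predicted, 0.821)))
print(round(implied_total_positives(n_predicted, 0.821, 0.23)))
```

    These are back-of-envelope estimates under the stated assumption, not figures reported by the authors.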

  11. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanaswamy

    2010-06-01

    Full Text Available Abstract Background Dengue virus, along with other members of the Flaviviridae family, has reemerged as a deadly human pathogen. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, owing to the low-resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results In the present analysis, starting from Cα positions of low-resolution cryo-electron microscopic structures, the residue-level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of the virus in different phases of its life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of the maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated the presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of the Flaviviridae family enabled dissecting common mechanisms used for infection by these viruses. Conclusions Thus, using a computational approach, the present analysis has provided better insights into the preexisting low-resolution structures of virus assemblies, the findings of which can be used in designing effective antivirals against these deadly human pathogens.

  12. Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords.

    Science.gov (United States)

    Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao

    2015-01-01

    For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. A specific keyword that is directly related to interaction, such as "bind" or "interact", plays an important role in training classifiers. We call such a keyword, which affects the capability of the classifier, a dominant keyword. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is tentatively assumed to be a dominant keyword. Then the classifiers are separately trained from the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers. The assumption is updated according to the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction.

  13. Predicting the binding patterns of hub proteins: a study using yeast protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Carson M Andorf

    Full Text Available Protein-protein interactions are critical to elucidating the role played by individual proteins in important biological pathways. Of particular interest are hub proteins that can interact with large numbers of partners and often play essential roles in cellular control. Depending on the number of binding sites, protein hubs can be classified at a structural level as singlish-interface hubs (SIH) with one or two binding sites, or multiple-interface hubs (MIH) with three or more binding sites. In terms of kinetics, hub proteins can be classified as date hubs (i.e., interact with different partners at different times or locations) or party hubs (i.e., simultaneously interact with multiple partners). Our approach works in 3 phases: Phase I classifies if a protein is likely to bind with another protein. Phase II determines if a protein-binding (PB) protein is a hub. Phase III classifies PB proteins as singlish-interface versus multiple-interface hubs and date versus party hubs. At each stage, we use sequence-based predictors trained using several standard machine learning techniques. Our method is able to predict whether a protein is a protein-binding protein with an accuracy of 94% and a correlation coefficient of 0.87; identify hubs from non-hubs with 100% accuracy for 30% of the data; distinguish date hubs/party hubs with 69% accuracy and area under ROC curve of 0.68; and SIH/MIH with 89% accuracy and area under ROC curve of 0.84. Because our method is based on sequence information alone, it can be used even in settings where reliable protein-protein interaction data or structures of protein-protein complexes are unavailable to obtain useful insights into the functional and evolutionary characteristics of proteins and their interactions. We provide a web server for our three-phase approach: http://hybsvm.gdcb.iastate.edu.

  14. Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

    Science.gov (United States)

    Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

    2017-05-25

    Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.

  15. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species

    OpenAIRE

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of p...

  16. Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    Science.gov (United States)

    Zhou, Hufeng; Rezaei, Javad; Hugo, Willy; Gao, Shangzhi; Jin, Jingjing; Fan, Mengyuan; Yong, Chern-Han; Wozniak, Michal; Wong, Limsoon

    2013-01-01

    H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are very important information to illuminate the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some

  17. Prediction of Protein–Protein Interactions by Evidence Combining Methods

    Directory of Open Access Journals (Sweden)

    Ji-Wei Chang

    2016-11-01

    Full Text Available Most cellular functions involve proteins’ features based on their physical interactions with other partner proteins. Sketching a map of protein–protein interactions (PPIs) is therefore an important initial step towards understanding the basics of cell functions. Several experimental techniques operating in vivo or in vitro have made significant contributions to screening a large number of protein interaction partners, especially high-throughput experimental methods. However, computational approaches for PPI prediction, supported by rapid accumulation of data generated from experimental techniques, 3D structure definitions, and genome sequencing, have boosted the map sketching of PPIs. In this review, we shed light on in silico PPI prediction methods that integrate evidence from multiple sources, including evolutionary relationship, function annotation, sequence/structure features, network topology and text mining. These methods are developed for integration of multi-dimensional evidence, for designing the strategies to predict novel interactions, and for making the results consistent with the increase of prediction coverage and accuracy.

  18. Enhancing the prediction of protein pairings between interacting families using orthology information

    Directory of Open Access Journals (Sweden)

    Pazos Florencio

    2008-01-01

    Full Text Available Abstract Background It has repeatedly been shown that interacting protein families tend to have similar phylogenetic trees. These similarities can be used to predict the mapping between two families of interacting proteins (i.e. which proteins from one family interact with which members of the other). The correct mapping will be that which maximizes the similarity between the trees. The two families may eventually comprise orthologs and paralogs, if members of the two families are present in more than one organism. This fact can be exploited to restrict the possible mappings, simply by impeding links between proteins of different organisms. We present here an algorithm to predict the mapping between families of interacting proteins which is able to incorporate information regarding orthologues, or any other assignment of proteins to "classes" that may restrict possible mappings. Results For the first time in methods for predicting mappings, we have tested this new approach on a large number of interacting protein domains in order to statistically assess its performance. The method accurately predicts around 80% in the most favourable cases. We also analysed in detail the results of the method for a well defined case of interacting families, the sensor and kinase components of the Ntr-type two-component system, for which up to 98% of the pairings predicted by the method were correct. Conclusion Based on the well established relationship between tree similarity and interactions we developed a method for predicting the mapping between two interacting families using genomic information alone. The program is available through a web interface.
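
    The idea of restricting mappings to proteins of the same organism and keeping the mapping that maximizes tree similarity can be illustrated with a small sketch. This is not the published algorithm: it assumes that tree similarity is approximated by the Pearson correlation of pairwise distance matrices, that every organism contributes the same number of proteins to both families, and that the families are small enough for exhaustive search. All function names and the toy distances are hypothetical.

```python
from itertools import permutations, product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def upper_triangle(dist, order):
    """Pairwise distances read in the given protein order."""
    return [dist[order[i]][order[j]]
            for i in range(len(order)) for j in range(i + 1, len(order))]


def best_mapping(dist_a, dist_b, organism_a, organism_b):
    """Exhaustively search mappings that only pair proteins of the same organism."""
    organisms = sorted(set(organism_a))
    a_by_org = {o: [i for i, x in enumerate(organism_a) if x == o] for o in organisms}
    b_by_org = {o: [i for i, x in enumerate(organism_b) if x == o] for o in organisms}
    order_a = [i for o in organisms for i in a_by_org[o]]
    ref = upper_triangle(dist_a, order_a)
    best_score, best_pairs = float("-inf"), None
    # One permutation per organism; the product enumerates every allowed mapping.
    for combo in product(*(permutations(b_by_org[o]) for o in organisms)):
        order_b = [j for perm in combo for j in perm]
        score = pearson(ref, upper_triangle(dist_b, order_b))
        if score > best_score:
            best_score, best_pairs = score, list(zip(order_a, order_b))
    return best_score, best_pairs


# Toy data: family B is family A with the two "eco" proteins swapped.
organism_a = ["eco", "eco", "bsu", "bsu"]
organism_b = ["eco", "eco", "bsu", "bsu"]
dist_a = [[0.0, 0.1, 0.5, 0.9],
          [0.1, 0.0, 0.6, 0.7],
          [0.5, 0.6, 0.0, 0.2],
          [0.9, 0.7, 0.2, 0.0]]
dist_b = [[0.0, 0.1, 0.6, 0.7],
          [0.1, 0.0, 0.5, 0.9],
          [0.6, 0.5, 0.0, 0.2],
          [0.7, 0.9, 0.2, 0.0]]
score, pairs = best_mapping(dist_a, dist_b, organism_a, organism_b)
print(score, sorted(pairs))
```

    The organism restriction shrinks the search from 4! = 24 candidate mappings to 2! x 2! = 4 in this toy case; for realistic family sizes a heuristic search would be needed instead of enumeration.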

  19. Prediction of Protein-Protein Interaction By Metasample-Based Sparse Representation

    Directory of Open Access Journals (Sweden)

    Xiuquan Du

    2015-01-01

    Full Text Available Protein-protein interactions (PPIs) play key roles in many cellular processes such as transcription regulation, cell metabolism, and endocrine function. Understanding these interactions greatly advances knowledge of the pathogenesis and treatment of various diseases. A large amount of data has been generated by experimental techniques; however, most of these data are usually incomplete or noisy, and the current biological experimental techniques are always very time-consuming and expensive. In this paper, we proposed a novel method (metasample-based sparse representation classification, MSRC) for PPIs prediction. A group of metasamples is extracted from the original training samples, and the l1-regularized least squares method is then used to express a new testing sample as a linear combination of these metasamples. PPIs prediction is achieved by using a discrimination function defined on the representation coefficients. The MSRC is applied to a PPIs dataset; it achieves 84.9% sensitivity and 94.55% specificity, which is slightly lower than support vector machine (SVM) and much higher than naive Bayes (NB), neural networks (NN), and k-nearest neighbor (KNN). The result shows that the MSRC is efficient for PPIs prediction.
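
    A minimal sketch of sparse representation classification may help make the procedure concrete. It assumes the metasamples are already available (their extraction is not shown), solves the l1-regularized least squares problem with plain iterative soft-thresholding (ISTA) rather than whatever solver the authors used, and takes the class-wise reconstruction residual as the discrimination function. The function names, the value of lam and the toy vectors are all hypothetical.

```python
from math import sqrt


def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def l1_least_squares(atoms, y, lam=0.01, iters=500):
    """Minimise 0.5*||y - sum_k x_k*atoms[k]||^2 + lam*||x||_1 with ISTA."""
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient.
    step = 1.0 / sum(a * a for atom in atoms for a in atom)
    x = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(x[k] * atoms[k][i] for k in range(len(atoms)))
                 for i in range(len(y))]
        resid = [r - t for r, t in zip(recon, y)]
        grad = [sum(a * r for a, r in zip(atom, resid)) for atom in atoms]
        x = [soft_threshold(xk - step * g, step * lam) for xk, g in zip(x, grad)]
    return x


def classify(atoms, labels, y, lam=0.01):
    """Assign y to the class whose metasamples reconstruct it with least residual."""
    x = l1_least_squares(atoms, y, lam)
    residuals = {}
    for label in set(labels):
        recon = [sum(x[k] * atoms[k][i] for k in range(len(atoms)) if labels[k] == label)
                 for i in range(len(y))]
        residuals[label] = sqrt(sum((t - r) ** 2 for t, r in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals


# Toy metasamples (one feature vector per atom) for the two classes.
atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["interacting", "interacting", "non-interacting", "non-interacting"]
label, residuals = classify(atoms, labels, [1.0, 0.05, 0.0])
print(label, residuals)
```

    The test vector lies almost entirely along the "interacting" metasamples, so the coefficients of that class carry nearly all of the reconstruction and its residual is the smaller one.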

  20. Interactive protein manipulation

    International Nuclear Information System (INIS)

    2003-01-01

    We describe an interactive visualization and modeling program for the creation of protein structures "from scratch". The input to our program is an amino acid sequence (decoded from a gene) and a sequence of predicted secondary structure types for each amino acid (provided by external structure prediction programs). Our program can be used in the set-up phase of a protein structure prediction process; the structures created with it serve as input for a subsequent global internal energy minimization, or another method of protein structure prediction. Our program supports basic visualization methods for protein structures, interactive manipulation based on inverse kinematics, and visualization guides to aid a user in creating "good" initial structures.

  1. Interactive protein manipulation

    Energy Technology Data Exchange (ETDEWEB)

    SNCrivelli@lbl.gov

    2003-07-01

    We describe an interactive visualization and modeling program for the creation of protein structures "from scratch". The input to our program is an amino acid sequence (decoded from a gene) and a sequence of predicted secondary structure types for each amino acid (provided by external structure prediction programs). Our program can be used in the set-up phase of a protein structure prediction process; the structures created with it serve as input for a subsequent global internal energy minimization, or another method of protein structure prediction. Our program supports basic visualization methods for protein structures, interactive manipulation based on inverse kinematics, and visualization guides to aid a user in creating "good" initial structures.

  2. Multi-level machine learning prediction of protein–protein interactions in Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Julian Zubek

    2015-07-01

    Full Text Available Accurate identification of protein–protein interactions (PPI) is the key step in understanding proteins’ biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences; however, only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein–protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein–protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform a 30-fold macro-scale (i.e., protein-level) cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below the 0.6 threshold (recall, precision, AUC). Our results demonstrate that the multi-scale sequence feature aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).
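
    The all-against-all segment matrix described above can be pictured with the following sketch. It only illustrates the data structure: the segment scorer is a deliberately crude, hypothetical stand-in (fraction of opposite-charge residue pairs) for a trained level-I classifier, and the window width, sequences and summary features are assumptions, not values taken from the paper or its code.

```python
def segments(sequence, width):
    """All overlapping windows of the given width."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def toy_segment_score(seg_a, seg_b):
    """Hypothetical stand-in for a trained level-I classifier (NOT the paper's model).

    Returns the fraction of aligned positions that carry opposite charges.
    """
    positive, negative = set("KR"), set("DE")
    hits = sum((a in positive and b in negative) or (a in negative and b in positive)
               for a, b in zip(seg_a, seg_b))
    return hits / len(seg_a)


def segment_matrix(seq_a, seq_b, width=3, scorer=toy_segment_score):
    """Rows index segments of seq_a, columns index segments of seq_b."""
    return [[scorer(sa, sb) for sb in segments(seq_b, width)]
            for sa in segments(seq_a, width)]


def matrix_summary(matrix):
    """Simple aggregate features that a level-II model could consume."""
    flat = [v for row in matrix for v in row]
    return {"max": max(flat), "mean": sum(flat) / len(flat)}


matrix = segment_matrix("MKRKLAG", "GDEDSA")
summary = matrix_summary(matrix)
print(len(matrix), len(matrix[0]), summary)
```

    In a real pipeline the scorer would be the level-I model's predicted probability for each segment pair, and the level-II model would learn from the whole matrix rather than from two summary statistics.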

  3. Feature-Based and String-Based Models for Predicting RNA-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Donald Adjeroh

    2018-03-01

    Full Text Available In this work, we study two approaches for the problem of RNA-Protein Interaction (RPI). In the first approach, we use a feature-based technique by combining extracted features from both sequences and secondary structures. The feature-based approach enhanced the prediction accuracy as it included much more available information about the RNA-protein pairs. In the second approach, we apply search algorithms and data structures to extract effective string patterns for prediction of RPI, using both sequence information (protein and RNA sequences) and structure information (protein and RNA secondary structures). This led to different string-based models for predicting interacting RNA-protein pairs. We show results that demonstrate the effectiveness of the proposed approaches, including comparative results against leading state-of-the-art methods.

  4. Different protein-protein interface patterns predicted by different machine learning methods.

    Science.gov (United States)

    Wang, Wei; Yang, Yongxiao; Yin, Jianxin; Gong, Xinqi

    2017-11-22

    Different types of protein-protein interactions make different protein-protein interface patterns. Different machine learning methods are suitable to deal with different types of data. Is it then also the case that different interface patterns are better predicted by different machine learning methods? Here, four different machine learning methods were employed to predict protein-protein interface residue pairs on different interface patterns. The performances of the methods for different types of proteins are different, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We made use of ANOVA and variable selection to support our result. Our proposed methods, which take advantage of the different single methods, also gave a good prediction result compared to single methods. In addition to the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.

  5. Exploration of the omics evidence landscape: adding qualitative labels to predicted protein-protein interactions.

    NARCIS (Netherlands)

    Noort, V. van; Snel, B.; Huynen, M.A.

    2007-01-01

    BACKGROUND: In the post-genomic era various functional genomics, proteomics and computational techniques have been developed to elucidate the protein interaction network. While some of these techniques are specific for a certain type of interaction, most predict a mixture of interactions.

  6. Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    Science.gov (United States)

    Zhou, Hufeng; Gao, Shangzhi; Nguyen, Nam Ninh; Fan, Mengyuan; Jin, Jingjing; Liu, Bing; Zhao, Liang; Xiong, Geng; Tan, Min; Li, Shijun; Wong, Limsoon

    2014-04-08

    H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs. We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequence, tend to have more domains, tend to be more hydrophilic, etc.
And the protein domains from both

  7. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

    Science.gov (United States)

    Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2013-01-27

    A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
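
    The co-evolution score underlying this family of methods, and the effect of restricting it to predicted accessible positions, can be sketched as follows. This is a simplification under stated assumptions: p-distances computed directly from aligned sequences stand in for tree-derived distances, the alignments are toy data, and the "accessible" column indices are invented for illustration rather than produced by an accessibility predictor.

```python
from math import sqrt


def p_distance(seq_a, seq_b, columns):
    """Fraction of differing residues over the selected alignment columns."""
    return sum(seq_a[c] != seq_b[c] for c in columns) / len(columns)


def distance_vector(alignment, species, columns):
    """Pairwise distances between species, in a fixed species order."""
    return [p_distance(alignment[species[i]], alignment[species[j]], columns)
            for i in range(len(species)) for j in range(i + 1, len(species))]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def mirrortree_score(aln_a, aln_b, species, cols_a, cols_b):
    """Correlation of the two families' inter-species distance vectors."""
    return pearson(distance_vector(aln_a, species, cols_a),
                   distance_vector(aln_b, species, cols_b))


species = ["sp1", "sp2", "sp3", "sp4"]
aln_a = {"sp1": "ACDEFG", "sp2": "ACDEFA", "sp3": "ACKEWA", "sp4": "TCKQWA"}
aln_b = {"sp1": "MKLV", "sp2": "MKLI", "sp3": "MRLI", "sp4": "QRAI"}

# Score using every alignment column of both families.
full = mirrortree_score(aln_a, aln_b, species, range(6), range(4))
# Score using only columns assumed (hypothetically) to be solvent accessible in A.
accessible = mirrortree_score(aln_a, aln_b, species, [0, 2, 4, 5], range(4))
print(round(full, 3), round(accessible, 3))
```

    Because the column subsets are ordinary index lists, the same function covers "all positions", "accessible positions" or any other positional filter.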

  8. Exploration of the omics evidence landscape: adding qualitative labels to predicted protein-protein interactions

    NARCIS (Netherlands)

    Noort, V. van; Snel, B.; Huynen, M.A.

    2007-01-01

    ABSTRACT: BACKGROUND: In the post-genomic era various functional genomics, proteomics and computational techniques have been developed to elucidate the protein interaction network. While some of these techniques are specific for a certain type of interaction, most predict a mixture of interactions.

  9. Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

    KAUST Repository

    Cannistraci, Carlo

    2013-06-21

    Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules.
The

  10. Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

    Science.gov (United States)

    Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal

    2015-01-01

    Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can further be improved with the use of a custom-designed fuzzy membership function, for the partner-specific PPI interface prediction problem. We evaluated the performances of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes of Homo sapiens, Escherichia coli and Saccharomyces cerevisiae and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performances (area under ROC curve) on the test samples in a 10-fold cross validation experiment are measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms respectively. Performances on independent test sets are obtained as 72.09, 73.24 and 82.74 percent respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.

  11. Adaptive compressive learning for prediction of protein-protein interactions from primary sequence.

    Science.gov (United States)

    Zhang, Ya-Nan; Pan, Xiao-Yong; Huang, Yan; Shen, Hong-Bin

    2011-08-21

    Protein-protein interactions (PPIs) play an important role in biological processes. Although much effort has been devoted to the identification of novel PPIs by integrating experimental biological knowledge, there are still many difficulties owing to the lack of sufficient protein structural and functional information. It is highly desirable to develop methods based only on amino acid sequences for predicting PPIs. However, sequence-based predictors often struggle with high dimensionality, which causes over-fitting and high computational complexity, as well as with the redundancy of sequential feature vectors. In this paper, a novel computational approach based on compressed sensing theory is proposed to predict yeast Saccharomyces cerevisiae PPIs from primary sequence and has achieved promising results. The key advantage of the proposed compressed sensing algorithm is that it can compress the original high-dimensional protein sequential feature vector into a much lower but more condensed space taking the sparsity property of the original signal into account. What makes compressed sensing much more attractive in protein sequence analysis is that its compressed signal can be reconstructed from far fewer measurements than what is usually considered necessary in traditional Nyquist sampling theory. Experimental results demonstrate that the proposed compressed sensing method is powerful for analyzing noisy biological data and reducing redundancy in feature vectors. The proposed method represents a new strategy of dealing with high-dimensional protein discrete models and has great potential to be extended to deal with many other complicated biological systems. Copyright © 2011 Elsevier Ltd. All rights reserved.

  12. Prediction of residue-residue contact matrix for protein-protein interaction with Fisher score features and deep learning.

    Science.gov (United States)

    Du, Tianchuan; Liao, Li; Wu, Cathy H; Sun, Bilin

    2016-11-01

    Protein-protein interactions play essential roles in many biological processes. Acquiring knowledge of the residue-residue contact information of two interacting proteins is not only helpful in annotating functions for proteins, but also critical for structure-based drug design. The prediction of the protein residue-residue contact matrix of the interfacial regions is challenging. In this work, we introduced deep learning techniques (specifically, stacked autoencoders) to build deep neural network models to tackle the residue-residue contact prediction problem. In tandem with interaction profile Hidden Markov Models, which were used first to extract Fisher score features from protein sequences, stacked autoencoders were deployed to extract and learn hidden abstract features. The deep learning model showed significant improvement over the traditional machine learning model, Support Vector Machines (SVM), with the overall accuracy increased by about 15 percentage points, from 65.40% to 80.82%. We showed that the stacked autoencoders could extract novel features, which can be utilized by deep neural networks and other classifiers to enhance learning, out of the Fisher score features. It is further shown that deep neural networks have significant advantages over SVM in making use of the newly extracted features. Copyright © 2016. Published by Elsevier Inc.

  13. Understanding Protein-Protein Interactions Using Local Structural Features

    DEFF Research Database (Denmark)

    Planas-Iglesias, Joan; Bonet, Jaume; García-García, Javier

    2013-01-01

    Protein-protein interactions (PPIs) play a relevant role among the different functions of a cell. Identifying the PPI network of a given organism (interactome) is useful to shed light on the key molecular mechanisms within a biological system. In this work, we show the role of structural features...... interacting and non-interacting protein pairs to classify the structural features that sustain the binding (or non-binding) behavior. Our study indicates that not only the interacting region but also the rest of the protein surface are important for the interaction fate. The interpretation...... to score the likelihood of the interaction between two proteins and to develop a method for the prediction of PPIs. We have tested our method on several sets with unbalanced ratios of interactions and non-interactions to simulate real conditions, obtaining accuracies higher than 25% in the most unfavorable...

  14. Exploiting conformational ensembles in modeling protein-protein interactions on the proteome scale

    Science.gov (United States)

    Kuzu, Guray; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2013-01-01

    Cellular functions are performed through protein-protein interactions; therefore, identification of these interactions is crucial for understanding biological processes. Recent studies suggest that knowledge-based approaches are more useful than ‘blind’ docking for modeling at large scales. However, a caveat of knowledge-based approaches is that they treat molecules as rigid structures. The Protein Data Bank (PDB) offers a wealth of conformations. Here, we exploited ensemble of the conformations in predictions by a knowledge-based method, PRISM. We tested ‘difficult’ cases in a docking-benchmark dataset, where the unbound and bound protein forms are structurally different. Considering alternative conformations for each protein, the percentage of successfully predicted interactions increased from ~26% to 66%, and 57% of the interactions were successfully predicted in an ‘unbiased’ scenario, in which data related to the bound forms were not utilized. If the appropriate conformation, or relevant template interface, is unavailable in the PDB, PRISM could not predict the interaction successfully. The pace of the growth of the PDB promises a rapid increase of ensemble conformations emphasizing the merit of such knowledge-based ensemble strategies for higher success rates in protein-protein interaction predictions on an interactome-scale. We constructed the structural network of ERK interacting proteins as a case study. PMID:23590674

  15. Scoring protein relationships in functional interaction networks predicted from sequence data.

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    Full Text Available UNLABELLED: The abundance of diverse biological data from various sources constitutes a rich source of knowledge, which has the power to advance our understanding of organisms. This requires computational methods in order to integrate and exploit these data effectively and elucidate local and genome wide functional connections between protein pairs, thus enabling functional inferences for uncharacterized proteins. These biological data are primarily in the form of sequences, which determine functions, although functional properties of a protein can often be predicted from just the domains it contains. Thus, protein sequences and domains can be used to predict protein pair-wise functional relationships, and thus contribute to the function prediction process of uncharacterized proteins in order to ensure that knowledge is gained from sequencing efforts. In this work, we introduce information-theoretic based approaches to score protein-protein functional interaction pairs predicted from protein sequence similarity and conserved protein signature matches. The proposed schemes are effective for data-driven scoring of connections between protein pairs. We applied these schemes to the Mycobacterium tuberculosis proteome to produce a homology-based functional network of the organism with a high confidence and coverage. We use the network for predicting functions of uncharacterised proteins. AVAILABILITY: Protein pair-wise functional relationship scores for Mycobacterium tuberculosis strain CDC1551 sequence data and python scripts to compute these scores are available at http://web.cbio.uct.ac.za/~gmazandu/scoringschemes.

  16. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering.

    Science.gov (United States)

    Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina

    2015-03-01

    Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they face certain disadvantages as they require parameter tuning, some of them cannot handle weighted PPI data and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. Using publicly available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. Specifically, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein to exactly one cluster and the difficulties they face concerning the parameter tuning. This fact was experimentally validated and moreover, new
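
    For readers unfamiliar with the clustering step that EE-MC builds on, the following is a minimal sketch of plain Markov clustering (MCL) on a weighted graph. It is not the enhanced or evolutionary variant described in the abstract: the inflation value, the unit self-loops, the fixed iteration count and the toy graph are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale every column so that it sums to one."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalise columns."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(weights, inflation=2.0, iters=50, eps=1e-6):
    """Cluster a weighted undirected graph given as a symmetric matrix."""
    n = len(weights)
    # Unit self-loops are a common MCL convention (an assumption here).
    m = normalize_columns([[weights[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iters):
        m = inflate(expand(m), inflation)
    # Join each node to the attractor rows that still hold its flow.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > eps:
                parent[find(i)] = find(j)
    clusters = {}
    for node in range(n):
        clusters.setdefault(find(node), []).append(node)
    return sorted(clusters.values())


# Two tightly connected triangles joined by one weak edge (2-3).
weights = [[0.0] * 6 for _ in range(6)]
for a, b, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    weights[a][b] = weights[b][a] = w
clusters = markov_cluster(weights)
print(clusters)
```

    Note that this sketch assigns every node to exactly one cluster and depends on a hand-picked inflation parameter, which are two of the limitations the abstract says EE-MC is designed to overcome.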

  17. Structural similarity-based predictions of protein interactions between HIV-1 and Homo sapiens

    Directory of Open Access Journals (Sweden)

    Gomez Shawn M

    2010-04-01

    Full Text Available Abstract Background In the course of infection, viruses such as HIV-1 must enter a cell, travel to sites where they can hijack host machinery to transcribe their genes and translate their proteins, assemble, and then leave the cell again, all while evading the host immune system. Thus, successful infection depends on the pathogen's ability to manipulate the biological pathways and processes of the organism it infects. Interactions between HIV-encoded and human proteins provide one means by which HIV-1 can connect into cellular pathways to carry out these survival processes. Results We developed and applied a computational approach to predict interactions between HIV and human proteins based on structural similarity of 9 HIV-1 proteins to human proteins having known interactions. Using functional data from RNAi studies as a filter, we generated over 2000 interaction predictions between HIV proteins and 406 unique human proteins. Additional filtering based on Gene Ontology cellular component annotation reduced the number of predictions to 502 interactions involving 137 human proteins. We find numerous known interactions as well as novel interactions showing significant functional relevance based on supporting Gene Ontology and literature evidence. Conclusions Understanding the interplay between HIV-1 and its human host will help in understanding the viral lifecycle and the ways in which this virus is able to manipulate its host. The results shown here provide a potential set of interactions that are amenable to further experimental manipulation as well as potential targets for therapeutic intervention.

  18. InterProSurf: a web server for predicting interacting sites on protein surfaces

    Science.gov (United States)

    Negi, Surendra S.; Schein, Catherine H.; Oezguen, Numan; Power, Trevor D.; Braun, Werner

    2009-01-01

    Summary A new web server, InterProSurf, predicts interacting amino acid residues in proteins that are most likely to interact with other proteins, given the 3D structures of subunits of a protein complex. The prediction method is based on solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm to identify surface regions with residues of high interface propensities. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with those regions previously identified as interface regions by sequence analysis and mutagenesis experiments. PMID:17933856

  19. Computational methods using weighed-extreme learning machine to predict protein self-interactions with protein evolutionary information.

    Science.gov (United States)

    An, Ji-Yong; Zhang, Lei; Zhou, Yong; Zhao, Yu-Jun; Wang, Da-Fu

    2017-08-18

    Self-interacting proteins (SIPs) are important for biological activity owing to the inherent interaction amongst their secondary structures or domains. However, due to the limitations of experimental detection of self-interactions, one major challenge in SIP prediction is how to exploit computational approaches for SIP detection based on the evolutionary information contained in protein sequences. In this work, we present a novel computational approach named WELM-LAG, which combines the Weighed-Extreme Learning Machine (WELM) classifier with Local Average Group (LAG) to predict SIPs based on protein sequence. The major improvement of our method lies in presenting an effective feature extraction method used to represent candidate self-interacting proteins by exploring the evolutionary information embedded in the PSI-BLAST-constructed position specific scoring matrix (PSSM), and then employing a reliable and robust WELM classifier to carry out classification. In addition, the Principal Component Analysis (PCA) approach is used to reduce the impact of noise. The WELM-LAG method gave very high average accuracies of 92.94% and 96.74% on yeast and human datasets, respectively. Meanwhile, we compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on human and yeast datasets, respectively. Comparative results indicated that our approach is very promising and may provide a cost-effective alternative for predicting SIPs. In addition, we developed a freely available web server called WELM-LAG-SIPs to predict SIPs. The web server is available at http://219.219.62.123:8888/WELMLAG/ .

  20. Prediction of host - pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs.

    Science.gov (United States)

    Huo, Tong; Liu, Wei; Guo, Yu; Yang, Cheng; Lin, Jianping; Rao, Zihe

    2015-03-26

    Emergence of multiple drug resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease. We report a systematic flow to predict the host pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by 'interolog' method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of protein and publicly available experimental results were applied to filter the remaining HPIs. Using such a strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins. This interaction network will facilitate the research on host-pathogen protein-protein interactions, and may throw light on how M. tuberculosis interacts with its host.

  1. Inferring domain-domain interactions from protein-protein interactions with formal concept analysis.

    Directory of Open Access Journals (Sweden)

    Susan Khor

    Full Text Available Identifying reliable domain-domain interactions will increase our ability to predict novel protein-protein interactions, to unravel interactions in protein complexes, and thus gain more information about the function and behavior of genes. One of the challenges of identifying reliable domain-domain interactions is domain promiscuity. Promiscuous domains are domains that can occur in many domain architectures and are therefore found in many proteins. This becomes a problem for a method where the score of a domain-pair is the ratio between observed and expected frequencies because the protein-protein interaction network is sparse. As such, many protein-pairs will be non-interacting and domain-pairs with promiscuous domains will be penalized. This domain promiscuity challenge to the problem of inferring reliable domain-domain interactions from protein-protein interactions has been recognized, and a number of work-arounds have been proposed. This paper reports on an application of Formal Concept Analysis to this problem. It is found that the relationship between formal concepts provides a natural way for rare domains to elevate the rank of promiscuous domain-pairs and enrich highly ranked domain-pairs with reliable domain-domain interactions. This piggybacking of promiscuous domain-pairs onto less promiscuous domain-pairs is possible only with concept lattices whose attribute-labels are not reduced and is enhanced by the presence of proteins that comprise both promiscuous and rare domains.

  2. Inferring Domain-Domain Interactions from Protein-Protein Interactions with Formal Concept Analysis

    Science.gov (United States)

    Khor, Susan

    2014-01-01

    Identifying reliable domain-domain interactions will increase our ability to predict novel protein-protein interactions, to unravel interactions in protein complexes, and thus gain more information about the function and behavior of genes. One of the challenges of identifying reliable domain-domain interactions is domain promiscuity. Promiscuous domains are domains that can occur in many domain architectures and are therefore found in many proteins. This becomes a problem for a method where the score of a domain-pair is the ratio between observed and expected frequencies because the protein-protein interaction network is sparse. As such, many protein-pairs will be non-interacting and domain-pairs with promiscuous domains will be penalized. This domain promiscuity challenge to the problem of inferring reliable domain-domain interactions from protein-protein interactions has been recognized, and a number of work-arounds have been proposed. This paper reports on an application of Formal Concept Analysis to this problem. It is found that the relationship between formal concepts provides a natural way for rare domains to elevate the rank of promiscuous domain-pairs and enrich highly ranked domain-pairs with reliable domain-domain interactions. This piggybacking of promiscuous domain-pairs onto less promiscuous domain-pairs is possible only with concept lattices whose attribute-labels are not reduced and is enhanced by the presence of proteins that comprise both promiscuous and rare domains. PMID:24586450
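
    To make the promiscuity penalty described above concrete, here is a minimal sketch of one simple frequency-ratio score. It is an illustrative assumption, not the scoring used in the paper and not Formal Concept Analysis: each domain pair is scored as the number of interacting protein pairs that carry it divided by the number of all protein pairs that carry it. The function name `domain_pair_scores` and the toy proteins and domains are hypothetical.

```python
from itertools import combinations, product


def domain_pair_scores(domains, interactions):
    """Score each domain pair as
    (# interacting protein pairs carrying it) / (# all protein pairs carrying it).

    domains:      dict mapping protein name -> set of domain names
    interactions: iterable of (protein, protein) tuples known to interact
    """
    interacting = {frozenset(pair) for pair in interactions}
    observed, possible = {}, {}
    for a, b in combinations(sorted(domains), 2):
        # Every domain pair that could mediate an a-b interaction.
        carried = {frozenset((x, y)) for x, y in product(domains[a], domains[b])}
        for dp in carried:
            possible[dp] = possible.get(dp, 0) + 1
            if frozenset((a, b)) in interacting:
                observed[dp] = observed.get(dp, 0) + 1
    return {dp: observed.get(dp, 0) / n for dp, n in possible.items()}


# Hypothetical toy data: domain "A" is promiscuous, "R" and "S" are rare.
toy_domains = {"P1": {"A", "R"}, "P2": {"A", "S"}, "P3": {"A"}, "P4": {"A"}}
toy_interactions = [("P1", "P2")]
scores = domain_pair_scores(toy_domains, toy_interactions)
for dp, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(sorted(dp), round(score, 3))
```

    In this toy network the single interaction P1-P2 gives the rare pair R-S a score of 1, while the promiscuous pair A-A scores only 1/6 because it also appears in five non-interacting protein pairs. This is the penalty on promiscuous domains that the abstract refers to.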

  3. Computational Approaches for Prediction of Pathogen-Host Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Esmaeil Nourani

    2015-02-01

    Full Text Available Infectious diseases are still among the major and prevalent health problems, mostly because of the drug resistance of novel variants of pathogens. Molecular interactions between pathogens and their hosts are the key part of the infection mechanisms. Novel antimicrobial therapeutics to fight drug resistance are only possible with a thorough understanding of pathogen-host interaction (PHI) systems. Existing databases, which contain experimentally verified PHI data, suffer from scarcity of reported interactions due to the technically challenging and time-consuming process of experiments. This has motivated many researchers to address the problem by proposing computational approaches for analysis and prediction of PHIs. The computational methods primarily utilize sequence information, protein structure and known interactions. Classic machine learning techniques are used when there are sufficient known interactions to be used as training data. In the opposite case, transfer and multi-task learning methods are preferred. Here, we present an overview of these computational approaches for PHI prediction, discussing their weaknesses and abilities, with future directions.

  4. Protein-Protein Interaction Databases

    DEFF Research Database (Denmark)

    Szklarczyk, Damian; Jensen, Lars Juhl

    2015-01-01

    Years of meticulous curation of scientific literature and increasingly reliable computational predictions have resulted in creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions...

  5. DiffSLC: A graph centrality method to detect essential proteins of a protein-protein interaction network.

    Science.gov (United States)

    Mistry, Divya; Wise, Roger P; Dickerson, Julie A

    2017-01-01

    Identification of central genes and proteins in biomolecular networks provides credible candidates for pathway analysis, functional analysis, and essentiality prediction. The DiffSLC centrality measure predicts central and essential genes and proteins using a protein-protein interaction network. Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures helped identify critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLC, combines the number of interactions of a protein and the gene coexpression values of genes from which those proteins were translated, as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene co-expression data, (2) impact of different coexpression measures, and (3) impact of different gene expression data sets. For a total of seven networks, DiffSLC is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. Comparisons are also performed for the top ranked proteins against the known essential genes from the Saccharomyces Gene Deletion Project, which show that DiffSLC detects more essential proteins and has a higher area under the ROC curve than other compared methods. This makes DiffSLC a stronger alternative to other centrality methods for detecting essential genes using a protein-protein interaction network that obeys centrality-lethality principle. DiffSLC is implemented using the igraph package in R, and networkx package in Python. The python package can be
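
    Two of the ingredients named in this abstract, node degree and eigenvector centrality, can be sketched in plain Python as follows. This is not DiffSLC itself: the coexpression weighting and the edge clustering coefficient are left out, and the function name and toy network are hypothetical. The eigenvector is approximated by power iteration on A + I (adjacency matrix plus identity), which has the same leading eigenvector as A and avoids oscillation on bipartite graphs. The sketch assumes a connected, undirected network.

```python
import math


def degree_and_eigenvector_centrality(edges, iterations=200):
    """Return (degree, eigenvector centrality) dicts for an undirected graph.

    edges: iterable of (node, node) tuples; the graph is assumed connected.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}

    # Power iteration on A + I, renormalised to unit length at each step.
    x = {node: 1.0 for node in adj}
    for _ in range(iterations):
        nxt = {node: x[node] + sum(x[nbr] for nbr in adj[node]) for node in adj}
        norm = math.sqrt(sum(value * value for value in nxt.values()))
        x = {node: value / norm for node, value in nxt.items()}
    return degree, x


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
degree, eig = degree_and_eigenvector_centrality(toy_edges)
print(degree)
print({node: round(value, 3) for node, value in eig.items()})
```

    In the toy network, node C has both the highest degree and the highest eigenvector score, while the pendant node D scores lowest on both.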

  6. Combining modularity, conservation, and interactions of proteins significantly increases precision and coverage of protein function prediction

    Directory of Open Access Journals (Sweden)

    Sers Christine T

    2010-12-01

    Full Text Available Abstract Background While the number of newly sequenced genomes and genes is constantly increasing, elucidation of their function still is a laborious and time-consuming task. This has led to the development of a wide range of methods for predicting protein functions in silico. We report on a new method that predicts function based on a combination of information about protein interactions, orthology, and the conservation of protein networks in different species. Results We show that aggregation of these independent sources of evidence leads to a drastic increase in number and quality of predictions when compared to baselines and other methods reported in the literature. For instance, our method generates more than 12,000 novel protein functions for human with an estimated precision of ~76%, among which are 7,500 new functional annotations for 1,973 human proteins that previously had zero or only one function annotated. We also verified our predictions on a set of genes that play an important role in colorectal cancer (MLH1, PMS2, EPHB4) and could confirm more than 73% of them based on evidence in the literature. Conclusions The combination of different methods into a single, comprehensive prediction method infers thousands of protein functions for every species included in the analysis at varying, yet always high levels of precision and very good coverage.

  7. Oligomeric protein structure networks: insights into protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Brinda KV

    2005-12-01

    Full Text Available Abstract Background Protein-protein association is essential for a variety of cellular processes and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes, with edges based on the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues), with special emphasis on protein interfaces. Results A variety of interactions such as hydrogen bonds, salt bridges, aromatic and hydrophobic interactions, which occur at the interfaces, are identified in a consolidated manner as amino acid clusters at the interface. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated based on the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hotspots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb). A few predictions of interface hot

  8. Computational prediction of protein hot spot residues.

    Science.gov (United States)

    Morrow, John Kenneth; Zhang, Shuxing

    2012-01-01

    Most biological processes involve multiple proteins interacting with each other. It has been recently discovered that certain residues in these protein-protein interactions, which are called hot spots, contribute more significantly to binding affinity than others. Hot spot residues have unique and diverse energetic properties that make them challenging yet important targets in the modulation of protein-protein complexes. Design of therapeutic agents that interact with hot spot residues has proven to be a valid methodology in disrupting unwanted protein-protein interactions. Using biological methods to determine which residues are hot spots can be costly and time consuming. Recent advances in computational approaches to predict hot spots have incorporated a myriad of features, and have shown increasing predictive successes. Here we review the state of knowledge around protein-protein interactions, hot spots, and give an overview of multiple in silico prediction techniques of hot spot residues.

  9. Prediction of interactions between viral and host proteins using supervised machine learning methods.

    Directory of Open Access Journals (Sweden)

    Ranjan Kumar Barman

    Full Text Available BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implications for therapeutics. METHODS: In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information using viral-host PPIs from VirusMINT. Three well-known supervised machine learning methods, SVM, Naïve Bayes and Random Forest, which are commonly used in the prediction of PPIs, were employed to evaluate the performance measure based on five-fold cross validation techniques. RESULTS: Out of 44 descriptors, the best features were found to be domain-domain association and methionine, serine and valine amino acid composition of viral proteins. In this study, the SVM-based method achieved better sensitivity of 67% over Naïve Bayes (37.49%) and Random Forest (55.66%). However, the specificity of Naïve Bayes was the highest (99.52%) as compared with SVM (74%) and Random Forest (89.08%). Overall, the SVM and Random Forest achieved accuracy of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on a blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through the proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that hepatitis B virus "C protein" binds to membrane docking protein, while "X protein" and "P protein" interact with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV, interacting partners of host
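
    The sensitivity, specificity and accuracy figures quoted in this record follow the standard confusion-matrix definitions, which can be sketched as below. The function name and the counts in the usage example are hypothetical and are not taken from the paper; they only show how the three measures relate to each other.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) as fractions in [0, 1].

    tp, fn: interacting pairs predicted as interacting / non-interacting
    tn, fp: non-interacting pairs predicted as non-interacting / interacting
    """
    sensitivity = tp / (tp + fn)  # fraction of true interactions recovered
    specificity = tn / (tn + fp)  # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for a balanced test set of 100 positives and 100 negatives.
sens, spec, acc = confusion_metrics(tp=64, fp=17, tn=83, fn=36)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

    On a balanced test set, accuracy is the mean of sensitivity and specificity, so a classifier can reach very high specificity while still missing most true interactions, as the Naïve Bayes figures above illustrate.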

  10. Predicting Protein-Protein Interactions Using BiGGER: Case Studies

    Directory of Open Access Journals (Sweden)

    Rui M. Almeida

    2016-08-01

    Full Text Available The importance of understanding interactomes makes preeminent the study of protein interactions and protein complexes. Traditionally, protein interactions have been elucidated by experimental methods or, with lower impact, by simulation with protein docking algorithms. This article describes features and applications of the BiGGER docking algorithm, which stands at the interface of these two approaches. BiGGER is a user-friendly docking algorithm that was specifically designed to incorporate experimental data at different stages of the simulation, to either guide the search for correct structures or help evaluate the results, in order to combine the reliability of hard data with the convenience of simulations. Herein, the applications of BiGGER are described by illustrative examples divided into three Case Studies: (Case Study A) in which no specific contact data is available; (Case Study B) when different experimental data (e.g., site-directed mutagenesis, properties of the complex, NMR chemical shift perturbation mapping, electron tunneling) on one of the partners is available; and (Case Study C) when experimental data are available for both interacting surfaces, which are used during the search and/or evaluation stage of the docking. This algorithm has been extensively used, evidencing its usefulness in a wide range of different biological research fields.

  11. Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors

    Science.gov (United States)

    Hsing, Michael; Byler, Kendall; Cherkasov, Artem

    2009-01-01

    Hub proteins (those engaged in most physical interactions in a protein interaction network (PIN)) have recently gained much research interest due to their essential role in mediating cellular processes and their potential therapeutic value. It is straightforward to identify hubs if the underlying PIN is experimentally determined; however, theoretical hub prediction remains a very challenging task, as physicochemical properties that differentiate hubs from less connected proteins remain mostly uncharacterized. To adequately distinguish hubs from non-hub proteins we have utilized over 1300 protein descriptors, some of which represent QSAR (quantitative structure-activity relationship) parameters, and some reflect sequence-derived characteristics of proteins including domain composition and functional annotations. Those protein descriptors, together with available protein interaction data have been processed by a machine learning method (boosting trees) and resulted in the development of hub classifiers that are capable of predicting highly interacting proteins for four model organisms: Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. More importantly, through the analyses of the most relevant protein descriptors, we are able to demonstrate that hub proteins not only share certain common physicochemical and structural characteristics that make them different from non-hub counterparts, but they also exhibit species-specific characteristics that should be taken into account when analyzing different PINs. The developed prediction models can be used for determining highly interacting proteins in the four studied species to assist future proteomics experiments and PIN analyses. Availability The source code and executable program of the hub classifier are available for download at: http://www.cnbi2.ca/hub-analysis/ PMID:20198194

  12. Molecular tweezers modulate 14-3-3 protein-protein interactions

    Science.gov (United States)

    Bier, David; Rose, Rolf; Bravo-Rodriguez, Kenny; Bartel, Maria; Ramirez-Anguita, Juan Manuel; Dutt, Som; Wilch, Constanze; Klärner, Frank-Gerrit; Sanchez-Garcia, Elsa; Schrader, Thomas; Ottmann, Christian

    2013-03-01

    Supramolecular chemistry has recently emerged as a promising way to modulate protein functions, but devising molecules that will interact with a protein in the desired manner is difficult as many competing interactions exist in a biological environment (with solvents, salts or different sites for the target biomolecule). We now show that lysine-specific molecular tweezers bind to a 14-3-3 adapter protein and modulate its interaction with partner proteins. The tweezers inhibit binding between the 14-3-3 protein and two partner proteins—a phosphorylated (C-Raf) protein and an unphosphorylated one (ExoS)—in a concentration-dependent manner. Protein crystallography shows that this effect arises from the binding of the tweezers to a single surface-exposed lysine (Lys214) of the 14-3-3 protein in the proximity of its central channel, which normally binds the partner proteins. A combination of structural analysis and computer simulations provides rules for the tweezers' binding preferences, thus allowing us to predict their influence on this type of protein-protein interactions.

  13. Prediction of Cancer Proteins by Integrating Protein Interaction, Domain Frequency, and Domain Interaction Data Using Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Chien-Hung Huang

    2015-01-01

    Full Text Available Many proteins are known to be associated with cancer diseases. Quite often, their precise functional role in disease pathogenesis remains unclear. A strategy to gain a better understanding of the function of these proteins is to make use of a combination of different aspects of proteomics data types. In this study, we extended Aragues’s method by employing the protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, weighted domain frequency score (DFS), and cancer linker degree (CLD) data to predict cancer proteins. Performances were benchmarked based on three kinds of experiments as follows: (I) using individual algorithm, (II) combining algorithms, and (III) combining the same classification types of algorithms. When compared with Aragues’s method, our proposed methods, that is, machine learning algorithm and voting with the majority, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm can achieve a hit ratio of 89.4% and 72.8% for lung cancer dataset and lung cancer microarray study, respectively. It is anticipated that the current research could help understand disease mechanisms and diagnosis.

  14. Recovering protein-protein and domain-domain interactions from aggregation of IP-MS proteomics of coregulator complexes.

    Directory of Open Access Journals (Sweden)

    Amin R Mazloom

    2011-12-01

    Full Text Available Coregulator proteins (CoRegs) are part of multi-protein complexes that transiently assemble with transcription factors and chromatin modifiers to regulate gene expression. In this study we analyzed data from 3,290 immuno-precipitations (IP) followed by mass spectrometry (MS) applied to human cell lines aimed at identifying CoRegs complexes. Using the semi-quantitative spectral counts, we scored binary protein-protein and domain-domain associations with several equations. Unlike previous applications, our methods scored prey-prey protein-protein interactions regardless of the baits used. We also predicted domain-domain interactions underlying predicted protein-protein interactions. The quality of predicted protein-protein and domain-domain interactions was evaluated using known binary interactions from the literature, whereas one protein-protein interaction, between STRN and CTTNBP2NL, was validated experimentally; and one domain-domain interaction, between the HEAT domain of PPP2R1A and the Pkinase domain of STK25, was validated using molecular docking simulations. The scoring schemes presented here recovered known, and predicted many new, complexes, protein-protein, and domain-domain interactions. The networks that resulted from the predictions are provided as a web-based interactive application at http://maayanlab.net/HT-IP-MS-2-PPI-DDI/.

  15. A Bipartite Network-based Method for Prediction of Long Non-coding RNA–protein Interactions

    Directory of Open Access Journals (Sweden)

    Mengqu Ge

    2016-02-01

    Full Text Available As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only a few are able to perform the prediction of these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA–interacting proteins, by making full use of the known lncRNA–protein interactions. Leave-one-out cross validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA–interacting proteins.

  16. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    Directory of Open Access Journals (Sweden)

    Dobbs Drena

    2011-06-01

    Full Text Available Abstract Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of

  17. Hot-spot analysis for drug discovery targeting protein-protein interactions.

    Science.gov (United States)

    Rosell, Mireia; Fernández-Recio, Juan

    2018-04-01

    Protein-protein interactions are important for biological processes and pathological situations, and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. Areas covered: In this review, the authors cover a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and the structural modeling of protein-protein complexes by docking. This can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest. Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. Expert opinion: A major difficulty for applying rational drug design methods to protein-protein interactions is that in the majority of cases the complex structure is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis.

  18. Protein-protein interactions and cancer: targeting the central dogma.

    Science.gov (United States)

    Garner, Amanda L; Janda, Kim D

    2011-01-01

    Between 40,000 and 200,000 protein-protein interactions have been predicted to exist within the human interactome. As these interactions are of a critical nature in many important cellular functions and their dysregulation is causal of disease, the modulation of these binding events has emerged as a leading, yet difficult therapeutic arena. In particular, the targeting of protein-protein interactions relevant to cancer is of fundamental importance as the tumor-promoting function of several aberrantly expressed proteins in the cancerous state is directly resultant of their ability to interact with a protein-binding partner. Of significance, these protein complexes play a crucial role in each of the steps of the central dogma of molecular biology, the fundamental processes of genetic transmission. With the many important discoveries being made regarding the mechanisms of these genetic processes, the identification of new chemical probes is needed to better understand and validate the druggability of protein-protein interactions related to the central dogma. In this review, we provide an overview of current small molecule-based protein-protein interaction inhibitors for each stage of the central dogma: transcription, mRNA splicing and translation. Importantly, through our analysis we have uncovered a lack of necessary probes targeting mRNA splicing and translation, thus opening up the possibility for expansion of these fields.

  19. ProteinShop: A tool for interactive protein manipulation and steering

    Energy Technology Data Exchange (ETDEWEB)

    Crivelli, Silvia; Kreylos, Oliver; Max, Nelson; Hamann, Bernd; Bethel, Wes

    2004-05-25

    We describe ProteinShop, a new visualization tool that streamlines and simplifies the process of determining optimal protein folds. ProteinShop may be used at different stages of a protein structure prediction process. First, it can create protein configurations containing secondary structures specified by the user. Second, it can interactively manipulate protein fragments to achieve desired folds by adjusting the dihedral angles of selected coil regions using an Inverse Kinematics method. Last, it serves as a visual framework to monitor and steer a protein structure prediction process that may be running on a remote machine. ProteinShop was used to create initial configurations for a protein structure prediction method developed by a team that competed in CASP5. ProteinShop's use accelerated the process of generating initial configurations, reducing the time required from days to hours. This paper describes the structure of ProteinShop and discusses its main features.

  20. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.

    2013-07-12

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about experimentally and computationally detected human PPIs as well as their corresponding annotation data. However, these databases contain many false positive interactions, are partial and only a few of them incorporate data from various sources. To overcome these limitations, we have developed HINT-KB (http://biotools.ceid.upatras.gr/hint-kb/), a knowledge base that integrates data from various sources, provides a user-friendly interface for their retrieval, calculates a set of features of interest and computes a confidence score for every candidate protein interaction. This confidence score is essential for filtering the false positive interactions which are present in existing databases, predicting new protein interactions and measuring the frequency of each true protein interaction. For this reason, a novel machine learning hybrid methodology, called Evolutionary Kalman Mathematical Modelling (EvoKalMaModel), was used to achieve an accurate and interpretable scoring methodology. The experimental results indicated that the proposed scoring scheme outperforms existing computational methods for the prediction of PPIs.

  1. Quantifying the molecular origins of opposite solvent effects on protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Vincent Vagenende

    Full Text Available Although the nature of solvent-protein interactions is generally weak and non-specific, addition of cosolvents such as denaturants and osmolytes strengthens protein-protein interactions for some proteins, whereas it weakens protein-protein interactions for others. This is exemplified by the puzzling observation that addition of glycerol oppositely affects the association constants of two antibodies, D1.3 and D44.1, with lysozyme. To resolve this conundrum, we develop a methodology based on the thermodynamic principles of preferential interaction theory and the quantitative characterization of local protein solvation from molecular dynamics simulations. We find that changes of preferential solvent interactions at the protein-protein interface quantitatively account for the opposite effects of glycerol on the antibody-antigen association constants. Detailed characterization of local protein solvation in the free and associated protein states reveals how opposite solvent effects on protein-protein interactions depend on the extent of dewetting of the protein-protein contact region and on structural changes that alter cooperative solvent-protein interactions at the periphery of the protein-protein interface. These results demonstrate the direct relationship between macroscopic solvent effects on protein-protein interactions and atom-scale solvent-protein interactions, and establish a general methodology for predicting and understanding solvent effects on protein-protein interactions in diverse biological environments.

  2. Finding low-conductance sets with dense interactions (FLCD) for better protein complex prediction.

    Science.gov (United States)

    Wang, Yijie; Qian, Xiaoning

    2017-03-14

    Intuitively, proteins in the same protein complexes should highly interact with each other but rarely interact with the other proteins in protein-protein interaction (PPI) networks. Surprisingly, many existing computational algorithms do not directly detect protein complexes based on both of these topological properties. Most of them, depending on mathematical definitions of either "modularity" or "conductance", have their own limitations: modularity has an inherent resolution problem that ignores small protein complexes, and conductance characterizes the separability of complexes but fails to capture the interaction density within complexes. In this paper, we propose a two-step algorithm FLCD (Finding Low-Conductance sets with Dense interactions) to predict overlapping protein complexes with the desired topological structure, which is densely connected inside and well separated from the rest of the networks. First, FLCD detects well-separated subnetworks based on approximating a potential low-conductance set through a personalized PageRank vector from a protein and then solving a mixed integer programming (MIP) problem to find the minimum-conductance set within the identified low-conductance set. In the second step, the densely connected parts in those subnetworks are discovered as the protein complexes by solving another MIP problem that aims to find the dense subnetwork in the minimum-conductance set. Experiments on four large-scale yeast PPI networks from different public databases demonstrate that the complexes predicted by FLCD have better correspondence with the yeast protein complex gold standards than three other state-of-the-art algorithms (ClusterONE, LinkComm, and SR-MCL). Additionally, results of FLCD show higher biological relevance with respect to Gene Ontology (GO) terms by GO enrichment analysis.
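
    The two topological properties named in this abstract can be illustrated with a short sketch. It uses the standard graph-theoretic definitions of conductance (cut edges divided by the smaller of the two volumes) and edge density; the abstract does not give FLCD's exact formulation, so the function names and the toy network below are assumptions for illustration only, and the PageRank and MIP steps are not reproduced.

```python
def conductance(adj, nodes):
    """Standard conductance of a node set S: edges leaving S divided by
    min(vol(S), vol(rest)), where vol is the sum of node degrees."""
    s = set(nodes)
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(adj[u]) for u in adj if u not in s)
    denom = min(vol_s, vol_rest)
    return cut / denom if denom else 0.0


def density(adj, nodes):
    """Fraction of possible internal edges of S that are present."""
    s = set(nodes)
    n = len(s)
    if n < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in s for v in adj[u] if v in s) / 2
    return internal / (n * (n - 1) / 2)


# Hypothetical toy PPI network: a triangle A-B-C attached to a tail C-D-E.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
candidate = ["A", "B", "C"]
print(conductance(toy, candidate), density(toy, candidate))
```

    A candidate complex that is well separated and densely connected shows low conductance together with high density; either measure alone can miss one of the two properties.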

  4. Comparative genomics and disorder prediction identify biologically relevant SH3 protein interactions.

    Directory of Open Access Journals (Sweden)

    Pedro Beltrao

    2005-08-01

    Full Text Available Protein interaction networks are an important part of the post-genomic effort to integrate a part-list view of the cell into system-level understanding. Using a set of 11 yeast genomes we show that combining comparative genomics and secondary structure information greatly increases consensus-based prediction of SH3 targets. Benchmarking of our method against positive and negative standards gave 83% accuracy with 26% coverage. The concept of an optimal divergence time for effective comparative genomics studies was analyzed, demonstrating that genomes of species that diverged very recently from Saccharomyces cerevisiae (S. mikatae, S. bayanus, and S. paradoxus), or a long time ago (Neurospora crassa and Schizosaccharomyces pombe), contain less information for accurate prediction of SH3 targets than species within the optimal divergence time proposed. We also show here that intrinsically disordered SH3 domain targets are more probable sites of interaction than equivalent sites within ordered regions. Our findings highlight several novel S. cerevisiae SH3 protein interactions, the value of selection of optimal divergence times in comparative genomics studies, and the importance of intrinsic disorder for protein interactions. Based on our results we propose novel roles for the S. cerevisiae proteins Abp1p in endocytosis and Hse1p in endosome protein sorting.

  5. Large-scale prediction of drug–target interactions using protein sequences and drug topological structures

    International Nuclear Information System (INIS)

    Cao Dongsheng; Liu Shao; Xu Qingsong; Lu Hongmei; Huang Jianhua; Hu Qiannan; Liang Yizeng

    2012-01-01

    Highlights: ► Drug–target interactions are predicted using an extended SAR methodology. ► A drug–target interaction is regarded as an event triggered by many factors. ► Molecular fingerprint and CTD descriptors are used to represent drugs and proteins. ► Our approach shows compatibility between the new scheme and current SAR methodology. - Abstract: The identification of interactions between drugs and target proteins plays a key role in the process of genomic drug discovery. It is both time-consuming and costly to determine drug–target interactions by experiments alone. Therefore, there is an urgent need to develop new in silico prediction approaches capable of identifying these potential drug–target interactions in a timely manner. In this article, we aim at extending current structure–activity relationship (SAR) methodology to fulfill such requirements. In some sense, a drug–target interaction can be regarded as an event or property triggered by many influence factors from drugs and target proteins. Thus, each interaction pair can be represented theoretically by using these factors, which are based on the structural and physicochemical properties of both drugs and proteins. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the existence of certain functional groups or fragments, and proteins are encoded with some biochemical and physicochemical properties. Four classes of drug–target interaction networks in humans involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models with support vector machines (SVMs). The SVM models gave prediction accuracy of 90.31%, 88.91%, 84.68% and 83.74% for four datasets, respectively. In conclusion, the results demonstrate the ability of our proposed method to predict the drug–target interactions, and show a general compatibility between the new scheme and current SAR

  6. Large-scale prediction of drug-target interactions using protein sequences and drug topological structures

    Energy Technology Data Exchange (ETDEWEB)

    Cao Dongsheng [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China)]; Liu Shao [Xiangya Hospital, Central South University, Changsha 410008 (China)]; Xu Qingsong [School of Mathematical Sciences and Computing Technology, Central South University, Changsha 410083 (China)]; Lu Hongmei; Huang Jianhua [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China)]; Hu Qiannan [Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan 430071 (China)]; Liang Yizeng, E-mail: yizeng_liang@263.net [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China)]

    2012-11-08

    Highlights: ► Drug-target interactions are predicted using an extended SAR methodology. ► A drug-target interaction is regarded as an event triggered by many factors. ► Molecular fingerprint and CTD descriptors are used to represent drugs and proteins. ► Our approach shows compatibility between the new scheme and current SAR methodology. - Abstract: The identification of interactions between drugs and target proteins plays a key role in the process of genomic drug discovery. It is both time-consuming and costly to determine drug-target interactions by experiments alone. Therefore, there is an urgent need to develop new in silico prediction approaches capable of identifying these potential drug-target interactions in a timely manner. In this article, we aim at extending current structure-activity relationship (SAR) methodology to fulfill such requirements. In some sense, a drug-target interaction can be regarded as an event or property triggered by many influence factors from drugs and target proteins. Thus, each interaction pair can be represented theoretically by using these factors, which are based on the structural and physicochemical properties of both drugs and proteins. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the existence of certain functional groups or fragments, and proteins are encoded with some biochemical and physicochemical properties. Four classes of drug-target interaction networks in humans involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models with support vector machines (SVMs). The SVM models gave prediction accuracy of 90.31%, 88.91%, 84.68% and 83.74% for four datasets, respectively. In conclusion, the results demonstrate the ability of our proposed method to predict the drug

  7. Human cancer protein-protein interaction network: a structural perspective.

    Directory of Open Access Journals (Sweden)

    Gozde Kar

    2009-12-01

    Full Text Available Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). We illustrate the interface related affinity properties of two cancer-related hub

  8. Protein Annotation from Protein Interaction Networks and Gene Ontology

    OpenAIRE

    Nguyen, Cao D.; Gardiner, Katheleen J.; Cios, Krzysztof J.

    2011-01-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precis...

  9. Scoring functions for protein-protein interactions.

    Science.gov (United States)

    Moal, Iain H; Moretti, Rocco; Baker, David; Fernández-Recio, Juan

    2013-12-01

    The computational evaluation of protein-protein interactions will play an important role in organising the wealth of data being generated by high-throughput initiatives. Here we discuss future applications, report recent developments and identify areas requiring further investigation. Many functions have been developed to quantify the structural and energetic properties of interacting proteins, finding use in interrelated challenges revolving around the relationship between sequence, structure and binding free energy. These include loop modelling, side-chain refinement, docking, multimer assembly, affinity prediction, affinity change upon mutation, hotspots location and interface design. Information derived from models optimised for one of these challenges can be used to benefit the others, and can be unified within the theoretical frameworks of multi-task learning and Pareto-optimal multi-objective learning. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. Construction of ontology augmented networks for protein complex prediction.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

    2013-01-01

    Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.

  11. Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure.

    Science.gov (United States)

    Li, Tao; Li, Qian-Zhong

    2012-11-07

    RNA-protein interactions play important roles in various biological processes. The precise detection of RNA-protein interaction sites is very important for understanding essential biological processes and annotating the function of the proteins. In this study, based on various features from amino acid sequence and structure, including evolutionary information, solvent accessible surface area and torsion angles (φ, ψ) in the backbone structure of the polypeptide chain, a computational method for predicting RNA-binding sites in proteins is proposed. When the method is applied to predict RNA-binding sites in three datasets (RBP86 containing 86 protein chains, RBP107 containing 107 protein chains and RBP109 containing 109 protein chains), better sensitivities and specificities are obtained compared to previously published methods in five-fold cross-validation tests. To further examine the efficiency of our method, the RBP107 dataset is used as the training set, and the RBP86 and RBP109 datasets are used as independent test sets. In addition, as examples of our prediction, RNA-binding sites in a few proteins are presented. The annotated results are consistent with the PDB annotation. These results show that our method is useful for annotating RNA binding sites of novel proteins.
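
    The sensitivity and specificity figures reported for residue-level predictors such as this one follow the usual confusion-matrix definitions. A minimal sketch, assuming binary per-residue labels (1 = RNA-binding, 0 = non-binding); the function name and the example labels are hypothetical and are not taken from the RBP86, RBP107 or RBP109 datasets.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): fraction of true binding residues recovered.
    specificity = TN / (TN + FP): fraction of non-binding residues rejected.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Hypothetical labels for eight residues of one chain.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
pred_labels = [1, 1, 0, 0, 0, 0, 1, 0]
print(sensitivity_specificity(true_labels, pred_labels))
```

    Because binding residues are usually a small minority, both numbers are needed: a predictor that labels every residue as non-binding would score perfect specificity and zero sensitivity.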

  12. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Full Text Available Abstract Background Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of network. Therefore, a new robust essential protein discovery method would be of great value. Results In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
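
    The abstract does not state the PeC formula, so the following is only a sketch of the general idea of combining network topology with gene expression: each interaction is weighted by the Pearson correlation of the two proteins' expression profiles, and a protein's score is the sum of these weights over its neighbors. The function names, the scoring rule and the toy data are assumptions and should not be read as the published PeC definition.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def expression_weighted_degree(adj, expr):
    """Score each protein by summing co-expression over its interaction partners.

    With every weight set to 1 this reduces to plain Degree Centrality (DC).
    """
    return {
        u: sum(pearson(expr[u], expr[v]) for v in neighbors)
        for u, neighbors in adj.items()
    }


# Hypothetical network and expression profiles (four time points each).
ppi = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": {"P1"}}
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],   # rises with P1
    "P3": [4.0, 3.0, 2.0, 1.0],   # falls as P1 rises
}
scores = expression_weighted_degree(ppi, profiles)
print(scores)
```

    In this toy case P1 has the highest degree, yet its two interactions cancel out once co-expression is taken into account, which shows how expression data can change a purely topological ranking.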

  13. Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information

    Directory of Open Access Journals (Sweden)

    Panwar Bharat

    2013-02-01

    Full Text Available Abstract Background The vitamins are important cofactors in various enzymatic reactions. In the past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein if its tertiary structure is known. Unfortunately, tertiary structures are available for only a limited number of proteins. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in a protein from its primary structure. Results In this study, first we compared protein-interacting residues of vitamins with other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues along with pattern specificity was also observed within different vitamin classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. This suggested that protein-binding patterns of vitamins are different from other ligands, and motivated us to develop separate predictors for vitamins and their sub-classes. Four different prediction modules, (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs), have been developed. We applied various classifiers, including SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk, as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. Finally, we selected best performing SVM modules and

  14. Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information.

    Science.gov (United States)

    Panwar, Bharat; Gupta, Sudheer; Raghava, Gajendra P S

    2013-02-07

    The vitamins are important cofactors in various enzymatic reactions. In the past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein if its tertiary structure is known. Unfortunately, tertiary structures are available for only a limited number of proteins. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in a protein from its primary structure. In this study, first we compared protein-interacting residues of vitamins with other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues along with pattern specificity was also observed within different vitamin classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. This suggested that protein-binding patterns of vitamins are different from other ligands, and motivated us to develop separate predictors for vitamins and their sub-classes. Four different prediction modules, (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs), have been developed. We applied various classifiers, including SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk, as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. Finally, we selected best performing SVM modules and obtained highest MCC of 0.53, 0.48, 0.61, 0

  15. PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein-Protein Interactions from Protein Sequences.

    Science.gov (United States)

    Wang, Yanbin; You, Zhuhong; Li, Xiao; Chen, Xing; Jiang, Tonghai; Zhang, Jingting

    2017-05-11

    Protein-protein interactions (PPIs) are essential for most processes in living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although a large amount of PPI data has been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable defects, including time consumption, high cost, and high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs, and can achieve good prediction rates. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, a Zernike moments (ZM) descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, a PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of Yeast and H. pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machines (SVM) classifier is used and compared with the PCVM model. Experimental results on the Yeast dataset show that the performance of the PCVM classifier is better than that of the SVM classifier. The experimental results indicate that our proposed method is robust, powerful and feasible, and can be used as a helpful tool for proteomics research.

  16. Prediction of Protein-Protein Interactions by NanoLuc-Based Protein-Fragment Complementation Assay | Office of Cancer Genomics

    Science.gov (United States)

    The CTD2 Center at Emory has developed a new NanoLuc®-based protein-fragment complementation assay (NanoPCA) which allows the detection of novel protein-protein interactions (PPI). NanoPCA allows the study of PPI dynamics with reversible interactions.

  17. A Library of Plasmodium vivax Recombinant Merozoite Proteins Reveals New Vaccine Candidates and Protein-Protein Interactions

    Science.gov (United States)

    Hostetler, Jessica B.; Sharma, Sumana; Bartholdson, S. Josefin; Wright, Gavin J.; Fairhurst, Rick M.; Rayner, Julian C.

    2015-01-01

    Background A vaccine targeting Plasmodium vivax will be an essential component of any comprehensive malaria elimination program, but major gaps in our understanding of P. vivax biology, including the protein-protein interactions that mediate merozoite invasion of reticulocytes, hinder the search for candidate antigens. Only one ligand-receptor interaction has been identified, that between P. vivax Duffy Binding Protein (PvDBP) and the erythrocyte Duffy Antigen Receptor for Chemokines (DARC), and strain-specific immune responses to PvDBP make it a complex vaccine target. To broaden the repertoire of potential P. vivax merozoite-stage vaccine targets, we exploited a recent breakthrough in expressing full-length ectodomains of Plasmodium proteins in a functionally-active form in mammalian cells and initiated a large-scale study of P. vivax merozoite proteins that are potentially involved in reticulocyte binding and invasion. Methodology/Principal Findings We selected 39 P. vivax proteins that are predicted to localize to the merozoite surface or invasive secretory organelles, some of which show homology to P. falciparum vaccine candidates. Of these, we were able to express 37 full-length protein ectodomains in a mammalian expression system, which has been previously used to express P. falciparum invasion ligands such as PfRH5. To establish whether the expressed proteins were correctly folded, we assessed whether they were recognized by antibodies from Cambodian patients with acute vivax malaria. IgG from these samples showed at least a two-fold change in reactivity over naïve controls in 27 of 34 antigens tested, and the majority showed heat-labile IgG immunoreactivity, suggesting the presence of conformation-sensitive epitopes and native tertiary protein structures. Using a method specifically designed to detect low-affinity, extracellular protein-protein interactions, we confirmed a predicted interaction between P. vivax 6-cysteine proteins P12 and P41, further

  18. PSAIA – Protein Structure and Interaction Analyzer

    Directory of Open Access Journals (Sweden)

    Vlahoviček Kristian

    2008-04-01

    Full Text Available Abstract Background PSAIA (Protein Structure and Interaction Analyzer) was developed to compute geometric parameters for large sets of protein structures in order to predict and investigate protein-protein interaction sites. Results In addition to the most relevant established algorithms, PSAIA offers a new method, PIADA (Protein Interaction Atom Distance Algorithm), for the determination of residue interaction pairs. We found that PIADA produced more satisfactory results than comparable algorithms implemented in PSAIA. Particular advantages of PSAIA include its capacity to combine different methods to detect the locations and types of interactions between residues and its ability, without any further automation steps, to handle large numbers of protein structures and complexes. Generally, the integration of a variety of methods enables PSAIA to offer easier automation of analysis and greater reliability of results. PSAIA can be used either via a graphical user interface or from the command line. Results are generated in either tabular or XML format. Conclusion In a straightforward fashion and for large sets of protein structures, PSAIA enables the calculation of protein geometric parameters and the determination of location and type for protein-protein interaction sites. XML-formatted output enables easy conversion of results to various formats suitable for statistical analysis. Results from smaller data sets demonstrated the influence of geometry on protein interaction sites. Comprehensive analysis of properties of large data sets led to new information useful in the prediction of protein-protein interaction sites.
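
    The idea of determining residue interaction pairs from atom distances can be illustrated with a minimal sketch. This is not the PIADA algorithm itself, whose details the abstract does not give; it is a plain distance-cutoff contact test. The function name, the data layout and the 4.0 angstrom cutoff are all assumptions made for illustration.

```python
from itertools import product
from math import dist

# Hypothetical layout: a chain is a list of (residue_id, [atom (x, y, z) tuples]).
def interface_residue_pairs(chain_a, chain_b, cutoff=4.0):
    """Return (residue_a, residue_b) id pairs that have at least one pair of
    atoms within `cutoff` angstroms of each other (assumed cutoff, not PSAIA's)."""
    pairs = []
    for (id_a, atoms_a), (id_b, atoms_b) in product(chain_a, chain_b):
        if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            pairs.append((id_a, id_b))
    return pairs

# Toy coordinates, invented for the example.
chain_a = [("A:10", [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]),
           ("A:11", [(20.0, 0.0, 0.0)])]
chain_b = [("B:5", [(4.0, 0.0, 0.0)]),
           ("B:6", [(30.0, 0.0, 0.0)])]

print(interface_residue_pairs(chain_a, chain_b))
```

    A brute-force double loop like this is quadratic in the number of atoms; for large sets of structures a spatial index would normally replace it.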

  19. Mapping monomeric threading to protein-protein structure prediction.

    Science.gov (United States)

    Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

    2013-03-25

    The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
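
    For readers unfamiliar with the TM score threshold quoted above, the following sketch computes the standard TM-score for one fixed superposition. It assumes the aligned C-alpha distances are already available and omits the search over superpositions that a full implementation performs; the function name and the error handling for very short targets are choices made for this example, not part of SPRING.

```python
def tm_score(distances, target_length):
    """TM-score for a single, fixed superposition.

    distances: C-alpha distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target protein.
    """
    # Length-dependent distance scale d0 from the standard TM-score definition.
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8 if target_length > 15 else 0.0
    if d0 <= 0:
        # Assumption for this sketch: reject targets too short for the formula.
        raise ValueError("target too short for the d0 formula used here")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# A perfect alignment covering the whole 100-residue target scores 1.0;
# a perfect alignment covering only half of it scores 0.5.
print(tm_score([0.0] * 100, 100))
print(tm_score([0.0] * 50, 100))
```

    Because the sum is normalised by the target length rather than the aligned length, a partial but accurate alignment is penalised, which is why a fixed threshold such as 0.5 is meaningful across targets of different sizes.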

  20. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental approaches to this problem have their limitations, and computational approaches have attracted more and more attention from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive the semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy to combine both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. An efficient heuristic method for active feature acquisition and its application to protein-protein interaction prediction

    Directory of Open Access Journals (Sweden)

    Thahir Mohamed

    2012-11-01

    Full Text Available Abstract Background Machine learning approaches for classification learn the pattern of the feature space of different classes, or learn a boundary that separates the feature space into different classes. The features of the data instances are usually available, and it is only the class-labels of the instances that are unavailable. For example, to classify text documents into different topic categories, the words in the documents are features and they are readily available, whereas the topic is what is predicted. However, in some domains obtaining features may be resource-intensive, so not all features may be available. An example is that of protein-protein interaction prediction, where not only are the labels ('interacting' or 'non-interacting') unavailable, but so are some of the features. It may be possible to obtain at least some of the missing features by carrying out a few experiments as permitted by the available resources. If only a few experiments can be carried out to acquire missing features, which proteins should be studied and which features of those proteins should be determined? From the perspective of machine learning for PPI prediction, it would be desirable to acquire those features that, when used in training, improve the accuracy of the classifier the most. That is, the utility of the feature-acquisition is measured in terms of how much acquired features contribute to improving the accuracy of the classifier. Active feature acquisition (AFA) is a strategy to preselect such instance-feature combinations (i.e. protein and experiment combinations) for maximum utility. The goal of AFA is the creation of an optimal training set that would result in the best classifier, and not in determining the best classification model itself. Results We present a heuristic method for active feature acquisition to calculate the utility of acquiring a missing feature. This heuristic takes into account the change in

  2. A discriminatory function for prediction of protein-DNA interactions based on alpha shape modeling.

    Science.gov (United States)

    Zhou, Weiqiang; Yan, Hong

    2010-10-15

    Protein-DNA interaction has significant importance in many biological processes. However, the underlying principle of the molecular recognition process is still largely unknown. As more high-resolution 3D structures of protein-DNA complexes are becoming available, the surface characteristics of the complex become an important research topic. In our work, we apply an alpha shape model to represent the surface structure of the protein-DNA complex and develop an interface-atom curvature-dependent conditional probability discriminatory function for the prediction of protein-DNA interaction. The interface-atom curvature-dependent formalism captures atomic interaction details better than the atomic distance-based method. The proposed method provides good performance in discriminating the native structures from the docking decoy sets, and outperforms the distance-dependent formalism in terms of the z-score. Computer experiment results show that the curvature-dependent formalism with the optimal parameters can achieve a native z-score of -8.17 in discriminating the native structure from the highest surface-complementarity scored decoy set and a native z-score of -7.38 in discriminating the native structure from the lowest RMSD decoy set. The interface-atom curvature-dependent formalism can also be used to predict the apo version of DNA-binding proteins. These results suggest that the interface-atom curvature-dependent formalism has a good prediction capability for protein-DNA interactions. The code and data sets are available for download at http://www.hy8.com/bioinformatics.htm. Contact: kenandzhou@hotmail.com.
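
    The native z-scores reported above measure how far the native structure's score lies from the decoy score distribution. A minimal sketch of that calculation follows; it assumes a score where lower is better and uses the population standard deviation of the decoys, and the abstract does not say which variance estimator the authors used. The function name and the decoy values are invented for illustration.

```python
from statistics import mean, pstdev

def native_z_score(native_score, decoy_scores):
    """Z-score of the native structure's score relative to a decoy set.

    Assumes lower scores are better, so a strongly negative z-score means
    the native structure is well separated from the decoys.
    """
    sigma = pstdev(decoy_scores)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("decoy scores have zero spread")
    return (native_score - mean(decoy_scores)) / sigma

# Hypothetical scores, not taken from the paper.
decoys = [-10.0, -12.0, -8.0, -11.0, -9.0]
print(native_z_score(-20.0, decoys))
```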

  3. Specificity of molecular interactions in transient protein-protein interaction interfaces.

    Science.gov (United States)

    Cho, Kyu-il; Lee, KiYoung; Lee, Kwang H; Kim, Dongsup; Lee, Doheon

    2006-11-15

    In this study, we investigate what types of interactions are specific to their biological function, and what types of interactions are persistent regardless of their functional category in transient protein-protein heterocomplexes. This is the first approach to analyze protein-protein interfaces systematically at the molecular interaction level in the context of protein functions. We perform systematic analysis at the molecular interaction level using classification and feature subset selection techniques prevalent in the field of pattern recognition. To represent the physicochemical properties of protein-protein interfaces, we design 18 molecular interaction types using canonical and noncanonical interactions. Then, we construct an input vector using the frequency of each interaction type in the protein-protein interface. We analyze the 131 interfaces of transient protein-protein heterocomplexes in PDB: 33 protease-inhibitors, 52 antibody-antigens, 46 signaling proteins including 4 cyclin-dependent kinases and 26 G-proteins. Using kNN classification and feature subset selection techniques, we show that there are specific interaction types based on their functional category, and such interaction types are conserved through the common binding mechanism, rather than through the sequence or structure conservation. The extracted interaction types are C(alpha)--H...O==C interaction, cation...anion interaction, amine...amine interaction, and amine...cation interaction. With these four interaction types, we achieve a classification success rate of up to 83.2% with leave-one-out cross-validation at k = 15. Of these four interaction types, C(alpha)--H...O==C shows binding specificity for protease-inhibitor complexes, while cation...anion interaction is predominant in signaling complexes. The amine...amine and amine...cation interactions give a minor contribution to the classification accuracy. When combined with these two interactions, they increase the accuracy by 3.8%. In the case of
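
    The evaluation scheme described in this record, kNN classification scored by leave-one-out cross-validation on interaction-frequency vectors, can be sketched as follows. This is a generic illustration under stated assumptions: Euclidean distance, majority voting, and a toy two-feature data set invented for the example. It does not reproduce the authors' 18 interaction types or their feature selection.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k):
    """Majority vote among the k training samples nearest to `query`."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(samples, k):
    """Hold out each sample in turn, classify it from the rest, report the hit rate."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(samples)

# Hypothetical frequency vectors (two interaction types per interface).
samples = [
    ((5.0, 1.0), "protease-inhibitor"), ((6.0, 1.0), "protease-inhibitor"),
    ((5.0, 2.0), "protease-inhibitor"), ((6.0, 2.0), "protease-inhibitor"),
    ((1.0, 6.0), "signaling"), ((1.0, 5.0), "signaling"),
    ((2.0, 6.0), "signaling"), ((2.0, 5.0), "signaling"),
]
print(leave_one_out_accuracy(samples, k=3))
```

    Leave-one-out is a natural choice for a data set of only 131 interfaces, since every sample is used for testing exactly once while the training set stays as large as possible.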

  4. Protein annotation from protein interaction networks and Gene Ontology.

    Science.gov (United States)

    Nguyen, Cao D; Gardiner, Katheleen J; Cios, Krzysztof J

    2011-10-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precision and 60% recall versus 45% and 26% for Majority and 24% and 61% for χ²-statistics, respectively. Copyright © 2011 Elsevier Inc. All rights reserved.
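
    To compare the three precision/recall pairs quoted above on a single scale, one option is the F1 score, the harmonic mean of precision and recall. The abstract itself does not report F1; the sketch below only derives it from the stated percentages, and the dictionary labels are shorthand chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as quoted in the abstract.
reported = {
    "proposed method": (0.51, 0.60),
    "Majority": (0.45, 0.26),
    "chi-square statistics": (0.24, 0.61),
}
for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```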

  5. Evolution of protein-protein interactions

    Indian Academy of Sciences (India)

    Evolution of protein-protein interactions · Our interests in protein-protein interactions

  6. Filtering high-throughput protein-protein interaction data using a combination of genomic features

    Directory of Open Access Journals (Sweden)

    Patil Ashwini

    2005-04-01

    Full Text Available Abstract Background Protein-protein interaction data used in the creation or prediction of molecular networks is usually obtained from large-scale or high-throughput experiments. This experimental data is liable to contain a large number of spurious interactions. Hence, there is a need to validate the interactions and filter out the incorrect data before using them in prediction studies. Results In this study, we use a combination of 3 genomic features (structurally known interacting Pfam domains, Gene Ontology annotations and sequence homology) as a means to assign reliability to the protein-protein interactions in Saccharomyces cerevisiae determined by high-throughput experiments. Using Bayesian network approaches, we show that protein-protein interactions from high-throughput data supported by one or more genomic features have a higher likelihood ratio and hence are more likely to be real interactions. Our method has a high sensitivity (90%) and good specificity (63%). We show that 56% of the interactions from high-throughput experiments in Saccharomyces cerevisiae have high reliability. We use the method to estimate the number of true interactions in the high-throughput protein-protein interaction data sets in Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens to be 27%, 18% and 68% respectively. Our results are available for searching and downloading at http://helix.protein.osaka-u.ac.jp/htp/. Conclusion A combination of genomic features that include sequence, structure and annotation information is a good predictor of true interactions in large and noisy high-throughput data sets. The method has a very high sensitivity and good specificity and can be used to assign a likelihood ratio, corresponding to the reliability, to each interaction.
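
    The way a likelihood ratio translates into reliability can be sketched with a naive Bayes combination, in which per-feature likelihood ratios are multiplied and applied to the prior odds. This is an illustrative simplification: the independence assumption, the function names, and every numeric value below are hypothetical and are not taken from the paper's Bayesian network.

```python
from math import prod

def combined_likelihood_ratio(feature_lrs):
    """Multiply per-feature likelihood ratios (assumes independent features)."""
    return prod(feature_lrs)

def posterior_probability(prior, likelihood_ratio):
    """Turn a prior probability and a likelihood ratio into a posterior probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical likelihood ratios for domain, GO and homology evidence.
lr = combined_likelihood_ratio([4.0, 2.5, 1.0])
print(lr)
print(posterior_probability(0.1, lr))
```

    A feature with a likelihood ratio of 1.0 carries no information, so an interaction supported by none of the features keeps its prior odds unchanged.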

  7. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications.

    Science.gov (United States)

    Kobayashi, Hiroki; Harada, Hiroko; Nakamura, Masaomi; Futamura, Yushi; Ito, Akihiro; Yoshida, Minoru; Iemura, Shun-Ichiro; Shin-Ya, Kazuo; Doi, Takayuki; Takahashi, Takashi; Natsume, Tohru; Imoto, Masaya; Sakakibara, Yasubumi

    2012-04-05

    Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  8. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications

    Directory of Open Access Journals (Sweden)

    Kobayashi Hiroki

    2012-04-01

    Full Text Available Abstract Background Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. Results We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. Conclusions This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  9. Coevolution of interacting fertilization proteins.

    Directory of Open Access Journals (Sweden)

    Nathaniel L Clark

    2009-07-01

    Full Text Available Reproductive proteins are among the fastest evolving in the proteome, often due to the consequences of positive selection, and their rapid evolution is frequently attributed to a coevolutionary process between interacting female and male proteins. Such a process could leave characteristic signatures at coevolving genes. One signature of coevolution, predicted by sexual selection theory, is an association of alleles between the two genes. Another predicted signature is a correlation of evolutionary rates during divergence due to compensatory evolution. We studied female-male coevolution in the abalone by resequencing sperm lysin and its interacting egg coat protein, VERL, in populations of two species. As predicted, we found intergenic linkage disequilibrium between lysin and VERL, despite our demonstration that they are not physically linked. This finding supports a central prediction of sexual selection using actual genotypes, that of an association between a male trait and its female preference locus. We also created a novel likelihood method to show that lysin and VERL have experienced correlated rates of evolution. These two signatures of coevolution can provide statistical rigor to hypotheses of coevolution and could be exploited for identifying coevolving proteins a priori. We also present polymorphism-based evidence for positive selection and implicate recent selective events at the specific structural regions of lysin and VERL responsible for their species-specific interaction. Finally, we observed deep subdivision between VERL alleles in one species, which matches a theoretical prediction of sexual conflict. Thus, abalone fertilization proteins illustrate how coevolution can lead to reproductive barriers and potentially drive speciation.

  10. Protein function prediction involved on radio-resistant bacteria

    International Nuclear Information System (INIS)

    Mezhoud, Karim; Mankai, Houda; Sghaier, Haitham; Barkallah, Insaf

    2009-01-01

    Previously, we identified 58 proteins under positive selection in ionizing-radiation-resistant bacteria (IRRB) but absent in all ionizing-radiation-sensitive bacteria (IRSB). There are good reasons to believe that these 58 proteins, together with their interactions with other proteins (interactomes), are part of the answer to the question of how IRRB resist radiation, because knowledge of the interactomes of positively selected orphan proteins in IRRB might allow us to define cellular pathways important to ionizing-radiation resistance. Using the Database of Interacting Proteins and the PSIbase, we have predicted interactions of orthologs of the 58 proteins under positive selection in IRRB but absent in all IRSB. We integrated experimental data sets with molecular interaction networks and protein structure predictions from databases. Among these, 18 proteins with their interactomes were identified in Deinococcus radiodurans R1. DNA checkpoint and repair, kinase pathways, and energy and nucleotide metabolism were the important biological processes found. We predicted the interactomes of 58 proteins under positive selection in IRRB. It is hoped that our data will provide new clues as to the cellular pathways that are important for ionizing-radiation resistance. We have identified new proteins involved in DNA management that were not previously mentioned. These are an important addition to the proteins already studied. Further work is still needed to deepen our study of these new proteins.

  11. Proteins interacting with cloning scars: a source of false positive protein-protein interactions.

    Science.gov (United States)

    Banks, Charles A S; Boanca, Gina; Lee, Zachary T; Florens, Laurence; Washburn, Michael P

    2015-02-23

    A common approach for exploring the interactome, the network of protein-protein interactions in cells, uses a commercially available ORF library to express affinity tagged bait proteins; these can be expressed in cells and endogenous cellular proteins that copurify with the bait can be identified as putative interacting proteins using mass spectrometry. Control experiments can be used to limit false-positive results, but in many cases, there are still a surprising number of prey proteins that appear to copurify specifically with the bait. Here, we have identified one source of false-positive interactions in such studies. We have found that a combination of: 1) the variable sequence of the C-terminus of the bait with 2) a C-terminal valine "cloning scar" present in a commercially available ORF library, can in some cases create a peptide motif that results in the aberrant co-purification of endogenous cellular proteins. Control experiments may not identify false positives resulting from such artificial motifs, as aberrant binding depends on sequences that vary from one bait to another. It is possible that such cryptic protein binding might occur in other systems using affinity tagged proteins; this study highlights the importance of conducting careful follow-up studies where novel protein-protein interactions are suspected.

  12. The role of electrostatics in protein-protein interactions of a monoclonal antibody.

    Science.gov (United States)

    Roberts, D; Keeling, R; Tracka, M; van der Walle, C F; Uddin, S; Warwicker, J; Curtis, R

    2014-07-07

    Understanding how protein-protein interactions depend on the choice of buffer, salt, ionic strength, and pH is needed to have better control over protein solution behavior. Here, we have characterized the pH and ionic strength dependence of protein-protein interactions in terms of an interaction parameter kD obtained from dynamic light scattering and the osmotic second virial coefficient B22 measured by static light scattering. A simplified protein-protein interaction model based on a Baxter adhesive potential and an electric double layer force is used to separate out the contributions of longer-ranged electrostatic interactions from short-ranged attractive forces. The ionic strength dependence of protein-protein interactions for solutions at pH 6.5 and below can be accurately captured using a Deryaguin-Landau-Verwey-Overbeek (DLVO) potential to describe the double layer forces. In solutions at pH 9, attractive electrostatics occur over the ionic strength range of 5-275 mM. At intermediate pH values (7.25 to 8.5), there is a crossover effect characterized by a nonmonotonic ionic strength dependence of protein-protein interactions, which can be rationalized by the competing effects of long-ranged repulsive double layer forces at low ionic strength and a shorter ranged electrostatic attraction, which dominates above a critical ionic strength. The change of interactions from repulsive to attractive indicates a concomitant change in the angular dependence of protein-protein interaction from isotropic to anisotropic. In the second part of the paper, we show how the Baxter adhesive potential can be used to predict values of kD from fitting to B22 measurements, thus providing a molecular basis for the linear correlation between the two protein-protein interaction parameters.

  13. Deciphering peculiar protein-protein interacting modules in Deinococcus radiodurans

    Directory of Open Access Journals (Sweden)

    Barkallah Insaf

    2009-04-01

    Full Text Available Abstract Interactomes of proteins under positive selection from ionizing-radiation-resistant bacteria (IRRB) might be a part of the answer to the question as to how IRRB, particularly Deinococcus radiodurans R1 (Deira), resist ionizing radiation. Here, using the Database of Interacting Proteins (DIP) and the Protein Structural Interactome (PSI-base) server for PSI map, we have predicted novel interactions of orthologs of the 58 proteins under positive selection in Deira and other IRRB, but which are absent in IRSB. Among these, 18 domains and their interactomes have been identified; DNA checkpoint and repair, kinase pathways, and energy and nucleotide metabolism were the important biological processes found to be involved. This finding provides new clues to the cellular pathways that can be important for ionizing-radiation resistance in Deira.

  14. Probability weighted ensemble transfer learning for predicting interactions between HIV-1 and human proteins.

    Directory of Open Access Journals (Sweden)

    Suyu Mei

    Full Text Available Reconstruction of host-pathogen protein interaction networks is of great significance for revealing the underlying microbial pathogenesis. However, the current experimentally-derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for host-pathogen protein interaction network reconstruction. In this work, we are motivated to address the three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where a support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that the homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability, with less demanding data constraints. As regards negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Last, we conduct an analysis of overlapping predictions between our model and the existing models, and apply the model to the recognition of novel host-pathogen PPIs for further biological research.
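
    The probability-weighted ensemble step can be illustrated with a short sketch in which each individual classifier's output probability is weighted by its ROC-AUC. The abstract does not give the exact weighting formula used by PWEN-TLM, so the normalised weighted average below is an assumption, as are the function name and the example values.

```python
def weighted_ensemble_probability(probabilities, aucs):
    """Combine classifier probabilities, weighting each by its ROC-AUC.

    probabilities: interaction probability output by each individual classifier.
    aucs: ROC-AUC of each classifier, used here as its (assumed) importance weight.
    """
    if len(probabilities) != len(aucs) or not aucs:
        raise ValueError("need one AUC per classifier probability")
    total = sum(aucs)
    return sum(p * a for p, a in zip(probabilities, aucs)) / total

# Hypothetical outputs of three homolog-trained classifiers for one protein pair.
score = weighted_ensemble_probability([0.9, 0.4, 0.7], [0.8, 0.6, 0.6])
print(score, "interacting" if score >= 0.5 else "non-interacting")
```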

  15. An automated decision-tree approach to predicting protein interaction hot spots.

    Science.gov (United States)

    Darnell, Steven J; Page, David; Mitchell, Julie C

    2007-09-01

    Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface. 2007 Wiley-Liss, Inc.

  16. A rice kinase-protein interaction map.

    Science.gov (United States)

    Ding, Xiaodong; Richter, Todd; Chen, Mei; Fujii, Hiroaki; Seo, Young Su; Xie, Mingtang; Zheng, Xianwu; Kanrar, Siddhartha; Stevenson, Rebecca A; Dardick, Christopher; Li, Ying; Jiang, Hao; Zhang, Yan; Yu, Fahong; Bartley, Laura E; Chern, Mawsheng; Bart, Rebecca; Chen, Xiuhua; Zhu, Lihuang; Farmerie, William G; Gribskov, Michael; Zhu, Jian-Kang; Fromm, Michael E; Ronald, Pamela C; Song, Wen-Yuan

    2009-03-01

    Plants uniquely contain large numbers of protein kinases, and for the vast majority of the 1,429 kinases predicted in the rice (Oryza sativa) genome, little is known of their functions. Genetic approaches often fail to produce observable phenotypes; thus, new strategies are needed to delineate kinase function. We previously developed a cost-effective high-throughput yeast two-hybrid system. Using this system, we have generated a protein interaction map of 116 representative rice kinases and 254 of their interacting proteins. Overall, the resulting interaction map supports a large number of known or predicted kinase-protein interactions from both plants and animals and reveals many new functional insights. Notably, we found a potential widespread role for E3 ubiquitin ligases in pathogen defense signaling mediated by receptor-like kinases, particularly by the kinases that may have evolved from recently expanded kinase subfamilies in rice. We anticipate that the data provided here will serve as a foundation for targeted functional studies in rice and other plants. The application of yeast two-hybrid and TAPtag analyses for large-scale plant protein interaction studies is also discussed.

  17. Protein Structure Prediction by Protein Threading

    Science.gov (United States)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  18. Identification of Essential Proteins Based on a New Combination of Local Interaction Density and Protein Complexes.

    Directory of Open Access Journals (Sweden)

    Jiawei Luo

    Full Text Available Computational approaches aided by computer science have been used to predict essential proteins and are faster than expensive, time-consuming, laborious experimental approaches. However, the performance of such approaches is still poor, making practical applications of computational approaches difficult in some fields. Hence, the development of more suitable and efficient computing methods is necessary for identification of essential proteins. In this paper, we propose a new method for predicting essential proteins in a protein interaction network, local interaction density combined with protein complexes (LIDC), based on statistical analyses of essential proteins and protein complexes. First, we introduce a new local topological centrality, local interaction density (LID), of the yeast PPI network; second, we discuss a new integration strategy for multiple bioinformatics. The LIDC method was then developed through a combination of LID and protein complex information based on our new integration strategy. The purpose of LIDC is discovery of important features of essential proteins with their neighbors in real protein complexes, thereby improving the efficiency of identification. Experimental results based on three different PPI (protein-protein interaction) networks of Saccharomyces cerevisiae and Escherichia coli showed that LIDC outperformed classical topological centrality measures and some recent combinational methods. Moreover, when predicting MIPS datasets, LIDC improved performance over all nine reference methods (i.e., DC, BC, NC, LID, PeC, CoEWC, WDC, ION, and UC). LIDC is more effective for the prediction of essential proteins than other recently developed methods.

  19. Improving protein function prediction methods with integrated literature data

    Directory of Open Access Journals (Sweden)

    Gabow Aaron P

    2008-04-01

    Full Text Available Abstract Background Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. Results We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably-sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes which global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10% which yield the best trade off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder

  20. Our interests in protein-protein interactions

    Indian Academy of Sciences (India)

    protein interactions. Evolution of P-P partnerships. Evolution of P-P structures. Evolutionary dynamics of P-P interactions. Dynamics of P-P interaction network. Host-pathogen interactions. CryoEM mapping of gigantic protein assemblies.

  1. Sequence motifs in MADS transcription factors responsible for specificity and diversification of protein-protein interaction.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Full Text Available Protein sequences encompass tertiary structures and contain information about specific molecular interactions, which in turn determine biological functions of proteins. Knowledge about how protein sequences define interaction specificity is largely missing, in particular for paralogous protein families with high sequence similarity, such as the plant MADS domain transcription factor family. In comparison to the situation in mammalian species, this important family of transcription regulators has expanded enormously in plant species and contains over 100 members in the model plant species Arabidopsis thaliana. Here, we provide insight into the mechanisms that determine protein-protein interaction specificity for the Arabidopsis MADS domain transcription factor family, using an integrated computational and experimental approach. Plant MADS proteins have highly similar amino acid sequences, but their dimerization patterns vary substantially. Our computational analysis uncovered small sequence regions that explain observed differences in dimerization patterns with reasonable accuracy. Furthermore, we show the usefulness of the method for prediction of MADS domain transcription factor interaction networks in other plant species. Introduction of mutations in the predicted interaction motifs demonstrated that single amino acid mutations can have a large effect and lead to loss or gain of specific interactions. In addition, various performed bioinformatics analyses shed light on the way evolution has shaped MADS domain transcription factor interaction specificity. Identified protein-protein interaction motifs appeared to be strongly conserved among orthologs, indicating their evolutionary importance. We also provide evidence that mutations in these motifs can be a source for sub- or neo-functionalization. The analyses presented here take us a step forward in understanding protein-protein interactions and the interplay between protein sequences and

  2. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat

    2017-09-27

    Motivation A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. Results We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein–protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations.

  3. Blood profile of proteins and steroid hormones predicts weight change after weight loss with interactions of dietary protein level and glycemic index.

    Directory of Open Access Journals (Sweden)

    Ping Wang

    2011-02-01

    Full Text Available Weight regain after weight loss is common. In the Diogenes dietary intervention study, high protein and low glycemic index (GI) diet improved weight maintenance. To identify blood predictors for weight change after weight loss following the dietary intervention within the Diogenes study. Blood samples were collected at baseline and after 8-week low caloric diet-induced weight loss from 48 women who continued to lose weight and 48 women who regained weight during subsequent 6-month dietary intervention period with 4 diets varying in protein and GI levels. Thirty-one proteins and 3 steroid hormones were measured. Angiotensin I converting enzyme (ACE) was the most important predictor. Its greater reduction during the 8-week weight loss was related to continued weight loss during the subsequent 6 months, identified by both Logistic Regression and Random Forests analyses. The prediction power of ACE was influenced by immunoproteins, particularly fibrinogen. Leptin, luteinizing hormone and some immunoproteins showed interactions with dietary protein level, while interleukin 8 showed interaction with GI level on the prediction of weight maintenance. A predictor panel of 15 variables enabled an optimal classification by Random Forests with an error rate of 24±1%. A logistic regression model with independent variables from 9 blood analytes had a prediction accuracy of 92%. A selected panel of blood proteins/steroids can predict the weight change after weight loss. ACE may play an important role in weight maintenance. The interactions of blood factors with dietary components are important for personalized dietary advice after weight loss. ClinicalTrials.gov NCT00390637.

  4. Protein-protein interaction detection system using fluorescent protein microdomains

    Science.gov (United States)

    Waldo, Geoffrey S.; Cabantous, Stephanie

    2010-02-23

    The invention provides a protein labeling and interaction detection system based on engineered fragments of fluorescent and chromophoric proteins that require fused interacting polypeptides to drive the association of the fragments, and further are soluble and stable, and do not change the solubility of polypeptides to which they are fused. In one embodiment, a test protein X is fused to a sixteen amino acid fragment of GFP (β-strand 10, amino acids 198-214), engineered to not perturb fusion protein solubility. A second test protein Y is fused to a sixteen amino acid fragment of GFP (β-strand 11, amino acids 215-230), engineered to not perturb fusion protein solubility. When X and Y interact, they bring the GFP strands into proximity, and are detected by complementation with a third GFP fragment consisting of GFP amino acids 1-198 (strands 1-9). When GFP strands 10 and 11 are held together by interaction of protein X and Y, they spontaneously associate with GFP strands 1-9, resulting in structural complementation, folding, and concomitant GFP fluorescence.

  5. Protein-protein interactions: an application of Tus-Ter mediated protein microarray system.

    Science.gov (United States)

    Sitaraman, Kalavathy; Chatterjee, Deb K

    2011-01-01

    In this chapter, we present a novel, cost-effective microarray strategy that utilizes expression-ready plasmid DNAs to generate protein arrays on-demand and its use to validate protein-protein interactions. These expression plasmids were constructed in such a way so as to serve a dual purpose of synthesizing the protein of interest as well as capturing the synthesized protein. The microarray system is based on the high affinity binding of Escherichia coli "Tus" protein to "Ter," a 20 bp DNA sequence involved in the regulation of DNA replication. The protein expression is carried out in a cell-free protein synthesis system, with rabbit reticulocyte lysates, and the target proteins are detected either by labeled incorporated tag specific or by gene-specific antibodies. This microarray system has been successfully used for the detection of protein-protein interaction because both the target protein and the query protein can be transcribed and translated simultaneously in the microarray slides. The utility of this system for detecting protein-protein interaction is demonstrated by a few well-known examples: Jun/Fos, FRB/FKBP12, p53/MDM2, and CDK4/p16. In all these cases, the presence of protein complexes resulted in the localization of fluorophores at the specific sites of the immobilized target plasmids. Interestingly, during our interactions studies we also detected a previously unknown interaction between CDK2 and p16. Thus, this Tus-Ter based system of protein microarray can be used for the validation of known protein interactions as well as for identifying new protein-protein interactions. In addition, it can be used to examine and identify targets of nucleic acid-protein, ligand-receptor, enzyme-substrate, and drug-protein interactions.

  6. A coevolution analysis for identifying protein-protein interactions by Fourier transform

    Science.gov (United States)

    Yin, Changchuan; Yau, Stephen S. -T.

    2017-01-01

    Protein-protein interactions (PPIs) play key roles in life processes, such as signal transduction, transcription regulations, and immune response, etc. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacted proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce high false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E.coli genomes. Since our method can reduce false positive and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are available to public at URL (https://github.com/cyinbox/PPI). PMID:28430779
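
    The alignment-free idea in the abstract above (encode each residue numerically, then apply a Fourier transform and compare sequences in the frequency domain) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: the Kyte-Doolittle hydropathy scale as the per-residue property, zero-padding to a common length, and the Euclidean distance between power spectra are all choices made here for the example, and the function names are hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values (assumed encoding; the abstract only says
# "biochemical properties of amino acids" without naming a scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a protein sequence to a list of per-residue property values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def power_spectrum(signal):
    """Naive discrete Fourier transform; returns |X[k]|^2 for every k."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_distance(seq_a, seq_b):
    """Euclidean distance between the power spectra of two sequences.

    Both signals are zero-padded to the longer length (an assumption of
    this sketch) so that the spectra have the same number of bins.
    """
    n = max(len(seq_a), len(seq_b))
    a = encode(seq_a) + [0.0] * (n - len(seq_a))
    b = encode(seq_b) + [0.0] * (n - len(seq_b))
    pa, pb = power_spectrum(a), power_spectrum(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
```

    For sequences of realistic length, a fast Fourier transform routine would replace the quadratic-time loop; the naive form is kept here only so the sketch has no dependencies.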

  7. A coevolution analysis for identifying protein-protein interactions by Fourier transform.

    Directory of Open Access Journals (Sweden)

    Changchuan Yin

    Full Text Available Protein-protein interactions (PPIs) play key roles in life processes, such as signal transduction, transcription regulations, and immune response, etc. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacted proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce high false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E.coli genomes. Since our method can reduce false positive and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are available to public at URL (https://github.com/cyinbox/PPI).

  8. Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms

    Science.gov (United States)

    Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei

    2016-01-01

    Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851

  9. Aquaporin Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Jennifer Virginia Roche

    2017-10-01

    Full Text Available Aquaporins are tetrameric membrane-bound channels that facilitate transport of water and other small solutes across cell membranes. In eukaryotes, they are frequently regulated by gating or trafficking, allowing for the cell to control membrane permeability in a specific manner. Protein–protein interactions play crucial roles in both regulatory processes and also mediate alternative functions such as cell adhesion. In this review, we summarize recent knowledge about aquaporin protein–protein interactions, dividing the interactions into three types: (1) interactions between aquaporin tetramers; (2) interactions between aquaporin monomers within a tetramer (hetero-tetramerization); and (3) transient interactions with regulatory proteins. We particularly focus on the structural aspects of the interactions, discussing the small differences within a conserved overall fold that allow for aquaporins to be differentially regulated in an organism-, tissue- and trigger-specific manner. A deep knowledge about these differences is needed to fully understand aquaporin function and regulation in many physiological processes, and may enable design of compounds targeting specific aquaporins for treatment of human disease.

  10. Toxicological relationships between proteins obtained from protein target predictions of large toxicity databases

    International Nuclear Information System (INIS)

    Nigsch, Florian; Mitchell, John B.O.

    2008-01-01

    The combination of models for protein target prediction with large databases containing toxicological information for individual molecules allows the derivation of 'toxicological' profiles, i.e., to what extent molecules of known toxicity are predicted to interact with a set of protein targets. To predict protein targets of drug-like and toxic molecules, we built a computational multiclass model using the Winnow algorithm based on a dataset of protein targets derived from the MDL Drug Data Report. A 15-fold Monte Carlo cross-validation using 50% of each class for training, and the remaining 50% for testing, provided an assessment of the accuracy of that model. We retained the 3 top-ranking predictions and found that in 82% of all cases the correct target was predicted within these three predictions. The first prediction was the correct one in almost 70% of cases. A model built on the whole protein target dataset was then used to predict the protein targets for 150 000 molecules from the MDL Toxicity Database. We analysed the frequency of the predictions across the panel of protein targets for experimentally determined toxicity classes of all molecules. This allowed us to identify clusters of proteins related by their toxicological profiles, as well as toxicities that are related. Literature-based evidence is provided for some specific clusters to show the relevance of the relationships identified.
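
    The evaluation protocol described above (repeated random splits that take 50% of each class for training, scored by whether the true target appears among the top-ranked predictions) can be sketched as follows. This is a minimal illustration of the splitting and scoring steps only; it does not implement the Winnow classifier, and the function names and toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, rng):
    """Split indices so that half of each class (rounded down) goes to training.

    `labels` is a list of class labels; `rng` is a random.Random instance.
    Returns (train_indices, test_indices), each sorted.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train.extend(shuffled[:half])
        test.extend(shuffled[half:])
    return sorted(train), sorted(test)


def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the first k ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy usage with made-up target labels: one Monte Carlo round would draw a
# fresh split, train on `train_idx`, rank targets for `test_idx`, and score.
rng = random.Random(0)
train_idx, test_idx = stratified_half_split(["kinase"] * 4 + ["gpcr"] * 6, rng)
```

    Repeating the split with fresh random draws (15 times in the abstract) and averaging the top-1 and top-3 scores gives the kind of figures reported there.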

  11. Analysis of intraviral protein-protein interactions of the SARS coronavirus ORFeome.

    Directory of Open Access Journals (Sweden)

    Albrecht von Brunn

    2007-05-01

    Full Text Available The severe acute respiratory syndrome coronavirus (SARS-CoV) genome is predicted to encode 14 functional open reading frames, leading to the expression of up to 30 structural and non-structural protein products. The functions of a large number of viral ORFs are poorly understood or unknown. In order to gain more insight into functions and modes of action and interaction of the different proteins, we cloned the viral ORFeome and performed a genome-wide analysis for intraviral protein interactions and for intracellular localization. 900 pairwise interactions were tested by yeast-two-hybrid matrix analysis, and more than 65 positive non-redundant interactions, including six self interactions, were identified. About 38% of interactions were subsequently confirmed by CoIP in mammalian cells. Nsp2, nsp8 and ORF9b showed a wide range of interactions with other viral proteins. Nsp8 interacts with replicase proteins nsp2, nsp5, nsp6, nsp7, nsp8, nsp9, nsp12, nsp13 and nsp14, indicating a crucial role as a major player within the replication complex machinery. It was shown by others that nsp8 is essential for viral replication in vitro, whereas nsp2 is not. We show that also accessory protein ORF9b does not play a pivotal role for viral replication, as it can be deleted from the virus displaying normal plaque sizes and growth characteristics in Vero cells. However, it can be expected to be important for the virus-host interplay and for pathogenicity, due to its large number of interactions, by enhancing the global stability of the SARS proteome network, or play some unrealized role in regulating protein-protein interactions. The interactions identified provide valuable material for future studies.

  12. Predicting co-complexed protein pairs using genomic and proteomic data integration

    Directory of Open Access Journals (Sweden)

    King Oliver D

    2004-04-01

    Full Text Available Abstract Background Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (APMS) have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative of physical interaction. Therefore it is desirable to integrate multiple datasets and utilize their different predictive value for more accurate prediction of co-complexed relationship. Results Using a supervised machine learning approach – probabilistic decision tree, we integrated high-throughput protein interaction datasets and other gene- and protein-pair characteristics to predict co-complexed pairs (CCP) of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue), a significant fraction was found to physically interact according to a separate database (YPD, Yeast Proteome Database), and the remaining predictions may potentially represent unknown CCPs. Conclusions We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP) pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.

  13. Detecting mutually exclusive interactions in protein-protein interaction maps.

    KAUST Repository

    Sánchez Claros, Carmen

    2012-06-08

    Comprehensive protein interaction maps can complement genetic and biochemical experiments and allow the formulation of new hypotheses to be tested in the system of interest. The computational analysis of the maps may help to focus on interesting cases and thereby to appropriately prioritize the validation experiments. We show here that, by automatically comparing and analyzing structurally similar regions of proteins of known structure interacting with a common partner, it is possible to identify mutually exclusive interactions present in the maps with a sensitivity of 70% and a specificity higher than 85% and that, in about three fourths of the correctly identified complexes, we also correctly recognize at least one residue (five on average) belonging to the interaction interface. Given the present and continuously increasing number of proteins of known structure, the requirement of the knowledge of the structure of the interacting proteins does not substantially impact the coverage of our strategy, which can be estimated to be around 25%. We also introduce here the Estrella server, which embodies this strategy, is designed for users interested in validating specific hypotheses about the functional role of a protein-protein interaction, and also allows access to pre-computed data for seven organisms.
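
    For readers less familiar with the two figures quoted above, sensitivity and specificity follow directly from the counts of a confusion matrix. The sketch below uses the standard definitions; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true mutually exclusive
                  interactions that are detected.
    specificity = TN / (TN + FP): share of other interactions that are
                  correctly left unflagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec = sensitivity_specificity(tp=70, fn=30, tn=85, fp=15)
```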

  14. Detecting mutually exclusive interactions in protein-protein interaction maps.

    KAUST Repository

    Sánchez Claros, Carmen; Tramontano, Anna

    2012-01-01

    Comprehensive protein interaction maps can complement genetic and biochemical experiments and allow the formulation of new hypotheses to be tested in the system of interest. The computational analysis of the maps may help to focus on interesting cases and thereby to appropriately prioritize the validation experiments. We show here that, by automatically comparing and analyzing structurally similar regions of proteins of known structure interacting with a common partner, it is possible to identify mutually exclusive interactions present in the maps with a sensitivity of 70% and a specificity higher than 85% and that, in about three fourths of the correctly identified complexes, we also correctly recognize at least one residue (five on average) belonging to the interaction interface. Given the present and continuously increasing number of proteins of known structure, the requirement of the knowledge of the structure of the interacting proteins does not substantially impact the coverage of our strategy, which can be estimated to be around 25%. We also introduce here the Estrella server, which embodies this strategy, is designed for users interested in validating specific hypotheses about the functional role of a protein-protein interaction, and also allows access to pre-computed data for seven organisms.

  15. Topology-function conservation in protein-protein interaction networks.

    Science.gov (United States)

    Davis, Darren; Yaveroğlu, Ömer Nebil; Malod-Dognin, Noël; Stojmirovic, Aleksandar; Pržulj, Nataša

    2015-05-15

    Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms. © The Author 2015. Published by Oxford University Press.

  16. SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803

    OpenAIRE

    Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

    2008-01-01

    Background Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. Description We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactio...

  17. Non-interacting surface solvation and dynamics in protein-protein interactions

    NARCIS (Netherlands)

    Visscher, Koen M.; Kastritis, Panagiotis L.; Bonvin, Alexandre M J J

    2015-01-01

    Protein-protein interactions control a plethora of cellular processes, including cell proliferation, differentiation, apoptosis, and signal transduction. Understanding how and why proteins interact will inevitably lead to novel structure-based drug design methods, as well as design of de novo

  18. Prediction of protein interaction hot spots using rough set-based multiple criteria linear programming.

    Science.gov (United States)

    Chen, Ruoying; Zhang, Zhiwang; Wu, Di; Zhang, Peng; Zhang, Xinyang; Wang, Yong; Shi, Yong

    2011-01-21

    Protein-protein interactions are fundamentally important in many biological processes, and there is a pressing need to understand the principles of protein-protein interactions. Mutagenesis studies have found that only a small fraction of surface residues, known as hot spots, are responsible for the physical binding in protein complexes. However, revealing hot spots by mutagenesis experiments is usually time consuming and expensive. In order to complement the experimental efforts, we propose a new computational approach in this paper to predict hot spots. Our method, Rough Set-based Multiple Criteria Linear Programming (RS-MCLP), integrates rough sets theory and multiple criteria linear programming to choose dominant features and computationally predict hot spots. Our approach is benchmarked by a dataset of 904 alanine-mutated residues and the results show that our RS-MCLP method performs better than other methods, e.g., MCLP, Decision Tree, Bayes Net, and the existing HotSprint database. In addition, we reveal several biological insights based on our analysis. We find that four features (the change of accessible surface area, percentage of the change of accessible surface area, size of a residue, and atomic contacts) are critical in predicting hot spots. Furthermore, we find that three residues (Tyr, Trp, and Phe) are abundant in hot spots through analyzing the distribution of amino acids. Copyright © 2010 Elsevier Ltd. All rights reserved.

  19. KFC Server: interactive forecasting of protein interaction hot spots.

    Science.gov (United States)

    Darnell, Steven J; LeGault, Laura; Mitchell, Julie C

    2008-07-01

    The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model, a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user-submitted protein-protein or protein-DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org.

  20. Mapping Protein-Protein Interactions by Quantitative Proteomics

    DEFF Research Database (Denmark)

    Dengjel, Joern; Kratchmarova, Irina; Blagoev, Blagoy

    2010-01-01

    Proteins exert their function inside a cell generally in multiprotein complexes. These complexes are highly dynamic structures changing their composition over time and cell state. The same protein may thereby fulfill different functions depending on its binding partners. Quantitative mass spectrometry (MS)-based proteomics in combination with affinity purification protocols has become the method of choice to map and track the dynamic changes in protein-protein interactions, including the ones occurring during cellular signaling events. Different quantitative MS strategies have been used to characterize protein interaction networks. In this chapter we describe in detail the use of stable isotope labeling by amino acids in cell culture (SILAC) for the quantitative analysis of stimulus-dependent dynamic protein interactions.

  1. PREFACE: Physics approaches to protein interactions and gene regulation

    Science.gov (United States)

    Nussinov, Ruth; Panchenko, Anna R.; Przytycka, Teresa

    2011-06-01

    Physics approaches focus on uncovering, modeling and quantitating the general principles governing the micro and macro universe. This has always been an important component of biological research; however, recent advances in experimental techniques and the accumulation of unprecedented genome-scale experimental data produced by these novel technologies now allow fundamental questions to be addressed on a large scale. These relate to molecular interactions, principles of bimolecular recognition, and mechanisms of signal propagation. The functioning of a cell requires a variety of intermolecular interactions including protein-protein, protein-DNA, protein-RNA, hormones, peptides, small molecules, lipids and more. Biomolecules work together to provide specific functions, and perturbations in intermolecular communication channels often lead to cellular malfunction and disease. A full understanding of the interactome requires an in-depth grasp of the biophysical principles underlying individual interactions as well as their organization in cellular networks. Phenomena can be described at different levels of abstraction. Computational and systems biology strive to model cellular processes by integrating and analyzing complex data from multiple experimental sources using interdisciplinary tools. As a result, both the causal relationships between the variables and the general features of the system can be discovered, which, even without knowing the details of the underlying mechanisms, allow for putting forth hypotheses and predicting the behavior of the systems in response to perturbation. Here lies the strength of in silico models, which provide control and predictive power. At the same time, the complexity of individual elements and molecules can be addressed by the fields of molecular biophysics, physical biology and structural biology, which focus on the underlying physico-chemical principles and may explain the molecular mechanisms of cellular function. 
In this issue

  2. A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information.

    Science.gov (United States)

    Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Li, Xiao; Jiang, Tong-Hai; Li, Li-Ping

    2018-06-01

    The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and their biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound with specific ncRNA, but they are usually expensive and time consuming. Deep learning provides a powerful solution to computationally predict RNA-protein interactions. In this work, we propose the RPI-SAN model by using the deep-learning stacked auto-encoder network to mine the hidden high-level features from RNA and protein sequences and feed them into a random forest (RF) model to predict ncRNA binding proteins. Stacked assembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, including RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five established prediction tools: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33%, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research and can accurately predict potential ncRNA-protein interacting pairs, which provides reliable guidance for biological research. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  3. C2 Domains as Protein-Protein Interaction Modules in the Ciliary Transition Zone

    Directory of Open Access Journals (Sweden)

    Kim Remans

    2014-07-01

    Full Text Available RPGR-interacting protein 1 (RPGRIP1) is mutated in the eye disease Leber congenital amaurosis (LCA) and its structural homolog, RPGRIP1-like (RPGRIP1L), is mutated in many different ciliopathies. Both are multidomain proteins that are predicted to interact with retinitis pigmentosa G-protein regulator (RPGR). RPGR is mutated in X-linked retinitis pigmentosa and is located in photoreceptors and primary cilia. We solved the crystal structure of the complex between the RPGR-interacting domain (RID) of RPGRIP1 and RPGR and demonstrate that RPGRIP1L binds to RPGR similarly. RPGRIP1 binding to RPGR affects the interaction with PDEδ, the cargo shuttling factor for prenylated ciliary proteins. RPGRIP1-RID is a C2 domain with a canonical β sandwich structure that does not bind Ca2+ and/or phospholipids and thus constitutes a unique type of protein-protein interaction module. Judging from the large number of C2 domains in most of the ciliary transition zone proteins identified thus far, the structure presented here seems to constitute a cilia-specific module that is present in multiprotein transition zone complexes.

  4. Coevolution study of mitochondria respiratory chain proteins: toward the understanding of protein-protein interaction.

    Science.gov (United States)

    Yang, Ming; Ge, Yan; Wu, Jiayan; Xiao, Jingfa; Yu, Jun

    2011-05-20

    Coevolution can be seen as the interdependency between evolutionary histories. In the context of protein evolution, functionally correlated proteins consistently show coordinated evolutionary characters without disruption of organismal integrity. In a complex system, there are two forms of protein-protein interaction in vivo: inter-complex interaction and intra-complex interaction. In this paper, we studied the difference in coevolution characters between inter-complex and intra-complex interactions using the "Mirror tree" method on the respiratory chain (RC) proteins. We divided the correlation coefficients of every pair of RC proteins into two groups, corresponding to binary protein-protein interactions within a complex and binary protein-protein interactions between complexes, respectively. A dramatic discrepancy is detected between the coevolution characters of the two sets of protein interactions (Wilcoxon test, p-value = 4.4 × 10^-6). Our finding reveals some critical information on coevolutionary study and assists the mechanistic investigation of protein-protein interaction. Furthermore, the results also provide unique clues to the supramolecular organization of protein complexes in the mitochondrial inner membrane. A more detailed binding-site map and genome information for nuclear-encoded RC proteins will be extraordinarily valuable for further study of mitochondrial dynamics. Copyright © 2011. Published by Elsevier Ltd.
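
    The core of a mirror-tree style comparison can be sketched as the correlation between two proteins' inter-species distance matrices. The sketch below assumes both matrices are symmetric and ordered by the same species; the toy distances and function names are hypothetical, and the abstract does not describe the authors' exact pipeline (alignment, tree building, or any corrections).

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate the pairwise species distances of protein A and protein B."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Hypothetical distances over the same four species for two proteins.
dist_protein_a = [
    [0.0, 0.1, 0.4, 0.6],
    [0.1, 0.0, 0.3, 0.5],
    [0.4, 0.3, 0.0, 0.2],
    [0.6, 0.5, 0.2, 0.0],
]
# Protein B evolves twice as fast but with the same tree shape.
dist_protein_b = [[2 * d for d in row] for row in dist_protein_a]

score = mirror_tree_score(dist_protein_a, dist_protein_b)
print(round(score, 3))
```

    Scores computed this way for intra-complex and inter-complex pairs could then be compared with a rank-based test such as the Wilcoxon test named in the abstract.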

  5. Evolutionary reprograming of protein-protein interaction specificity.

    Science.gov (United States)

    Akiva, Eyal; Babbitt, Patricia C

    2015-10-22

    Using mutation libraries and deep sequencing, Aakre et al. study the evolution of protein-protein interactions using a toxin-antitoxin model. The results indicate probable trajectories via "intermediate" proteins that are promiscuous, thus avoiding transitions via non-interactions. These results extend observations about other biological interactions and enzyme evolution, suggesting broadly general principles. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    Science.gov (United States)

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net, Source code: https://github.com/bio-ontology-research-group/deepgo. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  7. Can infrared spectroscopy provide information on protein-protein interactions?

    Science.gov (United States)

    Haris, Parvez I

    2010-08-01

    For most biophysical techniques, characterization of protein-protein interactions is challenging; this is especially true with methods that rely on a physical phenomenon that is common to both of the interacting proteins. Thus, for example, in IR spectroscopy, the carbonyl vibration (1600-1700 cm⁻¹) associated with the amide bonds from both of the interacting proteins will overlap extensively, making the interpretation of spectral changes very complicated. Isotope-edited infrared spectroscopy, where one of the interacting proteins is uniformly labelled with ¹³C or ¹³C,¹⁵N has been introduced as a solution to this problem, enabling the study of protein-protein interactions using IR spectroscopy. The large shift of the amide I band (approx. 45 cm⁻¹ towards lower frequency) upon ¹³C labelling of one of the proteins reveals the amide I band of the unlabelled protein, enabling it to be used as a probe for monitoring conformational changes. With site-specific isotopic labelling, structural resolution at the level of individual amino acid residues can be achieved. Furthermore, the ability to record IR spectra of proteins in diverse environments means that isotope-edited IR spectroscopy can be used to structurally characterize difficult systems such as protein-protein complexes bound to membranes or large insoluble peptide/protein aggregates. In the present article, examples of application of isotope-edited IR spectroscopy for studying protein-protein interactions are provided.
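
    A back-of-the-envelope check of why carbon-13 labelling moves the band to lower frequency: for a harmonic oscillator the wavenumber scales with the inverse square root of the reduced mass. The sketch below treats the amide I mode as an isolated C=O stretch, which is an assumption made here (the real mode also has other contributions), and uses a hypothetical starting wavenumber of 1650 cm⁻¹ chosen from within the range quoted above.

```python
import math


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator (atomic mass units)."""
    return m1 * m2 / (m1 + m2)


def shifted_wavenumber(nu, mu_old, mu_new):
    """Harmonic-oscillator scaling: nu is proportional to 1 / sqrt(mu)."""
    return nu * math.sqrt(mu_old / mu_new)


# Approximate isotope masses in atomic mass units.
MASS_C12, MASS_C13, MASS_O16 = 12.000, 13.003, 15.995

nu_unlabelled = 1650.0  # cm^-1, assumed position of the unlabelled band
nu_labelled = shifted_wavenumber(
    nu_unlabelled,
    reduced_mass(MASS_C12, MASS_O16),
    reduced_mass(MASS_C13, MASS_O16),
)
shift = nu_unlabelled - nu_labelled
print(f"estimated downshift: {shift:.1f} cm^-1")
```

    This crude estimate lands in the same few-tens-of-wavenumbers range as the approx. 45 cm⁻¹ shift reported in the abstract, but it should be read only as an order-of-magnitude illustration.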

  8. Cloud prediction of protein structure and function with PredictProtein for Debian.

    Science.gov (United States)

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  9. Protein-Protein Interactions Prediction Using a Novel Local Conjoint Triad Descriptor of Amino Acid Sequences

    Directory of Open Access Journals (Sweden)

    Jun Wang

    2017-11-01

    Full Text Available Protein-protein interactions (PPIs) play crucial roles in almost all cellular processes. Although a large number of PPIs have been verified by high-throughput techniques in the past decades, currently known PPI pairs are still far from complete. Furthermore, the wet-lab experiment-based techniques for detecting PPIs are time-consuming and expensive. Hence, it is urgent and essential to develop automatic computational methods to efficiently and accurately predict PPIs. In this paper, a sequence-based approach called DNN-LCTD is developed by combining deep neural networks (DNNs) and a novel local conjoint triad description (LCTD) feature representation. LCTD incorporates the advantages of local description and conjoint triad; thus, it is capable of accounting for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. DNNs can not only learn suitable features from the data by themselves, but also learn and discover hierarchical representations of data. When applied to the PPI data of Saccharomyces cerevisiae, DNN-LCTD achieves superior performance with accuracy of 93.12%, precision of 93.75%, sensitivity of 93.83%, and area under the receiver operating characteristic curve (AUC) of 97.92%, and it needs only 718 s. These results indicate DNN-LCTD is very promising for predicting PPIs. DNN-LCTD can be a useful supplementary tool for future proteomics study.
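
    To make the "conjoint triad" idea concrete, the sketch below computes the plain (global) conjoint triad vector of a sequence: residues are mapped to seven classes and every window of three consecutive residues is counted, giving 7 × 7 × 7 = 343 features. This is not the paper's LCTD descriptor, which applies the idea to local regions of the sequence; the seven-class grouping follows the commonly used conjoint triad scheme, and the min-max scaling and function names are assumptions made for illustration.

```python
from itertools import product

# Seven residue classes commonly used for the conjoint triad encoding.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional, min-max scaled triad frequency vector."""
    counts = {triad: 0 for triad in product(range(7), repeat=3)}
    for i in range(len(sequence) - 2):
        window = sequence[i:i + 3]
        if any(aa not in AA_TO_CLASS for aa in window):
            continue  # skip windows containing non-standard residues
        counts[tuple(AA_TO_CLASS[aa] for aa in window)] += 1
    values = list(counts.values())
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]


def pair_features(seq_a, seq_b):
    """Concatenate both proteins' vectors into one 686-dimensional input."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)


features = pair_features("AGVAGVKRDE", "MKTAYIAKQR")  # hypothetical sequences
print(len(features))
```

    A vector like this would be the input to a classifier; the choice of network architecture is left out because the abstract gives no detail on it.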

  10. An ontology-based search engine for protein-protein interactions.

    Science.gov (United States)

    Park, Byungkyu; Han, Kyungsook

    2010-01-18

    Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
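
    The prime-factorization idea can be illustrated with a deliberately simplified encoding: give every GO term its own prime, encode a protein as the product of the primes of its terms and their ancestors, and answer a query with a divisibility test. This is not the authors' "modified Gödel number" scheme, whose details are not given in the abstract; the toy ontology, term names and helper functions below are all hypothetical.

```python
def first_primes(n):
    """Return the first n primes by trial division (fine for toy sizes)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Hypothetical toy ontology: child term -> parent term (None for a root).
PARENT = {
    "kinase activity": "catalytic activity",
    "catalytic activity": None,
    "DNA binding": None,
}
TERMS = sorted(PARENT)
PRIME = dict(zip(TERMS, first_primes(len(TERMS))))


def with_ancestors(term):
    """Yield the term itself followed by all of its ancestors."""
    while term is not None:
        yield term
        term = PARENT[term]


def encode(annotations):
    """Product of the primes of every annotated term and its ancestors."""
    code = 1
    for term in {a for t in annotations for a in with_ancestors(t)}:
        code *= PRIME[term]
    return code


def matches(code, query_term):
    """A protein satisfies the query if the query's prime divides its code."""
    return code % PRIME[query_term] == 0


protein_code = encode(["kinase activity"])
print(matches(protein_code, "catalytic activity"))  # annotated only with the child term
print(matches(protein_code, "DNA binding"))
```

    Because ancestor primes are folded into the code, a protein annotated only with a more specific term still matches a query on the broader term, which is the behaviour the abstract contrasts with keyword matching.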

  11. Prediction of the Ebola Virus Infection Related Human Genes Using Protein-Protein Interaction Network.

    Science.gov (United States)

    Cao, HuanHuan; Zhang, YuHang; Zhao, Jia; Zhu, Liucun; Wang, Yi; Li, JiaRui; Feng, Yuan-Ming; Zhang, Ning

    2017-01-01

    Ebola hemorrhagic fever (EHF) is caused by Ebola virus (EBOV). It is reported that humans can be infected by EBOV with a high fatality rate. However, association factors between EBOV and host still tend to be ambiguous. According to the "guilt by association" (GBA) principle, proteins interacting with each other are very likely to have similar or identical functions. Based on this assumption, we tried to obtain EBOV infection-related human genes in a protein-protein interaction network using the Dijkstra algorithm. We hope it could contribute to the discovery of novel effective treatments. Finally, 15 genes were selected as potential EBOV infection-related human genes. Copyright © Bentham Science Publishers.
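
    One way such a search could look is sketched below: run Dijkstra's algorithm between two known disease-related genes and treat the interior nodes of the shortest path as candidates. The toy network, the gene names, the edge weights and the "interior nodes are candidates" rule are assumptions made for illustration; the abstract does not state how the authors weighted edges or filtered the results.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a weighted undirected graph given as
    {node: {neighbour: weight}}. Returns the node list or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph[node].items():
            new_d = d + weight
            if new_d < dist.get(nbr, float("inf")):
                dist[nbr] = new_d
                prev[nbr] = node
                heapq.heappush(heap, (new_d, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical PPI network; lower weight means a more confident interaction.
toy_network = {
    "KNOWN1": {"CAND_X": 1.0, "CAND_Y": 5.0},
    "CAND_X": {"KNOWN1": 1.0, "KNOWN2": 1.0},
    "CAND_Y": {"KNOWN1": 5.0, "KNOWN2": 1.0},
    "KNOWN2": {"CAND_X": 1.0, "CAND_Y": 1.0},
}

path = shortest_path(toy_network, "KNOWN1", "KNOWN2")
candidates = path[1:-1]  # interior nodes of the shortest path
print(path, candidates)
```

    Repeating this over all pairs of known genes and ranking proteins by how often they appear on shortest paths would be one plausible follow-up, again as an assumption rather than the published procedure.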

  12. PRODIGY : a web server for predicting the binding affinity of protein-protein complexes

    NARCIS (Netherlands)

    Xue, Li; Garcia Lopes Maia Rodrigues, João; Kastritis, Panagiotis L; Bonvin, Alexandre Mjj; Vangone, Anna

    2016-01-01

    Gaining insights into the structural determinants of protein-protein interactions holds the key for a deeper understanding of biological functions, diseases and development of therapeutics. An important aspect of this is the ability to accurately predict the binding strength for a given

  13. Coarse-grain modelling of protein-protein interactions

    NARCIS (Netherlands)

    Baaden, Marc; Marrink, Siewert J.

    2013-01-01

    Here, we review recent advances towards the modelling of protein-protein interactions (PPI) at the coarse-grained (CG) level, a technique that is now widely used to understand protein affinity, aggregation and self-assembly behaviour. PPI models of soluble proteins and membrane proteins are

  14. Annotating activation/inhibition relationships to protein-protein interactions using gene ontology relations.

    Science.gov (United States)

    Yim, Soorin; Yu, Hasun; Jang, Dongjin; Lee, Doheon

    2018-04-11

    Signaling pathways can be reconstructed by identifying 'effect types' (i.e. activation/inhibition) of protein-protein interactions (PPIs). Effect types are composed of 'directions' (i.e. upstream/downstream) and 'signs' (i.e. positive/negative), thereby requiring directions as well as signs of PPIs to predict signaling events from PPI networks. Here, we propose a computational method for systemically annotating effect types to PPIs using relations between functional information of proteins. We used regulates, positively regulates, and negatively regulates relations in Gene Ontology (GO) to predict directions and signs of PPIs. These relations indicate both directions and signs between GO terms so that we can project directions and signs between relevant GO terms to PPIs. Independent test results showed that our method is effective for predicting both directions and signs of PPIs. Moreover, our method outperformed a previous GO-based method that did not consider the relations between GO terms. We annotated effect types to human PPIs and validated several highly confident effect types against literature. The annotated human PPIs are available in Additional file 2 to aid signaling pathway reconstruction and network biology research. We annotated effect types to PPIs by using regulates, positively regulates, and negatively regulates relations in GO. We demonstrated that those relations are effective for predicting not only signs, but also directions of PPIs. The usefulness of those relations suggests their potential applications to other types of interactions such as protein-DNA interactions.
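
    The projection step described above can be sketched as a simple vote: whenever one protein of an interacting pair carries a GO term that positively or negatively regulates a term carried by the other, that relation votes for a direction and a sign. The term identifiers, protein names and majority-vote rule below are hypothetical, and the unsigned "regulates" relation is ignored to keep the sketch short; the abstract does not specify the authors' scoring.

```python
from collections import Counter

# Map signed GO relations onto PPI effect types.
SIGN = {
    "positively_regulates": "activation",
    "negatively_regulates": "inhibition",
}


def annotate_effect(ppi, annotations, go_relations):
    """Return (upstream, downstream, effect) with the most votes, or None.

    ppi          -- tuple of two protein names
    annotations  -- {protein: set of GO term ids}
    go_relations -- iterable of (source_term, relation, target_term)
    """
    p, q = ppi
    votes = Counter()
    for src, relation, dst in go_relations:
        if relation not in SIGN:
            continue
        if src in annotations[p] and dst in annotations[q]:
            votes[(p, q, SIGN[relation])] += 1
        if src in annotations[q] and dst in annotations[p]:
            votes[(q, p, SIGN[relation])] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical annotations and relations (placeholder term ids).
toy_annotations = {"P1": {"GO:term_A"}, "P2": {"GO:term_B"}}
toy_relations = [("GO:term_A", "positively_regulates", "GO:term_B")]

effect = annotate_effect(("P2", "P1"), toy_annotations, toy_relations)
print(effect)
```

    Note that the direction comes from the relation, not from the order in which the pair is written, so the undirected pair ("P2", "P1") is still annotated as P1 acting on P2.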

  15. Alignment of non-covalent interactions at protein-protein interfaces.

    Directory of Open Access Journals (Sweden)

    Hongbo Zhu

    Full Text Available BACKGROUND: The study and comparison of protein-protein interfaces is essential for the understanding of the mechanisms of interaction between proteins. While there are many methods for comparing protein structures and protein binding sites, so far no methods have been reported for comparing the geometry of non-covalent interactions occurring at protein-protein interfaces. METHODOLOGY/PRINCIPAL FINDINGS: Here we present a method for aligning non-covalent interactions between different protein-protein interfaces. The method aligns the vector representations of van der Waals interactions and hydrogen bonds based on their geometry. The method has been applied to a dataset which comprises a variety of protein-protein interfaces. The alignments are consistent to a large extent with the results obtained using two other complementary approaches. In addition, we apply the method to three examples of protein mimicry. The method successfully aligns respective interfaces and allows for recognizing conserved interface regions. CONCLUSIONS/SIGNIFICANCE: The Galinter method has been validated in the comparison of interfaces in which homologous subunits are involved, including cases of mimicry. The method is also applicable to comparing interfaces involving non-peptidic compounds. Galinter assists users in identifying local interface regions with similar patterns of non-covalent interactions. This is particularly relevant to the investigation of the molecular basis of interaction mimicry.

  16. Unveiling protein functions through the dynamics of the interaction network.

    Directory of Open Access Journals (Sweden)

    Irene Sendiña-Nadal

    Full Text Available Protein interaction networks have become a tool to study biological processes, either for predicting molecular functions or for designing suitable new drugs to regulate the main biological interactions. Furthermore, such networks are known to be organized in sub-networks of proteins contributing to the same cellular function. However, protein function prediction is not accurate, and each protein has traditionally been assigned to only one function by the network formalism. By considering the network of physical interactions between yeast proteins together with a manual and single functional classification scheme, we introduce a method able to reveal important information on protein function, at both micro- and macro-scale. In particular, the inspection of the properties of oscillatory dynamics on top of the protein interaction network leads to the identification of misclassification problems in protein function assignments, as well as to the correct identification of protein functions. We also demonstrate that our approach can give a network representation of the meta-organization of biological processes by unraveling the interactions between different functional classes.

  17. Detection of protein-protein interactions by ribosome display and protein in situ immobilisation.

    Science.gov (United States)

    He, Mingyue; Liu, Hong; Turner, Martin; Taussig, Michael J

    2009-12-31

    We describe a method for identification of protein-protein interactions by combining two cell-free protein technologies, namely ribosome display and protein in situ immobilisation. The method requires only PCR fragments as the starting material, the target proteins being made through cell-free protein synthesis, either associated with their encoding mRNA as ribosome complexes or immobilised on a solid surface. The use of ribosome complexes allows identification of interacting protein partners from their attached coding mRNA. To demonstrate the procedures, we have employed the lymphocyte signalling proteins Vav1 and Grb2 and confirmed the interaction between Grb2 and the N-terminal SH3 domain of Vav1. The method has promise for library screening of pairwise protein interactions, down to the analytical level of individual domain or motif mapping.

  18. Protein-protein interactions in paralogues: Electrostatics modulates specificity on a conserved steric scaffold.

    Directory of Open Access Journals (Sweden)

    Stefan M Ivanov

    Full Text Available An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners.

  19. Protein-protein interactions in paralogues: Electrostatics modulates specificity on a conserved steric scaffold.

    Science.gov (United States)

    Ivanov, Stefan M; Cawley, Andrew; Huber, Roland G; Bond, Peter J; Warwicker, Jim

    2017-01-01

    An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners.

  20. Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2008-07-01

    Full Text Available Abstract Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures in which each of them contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.

  1. Protein-protein interaction site prediction in Homo sapiens and E. coli using an interaction-affinity based membership function in fuzzy SVM.

    Science.gov (United States)

    Sriwastava, Brijesh Kumar; Basu, Subhadip; Maulik, Ujjwal

    2015-10-01

    Protein-protein interaction (PPI) site prediction helps to ascertain the interface residues that participate in interaction processes. Fuzzy support vector machine (F-SVM) is proposed as an effective method to solve this problem, and we have shown that the performance of the classical SVM can be enhanced with the help of an interaction-affinity based fuzzy membership function. The performances of both SVM and F-SVM on the PPI databases of the Homo sapiens and E. coli organisms are evaluated, and the statistical significance of the developed method over classical SVM and other fuzzy membership-based SVM methods available in the literature is estimated. Our membership function uses the residue-level interaction affinity scores for each pair of positive and negative sequence fragments. The average AUC scores in the 10-fold cross-validation experiments are measured as 79.94% and 80.48% for the Homo sapiens and E. coli organisms respectively. On the independent test datasets, AUC scores are obtained as 76.59% and 80.17% respectively for the two organisms. In almost all cases, the developed F-SVM method improves on the performances obtained by the corresponding classical SVM and the other classifiers available in the literature.

  2. The nonstructural protein 8 (nsp8) of the SARS coronavirus interacts with its ORF6 accessory protein

    International Nuclear Information System (INIS)

    Kumar, Purnima; Gunalan, Vithiagaran; Liu Boping; Chow, Vincent T.K.; Druce, Julian; Birch, Chris; Catton, Mike; Fielding, Burtram C.; Tan, Yee-Joo; Lal, Sunil K.

    2007-01-01

    Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) caused a severe outbreak in several regions of the world in 2003. The SARS-CoV genome is predicted to contain 14 functional open reading frames (ORFs). The first ORF (1a and 1b) encodes a large polyprotein that is cleaved into nonstructural proteins (nsp). The other ORFs encode four structural proteins (spike, membrane, nucleocapsid and envelope) as well as eight SARS-CoV-specific accessory proteins (3a, 3b, 6, 7a, 7b, 8a, 8b and 9b). In this report we have cloned the predicted nsp8 gene and the ORF6 gene of the SARS-CoV and studied their abilities to interact with each other. We expressed the two proteins as fusion proteins in the yeast two-hybrid system to demonstrate protein-protein interactions and tested the same using a yeast genetic cross. Further, the strength of the interaction was measured by challenging growth of the positive interaction clones on increasing gradients of 2-amino triazole. The interaction was then verified by expressing both proteins separately in vitro in a coupled transcription-translation system and by coimmunoprecipitation in mammalian cells. Finally, colocalization experiments were performed in SARS-CoV infected Vero E6 mammalian cells to confirm the nsp8-ORF6 interaction. To the best of our knowledge, this is the first report of the interaction between a SARS-CoV accessory protein and nsp8, and our findings suggest that ORF6 protein may play a role in virus replication.

  3. Structural interface parameters are discriminatory in recognising near-native poses of protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Sony Malhotra

    Full Text Available Interactions at the molecular level in the cellular environment play a crucial role in maintaining the physiological functioning of the cell. These molecular interactions exist at varied levels, viz. protein-protein interactions, protein-nucleic acid interactions or protein-small molecule interactions. These interactions and their mechanisms are currently intensively studied areas. Molecular interactions can also be studied computationally using the approach known as molecular docking. Molecular docking employs search algorithms to predict the possible conformations for interacting partners and then calculates interaction energies. However, docking proposes a number of solutions as different docked poses, and it is therefore a serious challenge to identify the native (or near-native) structures from the pool of these docked poses. Here, we propose a rigorous scoring scheme called DockScore which can be used to rank the docked poses and identify the best docked pose out of the many proposed by the docking algorithm employed. The scoring identifies the optimal interactions between the two protein partners utilising various features of the putative interface like area, short contacts, conservation, spatial clustering and the presence of positively charged and hydrophobic residues. DockScore was first trained on a set of 30 protein-protein complexes to determine the weights for different parameters. Subsequently, we tested the scoring scheme on 30 different protein-protein complexes, and a native or near-native structure was assigned the top rank from a pool of docked poses in 26 of the tested cases. We tested the ability of DockScore to discriminate likely dimer interactions that differ substantially within a homologous family and also demonstrate that DockScore can distinguish the correct pose for all 10 recent CAPRI targets.

  4. Detection of protein complex from protein-protein interaction network using Markov clustering

    International Nuclear Information System (INIS)

    Ochieng, P J; Kusuma, W A; Haryanto, T

    2017-01-01

    Detection of complexes, or groups of functionally related proteins, is an important challenge when analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on the Markov clustering (MCL) algorithm to identify protein complexes within highly interconnected protein-protein interaction networks. A protein-protein interaction network was first constructed to develop a geometrical network; the network was then partitioned using Markov clustering to detect protein complexes. The interest of the proposed method was illustrated by its application to human proteins associated with type II diabetes mellitus. Flow simulation of the MCL algorithm was initially performed and topological properties of the resultant network were analysed for detection of the protein complexes. The results indicated that the proposed method successfully detected an overall of 34 complexes, with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed that MCL outperforms the AP, MCODE and SCPS algorithms with high clustering coefficient (0.751), network density and modularity index (0.630). This demonstrated that MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks. (paper)
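
    The Markov clustering idea named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the added self-loops, the inflation value of 2.0, the iteration cap and the 1e-6 membership threshold are all assumptions chosen for illustration. The sketch alternates expansion (matrix squaring, which simulates flow along longer walks) with inflation (element-wise powering followed by column normalisation, which strengthens strong flows), then reads clusters from the rows that retain flow.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix to spread flow over two-step walks."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalise columns."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster an undirected graph given as a 0/1 adjacency matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    Self-loops are added before normalisation (a common convention, assumed here).
    """
    n = len(adjacency)
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        new = inflate(expand(m), inflation)
        diff = max(abs(new[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = new
        if diff < tol:
            break
    # Each row that still carries flow lists the nodes attracted to it.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Toy network: two disconnected triangles of hypothetical proteins 0-2 and 3-5.
    triangle_pair = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(markov_cluster(triangle_pair))
```

    A higher inflation value generally yields more, smaller clusters; real PPI networks would need sparse matrices and pruning rather than the dense lists used here.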

  5. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotations methods. Detailed data analysis shows that the major advantage of the MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.

  6. A credit-card library approach for disrupting protein-protein interactions.

    Science.gov (United States)

    Xu, Yang; Shi, Jin; Yamamoto, Noboru; Moss, Jason A; Vogt, Peter K; Janda, Kim D

    2006-04-15

    Protein-protein interfaces are prominent in many therapeutically important targets. Using small organic molecules to disrupt protein-protein interactions is a current challenge in chemical biology. An important example of protein-protein interactions is provided by the Myc protein, which is frequently deregulated in human cancers. Myc belongs to the family of basic helix-loop-helix leucine zipper (bHLH-ZIP) transcription factors. It is biologically active only as heterodimer with the bHLH-ZIP protein Max. Herein, we report a new strategy for the disruption of protein-protein interactions that has been corroborated through the design and synthesis of a small parallel library composed of 'credit-card' compounds. These compounds are derived from a planar, aromatic scaffold and functionalized with four points of diversity. From a 285 membered library, several hits were obtained that disrupted the c-Myc-Max interaction and cellular functions of c-Myc. The IC50 values determined for this small focused library for the disruption of Myc-Max dimerization are quite potent, especially since small molecule antagonists of protein-protein interactions are notoriously difficult to find. Furthermore, several of the compounds were active at the cellular level as shown by their biological effects on Myc action in chicken embryo fibroblast assays. In light of our findings, this approach is considered a valuable addition to the armamentarium of new molecules being developed to interact with protein-protein interfaces. Finally, this strategy for disrupting protein-protein interactions should prove applicable to other families of proteins.

  7. Quality control methodology for high-throughput protein-protein interaction screening.

    Science.gov (United States)

    Vazquez, Alexei; Rual, Jean-François; Venkatesan, Kavitha

    2011-01-01

    Protein-protein interactions are key to many aspects of the cell, including its cytoskeletal structure, the signaling processes in which it is involved, or its metabolism. Failure to form protein complexes or signaling cascades may sometimes translate into pathologic conditions such as cancer or neurodegenerative diseases. The set of all protein interactions between the proteins encoded by an organism constitutes its protein interaction network, representing a scaffold for biological function. Knowing the protein interaction network of an organism, combined with other sources of biological information, can unravel fundamental biological circuits and may help better understand the molecular basics of human diseases. The protein interaction network of an organism can be mapped by combining data obtained from both low-throughput screens, i.e., "one gene at a time" experiments and high-throughput screens, i.e., screens designed to interrogate large sets of proteins at once. In either case, quality controls are required to deal with the inherent imperfect nature of experimental assays. In this chapter, we discuss experimental and statistical methodologies to quantify error rates in high-throughput protein-protein interactions screens.

  8. Prediction of RNA-Binding Proteins by Voting Systems

    Directory of Open Access Journals (Sweden)

    C. R. Peng

    2011-01-01

    Full Text Available It is important to identify which proteins can interact with RNA for the purpose of protein annotation, since interactions between RNA and proteins influence the structure of the ribosome and play important roles in gene expression. This paper tries to identify proteins that can interact with RNA using voting systems. Firstly, through Weka, 34 learning algorithms are chosen for investigation. Then a simple majority voting system (SMVS) is used for the prediction of RNA-binding proteins, achieving an average ACC (overall prediction accuracy) value of 79.72% and MCC (Matthew’s correlation coefficient) value of 59.77% for the independent testing dataset. Then the mRMR (minimum redundancy maximum relevance) strategy is used, which is transferred into algorithm selection. In addition, the MCC value of each classifier is assigned to be the weight of the classifier’s vote. As a result, the best average MCC value, 64.70% for the independent testing dataset, is attained when 22 algorithms are selected and integrated through weighted votes; the ACC value is 82.04% in this case.
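
    The two voting schemes described in this abstract can be sketched as follows. This is an illustrative sketch rather than the paper's code: the function names, the 0/1 label encoding and the tie-breaking rule (ties go to the negative class) are assumptions. The MCC formula itself is the standard one computed from the confusion-matrix counts.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def majority_vote(predictions):
    """Simple majority voting over 0/1 predictions for one protein.

    Ties are resolved as 0 (non-binding); this is an assumption.
    """
    return 1 if 2 * sum(predictions) > len(predictions) else 0


def weighted_vote(predictions, weights):
    """Weighted voting: each classifier's vote counts with its own weight,
    for example the MCC it achieved on a validation set."""
    positive = sum(w for p, w in zip(predictions, weights) if p == 1)
    negative = sum(w for p, w in zip(predictions, weights) if p == 0)
    return 1 if positive > negative else 0


if __name__ == "__main__":
    votes = [1, 0, 0]            # three hypothetical classifiers
    weights = [0.9, 0.3, 0.3]    # hypothetical per-classifier MCC values
    print(majority_vote(votes), weighted_vote(votes, weights))
```

    In the example, the unweighted majority says "non-binding" while the weighted vote follows the single strong classifier, which is the effect that weighting by MCC is meant to allow.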

  9. Which clustering algorithm is better for predicting protein complexes?

    Directory of Open Access Journals (Sweden)

    Moschopoulos Charalampos N

    2011-12-01

    Full Text Available Abstract Background Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks that they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy, and it generates the fewest valid clusters of any reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm

  10. Identification of Protein-Protein Interactions with Glutathione-S-Transferase (GST) Fusion Proteins.

    Science.gov (United States)

    Einarson, Margret B; Pugacheva, Elena N; Orlinick, Jason R

    2007-08-01

    INTRODUCTION Glutathione-S-transferase (GST) fusion proteins have had a wide range of applications since their introduction as tools for synthesis of recombinant proteins in bacteria. GST was originally selected as a fusion moiety because of several desirable properties. First and foremost, when expressed in bacteria alone, or as a fusion, GST is not sequestered in inclusion bodies (in contrast to previous fusion protein systems). Second, GST can be affinity-purified without denaturation because it binds to immobilized glutathione, which provides the basis for simple purification. Consequently, GST fusion proteins are routinely used for antibody generation and purification, protein-protein interaction studies, and biochemical analysis. This article describes the use of GST fusion proteins as probes for the identification of protein-protein interactions.

  11. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction.

    Science.gov (United States)

    Daberdaku, Sebastian; Ferrari, Carlo

    2018-02-06

    The correct determination of protein-protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein-Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. 
    The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction

  12. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design.

    Directory of Open Access Journals (Sweden)

    Colin A Smith

    Full Text Available Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.

  13. [Detection of protein-protein interactions by FRET and BRET methods].

    Science.gov (United States)

    Matoulková, E; Vojtěšek, B

    2014-01-01

    Nowadays, in vivo protein-protein interaction studies have become the preferred detection methods that make it possible to show or specify (already known) protein interactions and discover their inhibitors. They also facilitate detection of protein conformational changes and discovery or specification of signaling pathways in living cells. One group of in vivo methods enabling these findings is based on fluorescent resonance energy transfer (FRET) and its bioluminescent modification (BRET). They are based on visualization of protein-protein interactions via light or enzymatic excitation of fluorescent or bioluminescent proteins. These methods allow not only protein localization within the cell or its organelles (or small animals) but they also allow us to quantify fluorescent signals and to discover weak or strong interaction partners. In this review, we explain the principles of FRET and BRET, their applications in the characterization of protein-protein interactions and we describe several findings using these two methods that clarify molecular and cellular mechanisms and signals related to cancer biology.

  14. Selection of peptides interfering with protein-protein interaction.

    Science.gov (United States)

    Gaida, Annette; Hagemann, Urs B; Mattay, Dinah; Räuber, Christina; Müller, Kristian M; Arndt, Katja M

    2009-01-01

    Cell physiology depends on a fine-tuned network of protein-protein interactions, and misguided interactions are often associated with various diseases. Consequently, peptides, which are able to specifically interfere with such adventitious interactions, are of high interest for analytical as well as medical purposes. One of the most abundant protein interaction domains is the coiled-coil motif, and thus provides a premier target. Coiled coils, which consist of two or more alpha-helices wrapped around each other, have one of the simplest interaction interfaces, yet they are able to confer highly specific homo- and heterotypic interactions involved in virtually any cellular process. While there are several ways to generate interfering peptides, the combination of library design with a powerful selection system seems to be one of the most effective and promising approaches. This chapter guides through all steps of such a process, starting with library options and cloning, detailing suitable selection techniques and ending with purification for further down-stream characterization. Such generated peptides will function as versatile tools to interfere with the natural function of their targets thereby illuminating their down-stream signaling and, in general, promoting understanding of factors leading to specificity and stability in protein-protein interactions. Furthermore, peptides interfering with medically relevant proteins might become important diagnostics and therapeutics.

  15. Water-Protein Interactions: The Secret of Protein Dynamics

    Directory of Open Access Journals (Sweden)

    Silvia Martini

    2013-01-01

    Full Text Available Water-protein interactions help to maintain flexible conformation conditions which are required for multifunctional protein recognition processes. The intimate relationship between the protein surface and hydration water can be analyzed by studying experimental water properties measured in protein systems in solution. In particular, proteins in solution modify the structure and the dynamics of the bulk water at the solute-solvent interface. The ordering effects of proteins on hydration water extend for several angstroms. In this paper we propose a method for analyzing the dynamical properties of the water molecules present in the hydration shells of proteins. The approach is based on the analysis of the effects of protein-solvent interactions on water proton NMR relaxation parameters. NMR relaxation parameters, especially the nonselective (R1NS) and selective (R1SE) spin-lattice relaxation rates of water protons, are useful for investigating the solvent dynamics at the macromolecule-solvent interfaces as well as the perturbation effects caused by the water-macromolecule interactions on the solvent dynamical properties. In this paper we demonstrate that Nuclear Magnetic Resonance Spectroscopy can be used to determine the dynamical contributions of proteins to the water molecules belonging to their hydration shells.

  16. Predicting adverse drug reaction profiles by integrating protein interaction networks with drug structures.

    Science.gov (United States)

    Huang, Liang-Chin; Wu, Xiaogang; Chen, Jake Y

    2013-01-01

    The prediction of adverse drug reactions (ADRs) has become increasingly important, due to the rising concern about serious ADRs that can cause drugs to fail to reach or stay in the market. We proposed a framework for predicting ADR profiles by integrating protein-protein interaction (PPI) networks with drug structures. We compared ADR prediction performances over 18 ADR categories through four feature groups: only drug targets, drug targets with PPI networks, drug structures, and drug targets with PPI networks plus drug structures. The results showed that the integration of PPI networks and drug structures can significantly improve the ADR prediction performance. The median AUC values for the four groups were 0.59, 0.61, 0.65, and 0.70. We used the protein features in the best two models, "Cardiac disorders" (median-AUC: 0.82) and "Psychiatric disorders" (median-AUC: 0.76), to build ADR-specific PPI networks with literature support. For validation, we examined 30 drugs withdrawn from the U.S. market to see if our approach can predict their ADR profiles and explain why they were withdrawn. Except for three drugs having ADRs in the categories we did not predict, 25 out of 27 withdrawn drugs (92.6%) having severe ADRs were successfully predicted by our approach. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. A scored human protein-protein interaction network to catalyze genomic interpretation

    DEFF Research Database (Denmark)

    Li, Taibo; Wernersson, Rasmus; Hansen, Rasmus B

    2017-01-01

    Genome-scale human protein-protein interaction networks are critical to understanding cell biology and interpreting genomic data, but challenging to produce experimentally. Through data integration and quality control, we provide a scored human protein-protein interaction network (InWeb_InBioMap,...

  18. Evolutionary diversification of protein-protein interactions by interface add-ons.

    Science.gov (United States)

    Plach, Maximilian G; Semmelmann, Florian; Busch, Florian; Busch, Markus; Heizinger, Leonhard; Wysocki, Vicki H; Merkl, Rainer; Sterner, Reinhard

    2017-10-03

    Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein-protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein-protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein-protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein-protein interactions.

  19. Visualization of Host-Polerovirus Interaction Topologies Using Protein Interaction Reporter Technology.

    Science.gov (United States)

    DeBlasio, Stacy L; Chavez, Juan D; Alexander, Mariko M; Ramsey, John; Eng, Jimmy K; Mahoney, Jaclyn; Gray, Stewart M; Bruce, James E; Cilia, Michelle

    2016-02-15

    Demonstrating direct interactions between host and virus proteins during infection is a major goal and challenge for the field of virology. Most protein interactions are not binary or easily amenable to structural determination. Using infectious preparations of a polerovirus (Potato leafroll virus [PLRV]) and protein interaction reporter (PIR), a revolutionary technology that couples a mass spectrometric-cleavable chemical cross-linker with high-resolution mass spectrometry, we provide the first report of a host-pathogen protein interaction network that includes data-derived, topological features for every cross-linked site that was identified. We show that PLRV virions have hot spots of protein interaction and multifunctional surface topologies, revealing how these plant viruses maximize their use of binding interfaces. Modeling data, guided by cross-linking constraints, suggest asymmetric packing of the major capsid protein in the virion, which supports previous epitope mapping studies. Protein interaction topologies are conserved with other species in the Luteoviridae and with unrelated viruses in the Herpesviridae and Adenoviridae. Functional analysis of three PLRV-interacting host proteins in planta using a reverse-genetics approach revealed a complex, molecular tug-of-war between host and virus. Structural mimicry and diversifying selection, hallmarks of host-pathogen interactions, were identified within host and viral binding interfaces predicted by our models. These results illuminate the functional diversity of the PLRV-host protein interaction network and demonstrate the usefulness of PIR technology for precision mapping of functional host-pathogen protein interaction topologies. The exterior shape of a plant virus and its interacting host and insect vector proteins determine whether a virus will be transmitted by an insect or infect a specific host. Gaining this information is difficult and requires years of experimentation. We used protein interaction

  20. Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

    Science.gov (United States)

    Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe

    2018-02-19

    Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is, however, an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improves on the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
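    The kernels named in this abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of TPPK and MLPK over a base kernel on single proteins, and it assumes that "plugging" means using the Min kernel (or a normalized form) as that base kernel; the abstract does not confirm this reading. The cosine-style normalization and the toy feature vectors are also assumptions, not values from the paper.

```python
import numpy as np

def min_kernel(x, y):
    """Min kernel: sum of element-wise minima of two non-negative feature vectors."""
    return float(np.minimum(x, y).sum())

def normalized_min_kernel(x, y):
    """Cosine-style normalization K(x,y)/sqrt(K(x,x)K(y,y)) (an assumed form)."""
    denom = np.sqrt(min_kernel(x, x) * min_kernel(y, y))
    return min_kernel(x, y) / denom if denom > 0 else 0.0

def tppk(k, a, b, c, d):
    """Tensor Product Pairwise Kernel between pairs (a, b) and (c, d)."""
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)

def mlpk(k, a, b, c, d):
    """Metric Learning Pairwise Kernel between pairs (a, b) and (c, d)."""
    return (k(a, c) - k(a, d) - k(b, c) + k(b, d)) ** 2

# Hypothetical feature vectors for four proteins (e.g. domain counts).
a, b = np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])
c, d = np.array([1.0, 1.0, 0.0]), np.array([2.0, 0.0, 1.0])

tppk_value = tppk(min_kernel, a, b, c, d)
mlpk_value = mlpk(min_kernel, a, b, c, d)
```

    Both pairwise kernels are symmetric under swapping the two proteins within a pair, which is why they suit unordered protein pairs. A matrix of such values over all candidate pairs could then be passed to an SVM that accepts precomputed kernels.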

  1. Optimizing scoring function of protein-nucleic acid interactions with both affinity and specificity.

    Directory of Open Access Journals (Sweden)

    Zhiqiang Yan

    Full Text Available Protein-nucleic acid (protein-DNA and protein-RNA) recognition is fundamental to the regulation of gene expression. Determination of the structures of the protein-nucleic acid recognition and insight into their interactions at molecular level are vital to understanding the regulation function. Recently, quantitative computational approaches have become an alternative to experimental techniques for predicting the structures and interactions of biomolecular recognition. However, the progress of protein-nucleic acid structure prediction, especially protein-RNA, is far behind that of the protein-ligand and protein-protein structure predictions due to the lack of a reliable and accurate scoring function for quantifying the protein-nucleic acid interactions. In this work, we developed an accurate scoring function (named SPA-PN, SPecificity and Affinity of the Protein-Nucleic acid interactions) for protein-nucleic acid interactions by incorporating both the specificity and affinity into the optimization strategy. Specificity and affinity are two requirements of highly efficient and specific biomolecular recognition. Previous quantitative descriptions of the biomolecular interactions considered the affinity, but often ignored the specificity owing to the challenge of specificity quantification. We applied our concept of intrinsic specificity to connect the conventional specificity, which circumvents the challenge of specificity quantification. In addition to the affinity optimization, we incorporated the quantified intrinsic specificity into the optimization strategy of SPA-PN. The testing results and comparisons with other scoring functions validated that SPA-PN performs well on both the prediction of binding affinity and identification of native conformation. In terms of its performance, SPA-PN can be widely used to predict the protein-nucleic acid structures and quantify their interactions.

  2. Detecting protein complexes based on a combination of topological and biological properties in protein-protein interaction network

    Directory of Open Access Journals (Sweden)

    Pooja Sharma

    2018-06-01

    Full Text Available Protein complexes are known to play a major role in controlling cellular activity in a living being. Identifying complexes from raw protein-protein interactions (PPIs) is an important area of research. Earlier work has been limited mostly to yeast. Such protein complex identification methods, when applied to large human PPIs, often give poor performance. We introduce a novel method called CSC to detect protein complexes. The method is evaluated in terms of positive predictive value, sensitivity and accuracy using the datasets of the model organism yeast and of humans. CSC outperforms several other competing algorithms for both organisms. Further, we present a framework to establish the usefulness of CSC in analyzing the influence of a given disease gene in a complex topologically as well as biologically considering eight major association factors. Keywords: Protein complex, Connectivity, Semantic similarity, Contribution

  3. Identification of NAD interacting residues in proteins

    Directory of Open Access Journals (Sweden)

    Raghava Gajendra PS

    2010-03-01

    Full Text Available Abstract Background Small molecular cofactors or ligands play a crucial role in the proper functioning of cells. Accurate annotation of their target proteins and binding sites is required for the complete understanding of reaction mechanisms. Nicotinamide adenine dinucleotide (NAD+ or NAD) is one of the most commonly used organic cofactors in living cells, which plays a critical role in cellular metabolism, storage and regulatory processes. In the past, several NAD binding proteins (NADBP) have been reported in the literature, which are responsible for a wide-range of activities in the cell. Attempts have been made to derive a rule for the binding of NAD+ to its target proteins. However, so far an efficient model could not be derived due to the time consuming process of structure determination, and limitations of similarity based approaches. Thus a sequence and non-similarity based method is needed to characterize the NAD binding sites to help in the annotation. In this study attempts have been made to predict NAD binding proteins and their interacting residues (NIRs) from amino acid sequence using bioinformatics tools. Results We extracted 1556 protein chains from 555 NAD binding proteins whose structure is available in Protein Data Bank. Then we removed all redundant protein chains and finally obtained 195 non-redundant NAD binding protein chains, where no two chains have more than 40% sequence identity. In this study all models were developed and evaluated using five-fold cross validation technique on the above dataset of 195 NAD binding proteins. While certain types of residues are preferred (e.g. Gly, Tyr, Thr, His) in NAD interaction, residues like Ala, Glu, Leu, Lys are not preferred. A support vector machine (SVM) based method has been developed using various window lengths of amino acid sequence for predicting NAD interacting residues and obtained a maximum Matthews correlation coefficient (MCC) of 0.47 with accuracy of 74.13% at window length 17
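    As a rough illustration of the two ingredients mentioned in this abstract, the sketch below cuts a sequence into fixed-length windows centred on each residue and computes the Matthews correlation coefficient from a confusion matrix. The padding character "X", the function names, and the example sequence are assumptions for illustration; the abstract does not describe how terminal residues were handled or how windows were encoded for the SVM.

```python
import math

def residue_windows(sequence, length=17, pad="X"):
    """Return one window per residue, centred on it, padding the termini."""
    if length % 2 == 0:
        raise ValueError("window length must be odd so a central residue exists")
    half = length // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + length] for i in range(len(sequence))]

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical short sequence with a small window for readability.
example_windows = residue_windows("ACDEFG", length=5)
```

    Each window would then be converted to a numeric vector (for example a one-hot encoding of its residues) and labelled by whether its central residue contacts NAD.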

  4. The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network.

    Science.gov (United States)

    Ding, Fangrui; Tan, Aidi; Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

    2016-01-01

    Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates for being consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to existing known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were also matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of 21 candidates matched differentially expressed genes in the DN transcriptomic data. By providing a novel interaction network and prediction, GCNet

  5. Coarse-grained versus atomistic simulations : realistic interaction free energies for real proteins

    NARCIS (Netherlands)

    May, Ali; Pool, René; van Dijk, Erik; Bijlard, Jochem; Abeln, Sanne; Heringa, Jaap; Feenstra, K Anton

    2014-01-01

    MOTIVATION: To assess whether two proteins will interact under physiological conditions, information on the interaction free energy is needed. Statistical learning techniques and docking methods for predicting protein-protein interactions cannot quantitatively estimate binding free energies. Full

  6. Coarse-grained versus atomistic simulations: realistic interaction free energies for real proteins

    NARCIS (Netherlands)

    May, A.; Pool, R.; van Dijk, E.; Bijlard, J.; Abeln, S.; Heringa, J.; Feenstra, K.A.

    2014-01-01

    MOTIVATION: To assess whether two proteins will interact under physiological conditions, information on the interaction free energy is needed. Statistical learning techniques and docking methods for predicting protein-protein interactions cannot quantitatively estimate binding free energies. Full

  7. Predictive and comparative analysis of Ebolavirus proteins

    Science.gov (United States)

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395

  8. Predictive and comparative analysis of Ebolavirus proteins.

    Science.gov (United States)

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus.

  9. Categorizing Biases in High-Confidence High-Throughput Protein-Protein Interaction Data Sets

    Science.gov (United States)

    Yu, Xueping; Ivanic, Joseph; Memišević, Vesna; Wallqvist, Anders; Reifman, Jaques

    2011-01-01

    We characterized and evaluated the functional attributes of three yeast high-confidence protein-protein interaction data sets derived from affinity purification/mass spectrometry, protein-fragment complementation assay, and yeast two-hybrid experiments. The interacting proteins retrieved from these data sets formed distinct, partially overlapping sets with different protein-protein interaction characteristics. These differences were primarily a function of the deployed experimental technologies used to recover these interactions. This affected the total coverage of interactions and was especially evident in the recovery of interactions among different functional classes of proteins. We found that the interaction data obtained by the yeast two-hybrid method was the least biased toward any particular functional characterization. In contrast, interacting proteins in the affinity purification/mass spectrometry and protein-fragment complementation assay data sets were over- and under-represented among distinct and different functional categories. We delineated how these differences affected protein complex organization in the network of interactions, in particular for strongly interacting complexes (e.g. RNA and protein synthesis) versus weak and transient interacting complexes (e.g. protein transport). We quantified methodological differences in detecting protein interactions from larger protein complexes, in the correlation of protein abundance among interacting proteins, and in their connectivity of essential proteins. In the latter case, we showed that minimizing inherent methodology biases removed many of the ambiguous conclusions about protein essentiality and protein connectivity. We used these findings to rationalize how biological insights obtained by analyzing data sets originating from different sources sometimes do not agree or may even contradict each other. An important corollary of this work was that discrepancies in biological insights did not

  10. Protein-protein interaction network-based detection of functionally similar proteins within species.

    Science.gov (United States)

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, convergent evolution, and other questions that cannot be addressed by detection of functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a method of modified shortest path to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that, sometimes, it is difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
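    The graph-splitting step can be sketched as follows, under the assumption that the "1-hop method" means taking, for each protein, the subgraph induced by that protein and its direct interaction partners. The abstract does not define the method, so this reading, the function names, and the toy network are all illustrative assumptions; the modified shortest-path comparison is not shown.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions in this sketch
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)

def one_hop_subgraph(adjacency, center):
    """Return (nodes, edges) of the subgraph induced by center and its neighbours."""
    nodes = {center} | adjacency.get(center, set())
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in adjacency.get(u, set())
        if v in nodes
    }
    return nodes, edges

# Hypothetical toy network.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
toy_adjacency = build_adjacency(toy_edges)
nodes_a, edges_a = one_hop_subgraph(toy_adjacency, "A")
```

    Computing one such subgraph per protein yields the collection of local neighbourhoods that a later comparison step could score against each other.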

  11. Biospecific protein immobilization for rapid analysis of weak protein interactions using self-interaction nanoparticle spectroscopy.

    Science.gov (United States)

    Bengali, Aditya N; Tessier, Peter M

    2009-10-01

    "Reversible" protein interactions govern diverse biological behavior ranging from intracellular transport and toxic protein aggregation to protein crystallization and inactivation of protein therapeutics. Much less is known about weak protein interactions than their stronger counterparts since they are difficult to characterize, especially in a parallel format (in contrast to a sequential format) necessary for high-throughput screening. We have recently introduced a highly efficient approach of characterizing protein self-association, namely self-interaction nanoparticle spectroscopy (SINS; Tessier et al., 2008; J Am Chem Soc 130:3106-3112). This approach exploits the separation-dependent optical properties of gold nanoparticles to detect weak self-interactions between proteins immobilized on nanoparticles. A limitation of our previous work is that differences in the sequence and structure of proteins can lead to significant differences in their affinity to adsorb to nanoparticle surfaces, which complicates analysis of the corresponding protein self-association behavior. In this work we demonstrate a highly specific approach for coating nanoparticles with proteins using biotin-avidin interactions to generate protein-nanoparticle conjugates that report protein self-interactions through changes in their optical properties. Using lysozyme as a model protein that is refractory to characterization by conventional SINS, we demonstrate that surface Plasmon wavelengths for gold-avidin-lysozyme conjugates over a range of solution conditions (i.e., pH and ionic strength) are well correlated with lysozyme osmotic second virial coefficient measurements. Since SINS requires orders of magnitude less protein and time than conventional methods (e.g., static light scattering), we envision this approach will find application in large screens of protein self-association aimed at either preventing (e.g., protein aggregation) or promoting (e.g., protein crystallization) these

  12. Protein-protein interactions in the regulation of WRKY transcription factors.

    Science.gov (United States)

    Chi, Yingjun; Yang, Yan; Zhou, Yuan; Zhou, Jie; Fan, Baofang; Yu, Jing-Quan; Chen, Zhixiang

    2013-03-01

    It has been almost 20 years since the first report of a WRKY transcription factor, SPF1, from sweet potato. Great progress has been made since then in establishing the diverse biological roles of WRKY transcription factors in plant growth, development, and responses to biotic and abiotic stress. Despite the functional diversity, almost all analyzed WRKY proteins recognize the TTGACC/T W-box sequences and, therefore, mechanisms other than mere recognition of the core W-box promoter elements are necessary to achieve the regulatory specificity of WRKY transcription factors. Research over the past several years has revealed that WRKY transcription factors physically interact with a wide range of proteins with roles in signaling, transcription, and chromatin remodeling. Studies of WRKY-interacting proteins have provided important insights into the regulation and mode of action of members of the important family of transcription factors. It has also emerged that the slightly varied WRKY domains and other protein motifs conserved within each of the seven WRKY subfamilies participate in protein-protein interactions and mediate complex functional interactions between WRKY proteins and between WRKY and other regulatory proteins in the modulation of important biological processes. In this review, we summarize studies of protein-protein interactions for WRKY transcription factors and discuss how the interacting partners contribute, at different levels, to the establishment of the complex regulatory and functional network of WRKY transcription factors.

  13. The Development of Protein Microarrays and Their Applications in DNA-Protein and Protein-Protein Interaction Analyses of Arabidopsis Transcription Factors

    Science.gov (United States)

    Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang

    2009-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365

  14. Inferring high-confidence human protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Yu Xueping

    2012-05-01

    Full Text Available Abstract Background As numerous experimental factors drive the acquisition, identification, and interpretation of protein-protein interactions (PPIs), aggregated assemblies of human PPI data invariably contain experiment-dependent noise. Ascertaining the reliability of PPIs collected from these diverse studies and scoring them to infer high-confidence networks is a non-trivial task. Moreover, a large number of PPIs share the same number of reported occurrences, making it impossible to distinguish the reliability of these PPIs and rank-order them. For example, for the data analyzed here, we found that the majority (>83%) of currently available human PPIs have been reported only once. Results In this work, we proposed an unsupervised statistical approach to score a set of diverse, experimentally identified PPIs from nine primary databases to create subsets of high-confidence human PPI networks. We evaluated this ranking method by comparing it with other methods and assessing their ability to retrieve protein associations from a number of diverse and independent reference sets. These reference sets contain known biological data that are either directly or indirectly linked to interactions between proteins. We quantified the average effect of using ranked protein interaction data to retrieve this information and showed that, when compared to randomly ranked interaction data sets, the proposed method created a larger enrichment (~134%) than either ranking based on the hypergeometric test (~109%) or occurrence ranking (~46%). Conclusions From our evaluations, it was clear that ranked interactions were always of value because higher-ranked PPIs had a higher likelihood of retrieving high-confidence experimental data. Reducing the noise inherent in aggregated experimental PPIs via our ranking scheme further increased the accuracy and enrichment of PPIs derived from a number of biologically relevant data sets. These results suggest that using our high

  15. Immobilized Cytochrome P450 2C9 (CYP2C9): Applications for Metabolite Generation, Monitoring Protein-Protein Interactions, and Improving In-vivo Predictions Using Enhanced In-vitro Models

    Science.gov (United States)

    Wollenberg, Lance A.

    Cytochrome P450 (P450) enzymes are a family of oxoferroreductase enzymes containing a heme moiety and are well known to be involved in the metabolism of a wide variety of endogenous and xenobiotic materials. It is estimated that roughly 75% of all pharmaceutical compounds are metabolized by these enzymes. Traditional reconstituted in-vitro incubation studies using recombinant P450 enzymes are often used to predict in-vivo kinetic parameters of a drug early in development. However, in many cases, these reconstituted incubations are prone to aggregation which has been shown to affect the catalytic activity of an enzyme. Moreover, the presence of other isoforms of P450 enzymes present in a metabolic incubation, as is the case with microsomal systems, may affect the catalytic activity of an enzyme through isoform-specific protein-protein interactions. Both of these effects may result in inaccurate prediction of in-vivo drug metabolism using in-vitro experiments. Here we described the development of immobilized P450 constructs designed to elucidate the effects of aggregation and protein-protein interactions between P450 isoforms on catalytic activities. The long term objective of this project is to develop a system to control the oligomeric state of Cytochrome P450 enzymes to accurately elucidate discrepancies between in vitro reconstituted systems and actual in vivo drug metabolism for the precise prediction of metabolic activity. This approach will serve as a system to better draw correlations between in-vivo and in-vitro drug metabolism data. The central hypothesis is that the catalytic activity of Cytochrome P450 enzymes can be altered by protein-protein interactions occurring between Cytochrome P450 enzymes involved in drug metabolism, and is dependent on varying states of protein aggregation. This dissertation explains the details of the construction and characterization of a nanostructure device designed to control the state of aggregation of a P450 enzyme. Moreover

  16. A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize.

    Directory of Open Access Journals (Sweden)

    Matt Geisler

    2015-06-01

    Full Text Available Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where both maize orthologs occurred for an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6,004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.
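    The interolog rule stated in this abstract (predict a maize interaction when both partners of a reference interaction have maize orthologs) can be sketched in a few lines. This is a minimal illustration, not the PiZeaM pipeline: the identifiers are hypothetical, the ortholog map is assumed to be given, and no confidence scoring or filtering is applied.

```python
from itertools import product

def predict_interologs(reference_interactions, orthologs):
    """Transfer interactions from a reference species via an ortholog map.

    reference_interactions: iterable of (protein_a, protein_b) pairs.
    orthologs: dict mapping a reference protein to a list of target-species
               orthologs (several entries model many-to-many orthology).
    """
    predicted = set()
    for ref_a, ref_b in reference_interactions:
        # Only pairs where BOTH partners have orthologs yield predictions.
        for tgt_a, tgt_b in product(orthologs.get(ref_a, ()), orthologs.get(ref_b, ())):
            # Store each pair in sorted order so (x, y) and (y, x) count once.
            predicted.add(tuple(sorted((tgt_a, tgt_b))))
    return predicted

# Hypothetical identifiers for a reference species ("At") and maize ("Zm").
reference = [("At1", "At2"), ("At3", "At4")]
ortholog_map = {"At1": ["Zm1", "Zm2"], "At2": ["Zm3"], "At3": ["Zm4"]}
maize_predictions = predict_interologs(reference, ortholog_map)
```

    The second reference interaction contributes nothing because only one of its partners has a maize ortholog. Many-to-many orthology multiplies predictions, which is one reason such sets are usually cross-checked against other evidence such as co-expression.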

  17. Energetics of the protein-DNA-water interaction

    Directory of Open Access Journals (Sweden)

    Marabotti Anna

    2007-01-01

    Full Text Available Abstract Background To understand the energetics of the interaction between protein and DNA we analyzed 39 crystallographically characterized complexes with the HINT (Hydropathic INTeractions) computational model. HINT is an empirical free energy force field based on solvent partitioning of small molecules between water and 1-octanol. Our previous studies on protein-ligand complexes demonstrated that free energy predictions were significantly improved by taking into account the energetic contribution of water molecules that form at least one hydrogen bond with each interacting species. Results An initial correlation between the calculated HINT scores and the experimentally determined binding free energies in the protein-DNA system exhibited a relatively poor r² of 0.21 and standard error of ± 1.71 kcal mol⁻¹. However, the inclusion of 261 waters that bridge protein and DNA improved the HINT score-free energy correlation to an r² of 0.56 and standard error of ± 1.28 kcal mol⁻¹. Analysis of the water role and energy contributions indicate that 46% of the bridging waters act as linkers between amino acids and nucleotide bases at the protein-DNA interface, while the remaining 54% are largely involved in screening unfavorable electrostatic contacts. Conclusion This study quantifies the key energetic role of bridging waters in protein-DNA associations. In addition, the relevant role of hydrophobic interactions and entropy in driving protein-DNA association is indicated by analyses of interaction character showing that, together, the favorable polar and unfavorable polar/hydrophobic-polar interactions (i.e., desolvation) mostly cancel.

  18. The missing piece in the puzzle: Prediction of aggregation via the protein-protein interaction parameter A∗2.

    Science.gov (United States)

    Koepf, Ellen; Schroeder, Rudolf; Brezesinski, Gerald; Friess, Wolfgang

    2018-07-01

    The tendency of protein pharmaceuticals to form aggregates is a major challenge during formulation development, as aggregation affects quality and safety of the product. In particular, the formation of large native-like particles in the context of liquid-air interfacial stress is a well-known but not fully understood problem. Focusing on the two most fundamental criteria of protein formulation affecting protein-protein interaction, the impact of pH and ionic strength on the interaction parameter A∗2 and its link to aggregation upon mechanical stress was investigated. A∗2 of two monoclonal antibodies (mABs) and a polyclonal IgG was determined using dynamic light scattering and was correlated to the number of particles formed upon shaking in vials analyzed by visual inspection, turbidity analysis, light obscuration and micro-flow imaging. A good correlation between aggregation induced by interfacial stress and formulation pH was found. It could be shown that A∗2 was highest for mAB 1 and lowest for IgG, which was in good accordance with the number of particles formed. Shaking of IgG resulted in overall higher numbers of particles compared to the two mABs. A∗2 decreased and particle numbers increased with increasing pH. In contrast to pH, ionic strength only slightly affected A∗2. Nevertheless, at high ionic strength (100 mM) the samples exhibited more pronounced particle formation, particularly of large particles >25 µm, which was most pronounced at high pH. Protein solutions were identified to form continuous films with an inhomogeneous protein distribution at the liquid-air interface. These areas of agglomerated, native-like protein material can be transferred into the bulk solution by compression-decompression of the interface. Whether or not those clusters lead to the appearance of large protein aggregates or fall apart depends on the attractive or repulsive forces between protein molecules. Thus, protein aggregation due to interfacial

  19. Detecting protein-protein interactions in living cells

    DEFF Research Database (Denmark)

    Gottschalk, Marie; Bach, Anders; Hansen, Jakob Lerche

    2009-01-01

    The PDZ domain mediated interaction between the NMDA receptor and its intracellular scaffolding protein, PSD-95, is a potential target for treatment of ischemic brain diseases. We have recently developed a number of peptide analogues with improved affinity for the PDZ domains of PSD-95 compared to the endogenous C-terminal peptide of the NMDA receptor, as evaluated by a cell-free protein-protein interaction assay. However, it is important to address both membrane permeability and effect in living cells. Therefore a bioluminescence resonance energy transfer (BRET) assay was established, where the C-terminal of the NMDA receptor and PDZ2 of PSD-95 were fused to green fluorescent protein (GFP) and Renilla luciferase (Rluc) and expressed in COS7 cells. A robust and specific BRET signal was obtained by expression of the appropriate partner proteins and subsequently, the assay was used to evaluate a Tat...

  20. Targeting protein-protein interactions for parasite control.

    Directory of Open Access Journals (Sweden)

    Christina M Taylor

    2011-04-01

    Full Text Available Finding new drug targets for pathogenic infections would be of great utility for humanity, as there is a large need to develop new drugs to fight infections due to the developing resistance and side effects of current treatments. Current drug targets for pathogen infections involve only a single protein. However, proteins rarely act in isolation, and the majority of biological processes occur via interactions with other proteins, so protein-protein interactions (PPIs) offer a realm of unexplored potential drug targets and are thought to be the next generation of drug targets. Parasitic worms were chosen for this study because they have deleterious effects on human health, livestock, and plants, costing society billions of dollars annually, and many sequenced genomes are available. In this study, we present a computational approach that utilizes whole genomes of 6 parasitic and 1 free-living worm species and 2 hosts. The species were placed in orthologous groups, then binned in species-specific orthologous groups. Proteins that are essential and conserved among species that span a phylum are of greatest value, as they provide foundations for developing broad-control strategies. Two PPI databases were used to find PPIs within the species-specific bins. PPIs with unique helminth proteins and helminth proteins with unique features relative to the host, such as indels, were prioritized as drug targets. The PPIs were scored based on RNAi phenotype and homology to the PDB (Protein DataBank). EST data for the various life stages, GO annotation, and druggability were also taken into consideration. Several PPIs emerged from this study as potential drug targets. A few interactions were supported by co-localization of expression in M. incognita (plant parasite) and B. malayi (H. sapiens parasite), which have extremely different modes of parasitism. As more genomes of pathogens are sequenced and PPI databases expanded, this methodology will become increasingly

  1. Yeast Interacting Proteins Database: YLR447C, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp...; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act

  2. Simple knowledge-based descriptors to predict protein-ligand interactions. Methodology and validation

    Science.gov (United States)

    Nissink, J. Willem M.; Verdonk, Marcel L.; Klebe, Gerhard

    2000-11-01

    A new type of shape descriptor is proposed to describe the spatial orientation for non-covalent interactions. It is built from simple, anisotropic Gaussian contributions that are parameterised by 10 adjustable values. The descriptors have been used to fit propensity distributions derived from scatter data stored in the IsoStar database. This database holds composite pictures of possible interaction geometries between a common central group and various interacting moieties, as extracted from small-molecule crystal structures. These distributions can be related to probabilities for the occurrence of certain interaction geometries among different functional groups. A fitting procedure is described that generates the descriptors in a fully automated way. For this purpose, we apply a similarity index that is tailored to the problem, the Split Hodgkin Index. It accounts for the similarity in regions of either high or low propensity in a separate way. Although dependent on the division into these two subregions, the index is robust and performs better than the regular Hodgkin index. The reliability and coverage of the fitted descriptors was assessed using SuperStar. SuperStar usually operates on the raw IsoStar data to calculate propensity distributions, e.g., for a binding site in a protein. For our purpose we modified the code to have it operate on our descriptors instead. This resulted in a substantial reduction in calculation time (factor of five to eight) compared to the original implementation. A validation procedure was performed on a set of 130 protein-ligand complexes, using four representative interacting probes to map the properties of the various binding sites: ammonium nitrogen, alcohol oxygen, carbonyl oxygen, and methyl carbon. The predicted 'hot spots' for the binding of these probes were compared to the actual arrangement of ligand atoms in experimentally determined protein-ligand complexes. Results indicate that the version of SuperStar that applies to

  3. Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction

    Science.gov (United States)

    Hu, Jianfei; Neiswinger, Johnathan; Zhang, Jin; Zhu, Heng; Qian, Jiang

    2015-01-01

    Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as we expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with the previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation. PMID:26393507

  4. Protein-protein interaction networks identify targets which rescue the MPP+ cellular model of Parkinson’s disease

    Science.gov (United States)

    Keane, Harriet; Ryan, Brent J.; Jackson, Brendan; Whitmore, Alan; Wade-Martins, Richard

    2015-11-01

    Neurodegenerative diseases are complex multifactorial disorders characterised by the interplay of many dysregulated physiological processes. As an exemplar, Parkinson’s disease (PD) involves multiple perturbed cellular functions, including mitochondrial dysfunction and autophagic dysregulation in preferentially-sensitive dopamine neurons, a selective pathophysiology recapitulated in vitro using the neurotoxin MPP+. Here we explore a network science approach for the selection of therapeutic protein targets in the cellular MPP+ model. We hypothesised that analysis of protein-protein interaction networks modelling MPP+ toxicity could identify proteins critical for mediating MPP+ toxicity. Analysis of protein-protein interaction networks constructed to model the interplay of mitochondrial dysfunction and autophagic dysregulation (key aspects of MPP+ toxicity) enabled us to identify four proteins predicted to be key for MPP+ toxicity (P62, GABARAP, GBRL1 and GBRL2). Combined, but not individual, knockdown of these proteins increased cellular susceptibility to MPP+ toxicity. Conversely, combined, but not individual, over-expression of the network targets provided rescue of MPP+ toxicity associated with the formation of autophagosome-like structures. We also found that modulation of two distinct proteins in the protein-protein interaction network was necessary and sufficient to mitigate neurotoxicity. Together, these findings validate our network science approach to multi-target identification in complex neurological diseases.

  5. Drosophila protein interaction map (DPiM): a paradigm for metazoan protein complex interactions.

    Science.gov (United States)

    Guruharsha, K G; Obar, Robert A; Mintseris, Julian; Aishwarya, K; Krishnan, R T; Vijayraghavan, K; Artavanis-Tsakonas, Spyros

    2012-01-01

    Proteins perform essential cellular functions as part of protein complexes, often in conjunction with RNA, DNA, metabolites and other small molecules. The genome encodes thousands of proteins but not all of them are expressed in every cell type; and expressed proteins are not active at all times. Such diversity of protein expression and function accounts for the level of biological intricacy seen in nature. Defining protein-protein interactions in protein complexes, and establishing the when, what and where of potential interactions, is therefore crucial to understanding the cellular function of any protein, especially those that have not been well studied by traditional molecular genetic approaches. We generated a large-scale resource of affinity-tagged expression-ready clones and used co-affinity purification combined with tandem mass-spectrometry to identify protein partners of nearly 5,000 Drosophila melanogaster proteins. The resulting protein complex "map" provided a blueprint of metazoan protein complex organization. Here we describe how the map has provided valuable insights into protein function in addition to generating hundreds of testable hypotheses. We also discuss recent technological advancements that will be critical in addressing the next generation of questions arising from the map.

  6. A feature-based approach to modeling protein-protein interaction hot spots.

    Science.gov (United States)

    Cho, Kyu-il; Kim, Dongsup; Lee, Doheon

    2009-05-01

    Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to pi-related interactions, especially pi . . . pi interactions.

  7. The Ser/Thr Protein Kinase Protein-Protein Interaction Map of M. tuberculosis.

    Science.gov (United States)

    Wu, Fan-Lin; Liu, Yin; Jiang, He-Wei; Luan, Yi-Zhao; Zhang, Hai-Nan; He, Xiang; Xu, Zhao-Wei; Hou, Jing-Li; Ji, Li-Yun; Xie, Zhi; Czajkowsky, Daniel M; Yan, Wei; Deng, Jiao-Yu; Bi, Li-Jun; Zhang, Xian-En; Tao, Sheng-Ce

    2017-08-01

    Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis, the leading cause of death among all infectious diseases. There are 11 eukaryotic-like serine/threonine protein kinases (STPKs) in Mtb, which are thought to play pivotal roles in cell growth, signal transduction and pathogenesis. However, their underlying mechanisms of action remain largely uncharacterized. In this study, using a Mtb proteome microarray, we have globally identified the binding proteins in Mtb for all of the STPKs, and constructed the first STPK protein interaction (KPI) map that includes 492 binding proteins and 1,027 interactions. Bioinformatics analysis showed that the interacting proteins reflect diverse functions, including roles in two-component system, transcription, protein degradation, and cell wall integrity. Functional investigations confirmed that PknG regulates cell wall integrity through key components of peptidoglycan (PG) biosynthesis, e.g. MurC. The global STPK-KPIs network constructed here is expected to serve as a rich resource for understanding the key signaling pathways in Mtb, thus facilitating drug development and effective control of Mtb. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  8. A Type-2 fuzzy data fusion approach for building reliable weighted protein interaction networks with application in protein complex detection.

    Science.gov (United States)

    Mehranfar, Adele; Ghadiri, Nasser; Kouhsar, Morteza; Golshani, Ashkan

    2017-09-01

    Detecting protein complexes is an important task in analyzing protein interaction networks. Although many algorithms predict protein complexes in different ways, surveys on the interaction networks indicate that about 50% of detected interactions are false positives. Consequently, the accuracy of existing methods needs to be improved. In this paper we propose a novel algorithm to detect the protein complexes in 'noisy' protein interaction data. First, we integrate several biological data sources to determine the reliability of each interaction and determine more accurate weights for the interactions. A data fusion component is used for this step, based on the interval type-2 fuzzy voter that provides an efficient combination of the information sources. This fusion component detects the errors and diminishes their effect on the detection of protein complexes. So in the first step, the reliability scores have been assigned for every interaction in the network. In the second step, we have proposed a general protein complex detection algorithm by exploiting and adopting the strong points of other algorithms and existing hypotheses regarding real complexes. Finally, the proposed method has been applied to the yeast interaction datasets for predicting the interactions. The results show that our framework has a better performance regarding precision and F-measure than the existing approaches. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.

    Science.gov (United States)

    Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G

    2016-01-01

    While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged; however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, as well as membranes. © 2015 The Protein Society.

  10. Molecular simulations of lipid-mediated protein-protein interactions

    NARCIS (Netherlands)

    de Meyer, F.J.M.; Venturoli, M.; Smit, B.

    2008-01-01

    Recent experimental results revealed that lipid-mediated interactions due to hydrophobic forces may be important in determining the protein topology after insertion in the membrane, in regulating the protein activity, in protein aggregation and in signal transduction. To gain insight into the

  11. Yeast Interacting Proteins Database: YGR013W, YKL012W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available tion U1 snRNP protein involved in splicing, interacts with the branchpoint-binding protein during the formation of the second commitm... PRP40 U1 snRNP protein involved in splicing, interacts with the branchpoint-binding protein during the form...ation of the second commitment complex Rows with this prey as prey (1) Rows with

  12. Emergence of modularity and disassortativity in protein-protein interaction networks.

    Science.gov (United States)

    Wan, Xi; Cai, Shuiming; Zhou, Jin; Liu, Zengrong

    2010-12-01

    In this paper, we present a simple evolution model of protein-protein interaction networks by introducing a rule of small-preference duplication of a node, meaning that the probability of a node being chosen to duplicate is inversely proportional to its degree, and subsequent divergence plus nonuniform heterodimerization based on some plausible mechanisms in biology. We show that our model can not only reproduce scale-free connectivity and small-world pattern, but also exhibit hierarchical modularity and disassortativity. After comparing the features of our model with those of real protein-protein interaction networks, we believe that our model can provide relevant insights into the mechanism underlying the evolution of protein-protein interaction networks. © 2010 American Institute of Physics.
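
    The growth rule summarized in this record (small-preference duplication, divergence, heterodimerization) can be sketched as a toy simulation in Python. The abstract gives no parameter values or exact update rules, so everything specific here is an assumption: the weight 1/(degree + 1) (the +1 only avoids division by zero for isolated nodes), a constant edge-loss probability `p_loss`, and a constant parent-copy link probability `p_dimer` standing in for the "nonuniform" heterodimerization of the paper.

```python
import random


def grow_network(n_nodes, p_loss=0.6, p_dimer=0.3, seed=0):
    """Toy duplication-divergence growth; returns an adjacency dict of sets."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # start from a single interacting pair
    while len(adj) < n_nodes:
        nodes = list(adj)
        # Small-preference duplication: low-degree nodes are chosen more often.
        weights = [1.0 / (len(adj[v]) + 1) for v in nodes]
        parent = rng.choices(nodes, weights=weights, k=1)[0]
        child = len(adj)  # next unused integer label
        adj[child] = set()
        # Divergence: the copy keeps each parental interaction
        # with probability 1 - p_loss.
        for neighbor in list(adj[parent]):
            if rng.random() >= p_loss:
                adj[child].add(neighbor)
                adj[neighbor].add(child)
        # Heterodimerization: the copy may interact with its parent.
        if rng.random() < p_dimer:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


network = grow_network(200)
```

    The sketch only shows the mechanics of the three steps; whether a given parameter choice reproduces the scale-free, modular, or disassortative properties reported in the paper would have to be checked separately.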

  13. Interaction between a plasma membrane-localized ankyrin-repeat protein ITN1 and a nuclear protein RTV1

    Energy Technology Data Exchange (ETDEWEB)

    Sakamoto, Hikaru [Department of Bioproduction, Faculty of Bioindustry, Tokyo University of Agriculture, 196 Yasaka, Abashiri-shi, Hokkaido 093-2422 (Japan); Sakata, Keiko; Kusumi, Kensuke [Department of Biology, Faculty of Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581 (Japan); Kojima, Mikiko; Sakakibara, Hitoshi [RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045 (Japan); Iba, Koh, E-mail: koibascb@kyushu-u.org [Department of Biology, Faculty of Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581 (Japan)

    2012-06-29

    Highlights: ► ITN1, a plasma membrane ankyrin protein, interacts with a nuclear DNA-binding protein RTV1. ► The nuclear transport of RTV1 is partially inhibited by interaction with ITN1. ► RTV1 can promote the nuclear localization of ITN1. ► Both overexpression of RTV1 and the lack of ITN1 increase salicylic acid sensitivity in plants. -- Abstract: The increased tolerance to NaCl 1 (ITN1) protein is a plasma membrane (PM)-localized protein involved in responses to NaCl stress in Arabidopsis. The predicted structure of ITN1 is composed of multiple transmembrane regions and an ankyrin-repeat domain that is known to mediate protein-protein interactions. To elucidate the molecular functions of ITN1, we searched for interacting partners using a yeast two-hybrid assay, and a nuclear-localized DNA-binding protein, RTV1, was identified as a candidate. Bimolecular fluorescence complementation analysis revealed that RTV1 interacted with ITN1 at the PM and nuclei in vivo. RTV1 tagged with red fluorescent protein localized to nuclei and ITN1 tagged with green fluorescent protein localized to PM; however, both proteins localized to both nuclei and the PM when co-expressed. These findings suggest that RTV1 and ITN1 regulate the subcellular localization of each other.

  14. A web server for analysis, comparison and prediction of protein ligand binding sites.

    Science.gov (United States)

    Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

    2016-03-25

    One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify residues preferred in interaction and binding motif for a given ligand; for example, residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding site. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logo and two-sample between ligand interacting and non-interacting residues. In summary, this manuscript presents a web server for analysis of ligand-interacting residues. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom.

  15. Improving N-terminal protein annotation of Plasmodium species based on signal peptide prediction of orthologous proteins

    Directory of Open Access Journals (Sweden)

    Neto Armando

    2012-11-01

    Full Text Available Abstract Background Signal peptide is one of the most important motifs involved in protein trafficking and it ultimately influences protein function. Considering the expected functional conservation among orthologs it was hypothesized that divergence in signal peptides within orthologous groups is mainly due to N-terminal protein sequence misannotation. Thus, discrepancies in signal peptide prediction of orthologous proteins were used to identify misannotated proteins in five Plasmodium species. Methods Signal peptide (SignalP) and orthology (OrthoMCL) were combined in an innovative strategy to identify orthologous groups showing discrepancies in signal peptide prediction among their protein members (Mixed groups). In a comparative analysis, multiple alignments for each of these groups and gene models were visually inspected in search of misannotated proteins and, whenever possible, alternative gene models were proposed. Thresholds for signal peptide prediction parameters were also modified to reduce their impact as a possible source of discrepancy among orthologs. Validation of new gene models was based on RT-PCR (few examples) or on experimental evidence already published (ApiLoc). Results The rate of misannotated proteins was significantly higher in Mixed groups than in Positive or Negative groups, corroborating the proposed hypothesis. A total of 478 proteins were reannotated and change of signal peptide prediction from negative to positive was the most common. Reannotations triggered the conversion of almost 50% of all Mixed groups, which were further reduced by optimization of signal peptide prediction parameters. Conclusions The methodological novelty proposed here combining orthology and signal peptide prediction proved to be an effective strategy for the identification of proteins showing wrongly annotated N-terminal sequences, and it might have an important impact in the available data for genome-wide searching of potential vaccine and drug
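
    The grouping step in this record (labelling each orthologous group as Positive, Negative, or Mixed according to the signal peptide calls of its members) is simple enough to sketch in Python. The input layout and the protein identifiers are hypothetical; in the study the calls come from SignalP and the groups from OrthoMCL, neither of which is invoked here.

```python
def classify_group(signal_peptide_calls):
    """Label one orthologous group from per-protein signal peptide calls.

    signal_peptide_calls: dict mapping protein id -> bool
        (True if a signal peptide was predicted for that protein).
    """
    if not signal_peptide_calls:
        raise ValueError("an orthologous group needs at least one member")
    calls = set(signal_peptide_calls.values())
    if calls == {True}:
        return "Positive"
    if calls == {False}:
        return "Negative"
    # Members disagree: a candidate group for N-terminal misannotation.
    return "Mixed"


# Hypothetical groups and calls, for illustration only.
groups = {
    "OG1": {"PfA": True, "PvA": True, "PkA": True},
    "OG2": {"PfB": False, "PvB": False},
    "OG3": {"PfC": True, "PvC": False, "PkC": True},
}
labels = {name: classify_group(members) for name, members in groups.items()}
mixed_groups = [name for name, label in labels.items() if label == "Mixed"]
```

    Only the Mixed groups would then go forward to the manual inspection of alignments and gene models described in the abstract.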

  16. Globular and disordered – the non-identical twins in protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Kaare eTeilum

    2015-07-01

    Full Text Available In biology proteins from different structural classes interact across and within classes in ways that are optimized to achieve balanced functional outputs. The interactions between intrinsically disordered proteins (IDPs) and other proteins rely on changes in flexibility and this is seen as a strong determinant for their function. This has fostered the notion that IDPs bind with low affinity but high specificity. Here we have analyzed available detailed thermodynamic data for protein-protein interactions to put to the test if the thermodynamic profiles of IDP interactions differ from those of other protein-protein interactions. We find that ordered proteins and the disordered ones act as non-identical twins operating by similar principles but where the disordered protein complexes are on average less stable by 2.5 kcal mol⁻¹.
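
    To put the 2.5 kcal mol⁻¹ figure in perspective, it can be converted into a ratio of dissociation constants with the standard relation K_d ratio = exp(ΔΔG / RT). The short Python sketch below does this under two assumptions that are not stated in the abstract: that the stability difference is a difference in binding free energy, and that the temperature is 25 °C (298.15 K). The function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Ratio of dissociation constants implied by a binding free energy gap."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


fold = kd_fold_change(2.5)  # about 68 at 298.15 K
```

    Under these assumptions, a 2.5 kcal mol⁻¹ difference corresponds to a dissociation constant that is roughly 68 times higher on average for the complexes of disordered proteins.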

  17. Design principles for cancer therapy guided by changes in complexity of protein-protein interaction networks.

    Science.gov (United States)

    Benzekry, Sebastian; Tuszynski, Jack A; Rietman, Edward A; Lakka Klement, Giannoula

    2015-05-28

    The ever-increasing expanse of online bioinformatics data is enabling new ways not only to explore the visualization of these data but also to apply novel mathematical methods to extract meaningful information for clinically relevant analysis of pathways and treatment decisions. One of the methods used for computing topological characteristics of a space at different spatial resolutions is persistent homology. This concept can also be applied to network theory, and more specifically to protein-protein interaction networks, where the number of rings in an individual cancer network represents a measure of complexity. We observed a linear correlation of R = -0.55 between persistent homology and 5-year survival of patients with a variety of cancers. This relationship was used to predict the proteins within a protein-protein interaction network with the most impact on cancer progression. By re-computing the persistent homology after computationally removing an individual node (protein) from the protein-protein interaction network, we were able to evaluate whether such an inhibition would lead to improvement in patient survival. The power of this approach lay in its ability to identify the effects of inhibition of multiple proteins and in the ability to expose whether the effect of a single inhibition may be amplified by inhibition of other proteins. More importantly, we illustrate specific examples of persistent homology calculations, which correctly predict the survival benefits observed in clinical trials using inhibitors of the identified molecular target. We propose that computational approaches such as persistent homology may be used in the future for selection of molecular therapies in the clinic. The technique uses a mathematical algorithm to evaluate the node (protein) whose inhibition has the highest potential to reduce network complexity. The greater the drop in persistent homology, the greater the reduction in network complexity, and thus a larger
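
    The node-removal idea in this record can be illustrated with a much simpler quantity than full persistent homology. The Python sketch below counts independent rings in a graph as its cycle rank (edges minus nodes plus connected components, the first Betti number of the graph) and ranks each node by how much its removal lowers that count. Using the cycle rank as the complexity measure is an assumption made for this example, not the authors' exact computation, and the node labels are hypothetical.

```python
def cycle_rank(adj):
    """Number of independent cycles: edges - nodes + connected components."""
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    seen = set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return n_edges - len(adj) + components


def remove_node(adj, node):
    """Copy of the graph with one node (protein) and its edges deleted."""
    return {v: nbrs - {node} for v, nbrs in adj.items() if v != node}


def rank_by_complexity_drop(adj):
    """Nodes sorted by the drop in cycle rank caused by their removal."""
    base = cycle_rank(adj)
    drops = {v: base - cycle_rank(remove_node(adj, v)) for v in adj}
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


# Two triangles sharing node "C" (hypothetical toy network).
toy = {
    "A": {"B", "C"}, "B": {"A", "C"},
    "C": {"A", "B", "D", "E"},
    "D": {"C", "E"}, "E": {"C", "D"},
}
ranking = rank_by_complexity_drop(toy)
```

    In the toy network the shared node sits on both rings, so removing it eliminates both and it ranks first, while removing any other node eliminates only one.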

  18. Novel Technology for Protein-Protein Interaction-based Targeted Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jung Me Hwang

    2011-12-01

    Full Text Available We have developed a simple but highly efficient in-cell protein-protein interaction (PPI) discovery system based on the translocation properties of protein kinase C- and its C1a domain in live cells. This system allows the visual detection of trimeric and dimeric protein interactions including cytosolic, nuclear, and/or membrane proteins with their cognate ligands. In addition, this system can be used to identify pharmacological small compounds that inhibit specific PPIs. These properties make this PPI system an attractive tool for screening drug candidates and mapping the protein interactome.

  19. Globular and disordered-the non-identical twins in protein-protein interactions

    DEFF Research Database (Denmark)

    Teilum, Kaare; Olsen, Johan Gotthardt; Kragelund, Birthe Brandt

    2015-01-01

    as a strong determinant for their function. This has fostered the notion that IDPs bind with low affinity but high specificity. Here we have analyzed available detailed thermodynamic data for protein-protein interactions to put to the test if the thermodynamic profiles of IDP interactions differ from those...... of other protein-protein interactions. We find that ordered proteins and the disordered ones act as non-identical twins operating by similar principles but where the disordered protein complexes are on average less stable by 2.5 kcal mol(-1)....

  20. Towards a map of the Populus biomass protein-protein interaction network

    Energy Technology Data Exchange (ETDEWEB)

    Beers, Eric [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Brunner, Amy [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Helm, Richard [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Dickerman, Allan [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)

    2015-07-31

    Biofuels can be produced from a variety of plant feedstocks. The value of a particular feedstock for biofuels production depends in part on the degree of difficulty associated with the extraction of fermentable sugars from the plant biomass. The wood of trees is potentially a rich source of fermentable sugars. However, the sugars in wood exist in a tightly cross-linked matrix of cellulose, hemicellulose, and lignin, making them largely recalcitrant to release and fermentation for biofuels production. Before breeders and genetic engineers can effectively develop plants with reduced recalcitrance to fermentation, it is necessary to gain a better understanding of the fundamental biology of the mechanisms responsible for wood formation. Regulatory, structural, and enzymatic proteins are required for the complicated process of wood formation. To function properly, proteins must interact with other proteins. Yet, very few of the protein-protein interactions necessary for wood formation are known. The main objectives of this project were to 1) identify new protein-protein interactions relevant to wood formation, and 2) perform in-depth characterizations of selected protein-protein interactions. To identify relevant protein-protein interactions, we cloned a set of approximately 400 genes that were highly expressed in the wood-forming tissue (known as secondary xylem) of poplar (Populus trichocarpa). We tested whether the proteins encoded by these biomass genes interacted with each other in a binary matrix design using the yeast two-hybrid (Y2H) method for protein-protein interaction discovery. We also tested a subset of the 400 biomass proteins for interactions with all proteins present in wood-forming tissue of poplar in a biomass library screen design using Y2H. Together, these two Y2H screens yielded over 270 interactions involving over 75 biomass proteins. For the second main objective we selected several interacting pairs or groups of interacting proteins for in

  1. The effect of protein-protein and protein-membrane interactions on membrane fouling in ultrafiltration

    NARCIS (Netherlands)

    Huisman, I.H.; Prádanos, P.; Hernández, A.

    2000-01-01

    This study examined how protein-protein and protein-membrane interactions influence the filtration performance during the ultrafiltration of protein solutions over polymeric membranes. This was done by measuring flux, streaming potential, and protein transmission during filtration of bovine serum albumin

  2. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

    Science.gov (United States)

    Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

    2017-09-01

    Due to Ca²⁺-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.
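
    The record says only that CaMELS couples "protein sequence features" with a large-margin classifier; the actual feature set is not given here. As a hedged illustration of what a whole-sequence feature vector can look like, the sketch below computes plain amino acid composition and applies a linear decision function. The names `composition_features` and `linear_score` are hypothetical, and any weights passed in would have to come from a separately trained classifier.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def composition_features(sequence):
    """Fraction of each standard amino acid in the sequence (20 values).

    Illustrative feature choice only; not the published CaMELS features.
    Non-standard characters are ignored.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

def linear_score(features, weights, bias=0.0):
    """Decision value w.x + b of a linear (large-margin style) classifier."""
    return sum(f * w for f, w in zip(features, weights)) + bias

features = composition_features("AAKK")
print(features[AMINO_ACIDS.index("A")], features[AMINO_ACIDS.index("K")])
```

    A positive `linear_score` would be read as a predicted interactor under this toy setup; the threshold and sign convention are assumptions.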

  3. Yeast Interacting Proteins Database: YGL237C, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote... expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein

  4. Yeast Interacting Proteins Database: YKL002W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available ene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding prote...xpression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Sp

  5. Text mining improves prediction of protein functional sites.

    Directory of Open Access Journals (Sweden)

    Karin M Verspoor

    Full Text Available We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions.

  6. Text Mining Improves Prediction of Protein Functional Sites

    Science.gov (United States)

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
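
    The text-mining step described in this record, extracting residue mentions from the literature, can be approximated with a regular expression. The sketch below is a minimal assumption, not the LEAP-FS extractor: it recognizes only three-letter residue codes followed by a position number (for example "Asp102" or "Ser 195") and ignores one-letter codes, full residue names, and mutation notation.

```python
import re

# Three-letter codes of the 20 standard amino acids.
THREE_LETTER = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
                "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val")

# Code, optional hyphen or space, then the sequence position.
RESIDUE_PATTERN = re.compile(r"\b(" + THREE_LETTER + r")[- ]?(\d+)\b")

def extract_residue_mentions(text):
    """Return (residue code, position) pairs mentioned in the text."""
    return [(m.group(1), int(m.group(2)))
            for m in RESIDUE_PATTERN.finditer(text)]

sentence = "The catalytic triad of chymotrypsin is His57, Asp-102 and Ser 195."
print(extract_residue_mentions(sentence))
```

    Each extracted pair would then be treated as a candidate functional-site residue and compared against structure-based predictions.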

  7. Mapping Protein Interactions between Dengue Virus and Its Human and Insect Hosts

    Science.gov (United States)

    Doolittle, Janet M.; Gomez, Shawn M.

    2011-01-01

    Background Dengue fever is an increasingly significant arthropod-borne viral disease, with at least 50 million cases per year worldwide. As with other viral pathogens, dengue virus is dependent on its host to perform the bulk of functions necessary for viral survival and replication. To be successful, dengue must manipulate host cell biological processes towards its own ends, while avoiding elimination by the immune system. Protein-protein interactions between the virus and its host are one avenue through which dengue can connect and exploit these host cellular pathways and processes. Methodology/Principal Findings We implemented a computational approach to predict interactions between Dengue virus (DENV) and both of its hosts, Homo sapiens and the insect vector Aedes aegypti. Our approach is based on structural similarity between DENV and host proteins and incorporates knowledge from the literature to further support a subset of the predictions. We predict over 4,000 interactions between DENV and humans, as well as 176 interactions between DENV and A. aegypti. Additional filtering based on shared Gene Ontology cellular component annotation reduced the number of predictions to approximately 2,000 for humans and 18 for A. aegypti. Of 19 experimentally validated interactions between DENV and humans extracted from the literature, this method was able to predict nearly half (9). Additional predictions suggest specific interactions between virus and host proteins relevant to interferon signaling, transcriptional regulation, stress, and the unfolded protein response. Conclusions/Significance Dengue virus manipulates cellular processes to its advantage through specific interactions with the host's protein interaction network. The interaction networks presented here provide a set of hypotheses for further experimental investigation into the DENV life cycle as well as potential therapeutic targets. PMID:21358811

  8. Mapping protein interactions between Dengue virus and its human and insect hosts.

    Directory of Open Access Journals (Sweden)

    Janet M Doolittle

    Full Text Available BACKGROUND: Dengue fever is an increasingly significant arthropod-borne viral disease, with at least 50 million cases per year worldwide. As with other viral pathogens, dengue virus is dependent on its host to perform the bulk of functions necessary for viral survival and replication. To be successful, dengue must manipulate host cell biological processes towards its own ends, while avoiding elimination by the immune system. Protein-protein interactions between the virus and its host are one avenue through which dengue can connect and exploit these host cellular pathways and processes. METHODOLOGY/PRINCIPAL FINDINGS: We implemented a computational approach to predict interactions between Dengue virus (DENV) and both of its hosts, Homo sapiens and the insect vector Aedes aegypti. Our approach is based on structural similarity between DENV and host proteins and incorporates knowledge from the literature to further support a subset of the predictions. We predict over 4,000 interactions between DENV and humans, as well as 176 interactions between DENV and A. aegypti. Additional filtering based on shared Gene Ontology cellular component annotation reduced the number of predictions to approximately 2,000 for humans and 18 for A. aegypti. Of 19 experimentally validated interactions between DENV and humans extracted from the literature, this method was able to predict nearly half (9). Additional predictions suggest specific interactions between virus and host proteins relevant to interferon signaling, transcriptional regulation, stress, and the unfolded protein response. CONCLUSIONS/SIGNIFICANCE: Dengue virus manipulates cellular processes to its advantage through specific interactions with the host's protein interaction network. The interaction networks presented here provide a set of hypotheses for further experimental investigation into the DENV life cycle as well as potential therapeutic targets.
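
    The filtering step in this record keeps a predicted virus-host pair only when both proteins share a Gene Ontology cellular component annotation. A minimal sketch of that filter follows, assuming annotations are available as plain sets of term labels; the protein identifiers, the term labels, and the function name are placeholders and not data from the study. The last line simply restates the reported recovery of 9 out of 19 validated interactions as a fraction.

```python
def filter_by_shared_component(predicted_pairs, viral_cc, host_cc):
    """Keep (viral, host) pairs whose proteins share a cellular component term.

    viral_cc and host_cc map protein identifiers to sets of annotation labels.
    Proteins with no annotation never pass the filter.
    """
    kept = []
    for viral, host in predicted_pairs:
        shared = viral_cc.get(viral, set()) & host_cc.get(host, set())
        if shared:
            kept.append((viral, host))
    return kept

# Placeholder identifiers and labels, for illustration only.
pairs = [("V1", "H1"), ("V1", "H2"), ("V2", "H3")]
viral_cc = {"V1": {"membrane"}, "V2": {"nucleus"}}
host_cc = {"H1": {"membrane", "cytosol"}, "H2": {"nucleus"}, "H3": {"nucleus"}}

print(filter_by_shared_component(pairs, viral_cc, host_cc))

# Fraction of literature-validated interactions recovered, from the record.
recovered_fraction = 9 / 19
print(round(recovered_fraction, 2))
```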

  9. An overview of the prediction of protein DNA-binding sites.

    Science.gov (United States)

    Si, Jingna; Zhao, Rui; Wu, Rongling

    2015-03-06

    Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  10. A conserved mammalian protein interaction network.

    Directory of Open Access Journals (Sweden)

    Åsa Pérez-Bercoff

    Full Text Available Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.

  11. Cytoprophet: a Cytoscape plug-in for protein and domain interaction networks inference.

    Science.gov (United States)

    Morcos, Faruck; Lamanna, Charles; Sikora, Marcin; Izaguirre, Jesús

    2008-10-01

    Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM); the set cover approach maximum specificity set cover (MSSC) and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. http://cytoprophet.cse.nd.edu.
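
    Domain-based inference of the kind this record describes is often built on a noisy-OR model: two proteins are taken to interact if at least one pair of their domains interacts. The sketch below scores a protein pair under that model from given domain-pair probabilities. It is an assumption about the general family of methods, not Cytoprophet's code; estimating the domain-pair probabilities (for example by MLE with EM) is outside the sketch, and the domain labels and values are hypothetical.

```python
from itertools import product

def interaction_probability(domains_a, domains_b, domain_pair_prob):
    """Noisy-OR protein interaction probability.

    P(interact) = 1 - product over domain pairs of (1 - p_pair).
    domain_pair_prob maps frozenset({d1, d2}) to a probability;
    pairs missing from the mapping are treated as non-interacting.
    """
    p_no_interaction = 1.0
    for d1, d2 in product(domains_a, domains_b):
        p_pair = domain_pair_prob.get(frozenset((d1, d2)), 0.0)
        p_no_interaction *= 1.0 - p_pair
    return 1.0 - p_no_interaction

# Hypothetical domain-pair probabilities.
pair_probs = {
    frozenset(("D1", "D3")): 0.5,
    frozenset(("D2", "D3")): 0.2,
}
print(interaction_probability(["D1", "D2"], ["D3"], pair_probs))
```

    The resulting value plays the role of the probability score that the record says is attached to each predicted edge.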

  12. A Novel Approach for Protein-Named Entity Recognition and Protein-Protein Interaction Extraction

    Directory of Open Access Journals (Sweden)

    Meijing Li

    2015-01-01

    Full Text Available Many researchers focus on developing protein-named entity recognition (Protein-NER) or PPI extraction systems. However, studies of these two topics have not been well integrated, so the Protein-NER components of existing PPI extraction systems still need improvement. In this paper, we developed a protein-protein interaction extraction system named PPIMiner, based on Support Vector Machine (SVM) and parsing tree. PPIMiner consists of three main models: a natural language processing (NLP) model, a Protein-NER model, and a PPI discovery model. The Protein-NER model, which is named ProNER, identifies protein names using two methods: a dictionary-based method and a machine learning-based method. ProNER is capable of identifying more proteins than the dictionary-based Protein-NER models in other existing systems. The final PPIs extracted via the PPI discovery model are represented in detail: we show the protein interaction types and the occurrence frequency through two different methods. In the experiments, the results show that the performance achieved by our ProNER and PPI discovery model is better than that of other existing tools. PPIMiner applies this protein-named entity recognition approach and the parsing tree based PPI extraction method to improve the performance of PPI extraction. We also provide an easy-to-use interface to access the PPI database and an online system for PPI extraction and Protein-NER.

  13. Comparing side chain packing in soluble proteins, protein-protein interfaces, and transmembrane proteins.

    Science.gov (United States)

    Gaines, J C; Acebes, S; Virrueta, A; Butler, M; Regan, L; O'Hern, C S

    2018-05-01

    We compare side chain prediction and packing of core and non-core regions of soluble proteins, protein-protein interfaces, and transmembrane proteins. We first identified or created comparable databases of high-resolution crystal structures of these 3 protein classes. We show that the solvent-inaccessible cores of the 3 classes of proteins are equally densely packed. As a result, the side chains of core residues at protein-protein interfaces and in the membrane-exposed regions of transmembrane proteins can be predicted by the hard-sphere plus stereochemical constraint model with the same high prediction accuracies (>90%) as core residues in soluble proteins. We also find that for all 3 classes of proteins, as one moves away from the solvent-inaccessible core, the packing fraction decreases as the solvent accessibility increases. However, the side chain predictability remains high (80% within 30°) up to a relative solvent accessibility, rSASA≲0.3, for all 3 protein classes. Our results show that ≈40% of the interface regions in protein complexes are "core", that is, densely packed with side chain conformations that can be accurately predicted using the hard-sphere model. We propose packing fraction as a metric that can be used to distinguish real protein-protein interactions from designed, non-binding, decoys. Our results also show that cores of membrane proteins are the same as cores of soluble proteins. Thus, the computational methods we are developing for the analysis of the effect of hydrophobic core mutations in soluble proteins will be equally applicable to analyses of mutations in membrane proteins. © 2018 Wiley Periodicals, Inc.

  14. Phthalic Acid Chemical Probes Synthesized for Protein-Protein Interaction Analysis

    Directory of Open Access Journals (Sweden)

    Chin-Jen Wu

    2013-06-01

    Full Text Available Plasticizers are additives that are used to increase the flexibility of plastic during manufacturing. However, in injection molding processes, plasticizers cannot be generated with monomers because they can peel off from the plastics into the surrounding environment, water, or food, or become attached to skin. Among the various plasticizers that are used, 1,2-benzenedicarboxylic acid (phthalic acid) is a typical precursor to generate phthalates. In addition, phthalic acid is a metabolite of diethylhexyl phthalate (DEHP). According to the Gene Ontology gene/protein database, phthalates can cause genital diseases, cardiotoxicity, hepatotoxicity, nephrotoxicity, etc. In this study, a silanized linker (3-aminopropyl triethoxysilane, APTES) was deposited on silicon dioxide (SiO2) particles and phthalate chemical probes were manufactured from phthalic acid and APTES–SiO2. These probes could be used for detecting proteins that targeted phthalic acid and for protein-protein interactions. The phthalic acid chemical probes we produced were incubated with epithelioid cell lysates of normal rat kidney (NRK-52E) cells to detect the interactions between phthalic acid and NRK-52E extracted proteins. These chemical probes interacted with a number of chaperones such as protein disulfide-isomerase A6, heat shock proteins, and Serpin H1. Ingenuity Pathways Analysis (IPA) software showed that these chemical probes were a practical technique for protein-protein interaction analysis.

  15. Integration of relational and hierarchical network information for protein function prediction

    Directory of Open Access Journals (Sweden)

    Jiang Xiaoyu

    2008-08-01

    Full Text Available Abstract Background In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions. Results We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. 
Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased
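
    The 'true-path' post-processing that this record uses as a baseline can be sketched as a transitive closure over the term hierarchy: every ancestor of a predicted term receives a score at least as high as that term. The code below is a minimal illustration under assumed inputs (a child-to-parents mapping and per-term scores with hypothetical labels). It shows the post-processing baseline only, not the probabilistic classifier the record proposes.

```python
def ancestors(term, parents):
    """All ancestors of a term in a hierarchy given as child -> list of parents."""
    found, stack = set(), list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found

def enforce_true_path(scores, parents):
    """Raise each ancestor's score to at least the score of any descendant."""
    closed = dict(scores)
    for term, score in scores.items():
        for anc in ancestors(term, parents):
            closed[anc] = max(closed.get(anc, 0.0), score)
    return closed

# Hypothetical three-level hierarchy: leaf -> mid -> root.
toy_parents = {"leaf": ["mid"], "mid": ["root"]}
toy_scores = {"leaf": 0.9, "mid": 0.4, "root": 0.7}
print(enforce_true_path(toy_scores, toy_parents))
```

    After this step no term scores higher than any of its ancestors, which is the consistency property the record's classifier is said to satisfy without post-processing.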

  16. 3DProIN: Protein-Protein Interaction Networks and Structure Visualization.

    Science.gov (United States)

    Li, Hui; Liu, Chunmei

    2014-06-14

    3DProIN is a computational tool to visualize protein-protein interaction networks in both two dimensional (2D) and three dimensional (3D) view. It models protein-protein interactions in a graph and explores the biologically relevant features of the tertiary structures of each protein in the network. Properties such as color, shape and name of each node (protein) of the network can be edited in either 2D or 3D views. 3DProIN is implemented using 3D Java and C programming languages. The internet crawl technique is also used to parse dynamically grasped protein interactions from protein data bank (PDB). It is a Java applet component that is embedded in the web page and it can be used on different platforms including Linux, Mac and Windows using web browsers such as Firefox, Internet Explorer, Chrome and Safari. It also was converted into a Mac app and submitted to the App store as a free app. Mac users can also download the app from our website. 3DProIN is available for academic research at http://bicompute.appspot.com.

  17. Prediction of Protein Configurational Entropy (Popcoen).

    Science.gov (United States)

    Goethe, Martin; Gleixner, Jan; Fita, Ignacio; Rubi, J Miguel

    2018-03-13

    A knowledge-based method for configurational entropy prediction of proteins is presented; this methodology is extremely fast, compared to previous approaches, because it does not involve any type of configurational sampling. Instead, the configurational entropy of a query fold is estimated by evaluating an artificial neural network, which was trained on molecular-dynamics simulations of ∼1000 proteins. The predicted entropy can be incorporated into a large class of protein software based on cost-function minimization/evaluation, in which configurational entropy is currently neglected for performance reasons. Software of this type is used for all major protein tasks such as structure predictions, protein design, NMR and X-ray refinement, docking, and mutation effect predictions. Integrating the predicted entropy can yield a significant accuracy increase as we show exemplarily for native-state identification with the prominent protein software FoldX. The method has been termed Popcoen for Prediction of Protein Configurational Entropy. An implementation is freely available at http://fmc.ub.edu/popcoen/ .

  18. Potential disruption of protein-protein interactions by graphene oxide

    International Nuclear Information System (INIS)

    Feng, Mei; Kang, Hongsuk; Luan, Binquan; Yang, Zaixing; Zhou, Ruhong

    2016-01-01

    Graphene oxide (GO) is a promising novel nanomaterial with a wide range of potential biomedical applications due to its many intriguing properties. However, very little research has been conducted to study its possible adverse effects on protein-protein interactions (and thus subsequent toxicity to humans). Here, the potential cytotoxicity of GO is investigated at molecular level using large-scale, all-atom molecular dynamics simulations to explore the interaction mechanism between a protein dimer and a GO nanosheet oxidized at different levels. Our theoretical results reveal that GO nanosheet could intercalate between the two monomers of HIV-1 integrase dimer, disrupting the protein-protein interactions and eventually leading to dimer disassociation as graphene does [B. Luan et al., ACS Nano 9(1), 663 (2015)], albeit its insertion process is slower when compared with graphene due to the additional steric and attractive interactions. This study helps to better understand the toxicity of GO to cell functions which could shed light on how to improve its biocompatibility and biosafety for its wide potential biomedical applications.

  19. Potential disruption of protein-protein interactions by graphene oxide

    Energy Technology Data Exchange (ETDEWEB)

    Feng, Mei [Department of Physics, Institute of Quantitative Biology, Zhejiang University, Hangzhou 310027 (China); Kang, Hongsuk; Luan, Binquan [Computational Biological Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598 (United States); Yang, Zaixing [Institute of Quantitative Biology and Medicine, SRMP and RAD-X, and Collaborative Innovation Center of Radiation Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123 (China); Zhou, Ruhong, E-mail: ruhong@us.ibm.com [Department of Physics, Institute of Quantitative Biology, Zhejiang University, Hangzhou 310027 (China); Computational Biological Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598 (United States); Department of Chemistry, Columbia University, New York, New York 10027 (United States)

    2016-06-14

    Graphene oxide (GO) is a promising novel nanomaterial with a wide range of potential biomedical applications due to its many intriguing properties. However, very little research has been conducted to study its possible adverse effects on protein-protein interactions (and thus subsequent toxicity to humans). Here, the potential cytotoxicity of GO is investigated at molecular level using large-scale, all-atom molecular dynamics simulations to explore the interaction mechanism between a protein dimer and a GO nanosheet oxidized at different levels. Our theoretical results reveal that GO nanosheet could intercalate between the two monomers of HIV-1 integrase dimer, disrupting the protein-protein interactions and eventually leading to dimer disassociation as graphene does [B. Luan et al., ACS Nano 9(1), 663 (2015)], albeit its insertion process is slower when compared with graphene due to the additional steric and attractive interactions. This study helps to better understand the toxicity of GO to cell functions which could shed light on how to improve its biocompatibility and biosafety for its wide potential biomedical applications.

  20. Interaction between plate make and protein in protein crystallisation screening.

    Directory of Open Access Journals (Sweden)

    Gordon J King

    Full Text Available BACKGROUND: Protein crystallisation screening involves the parallel testing of large numbers of candidate conditions with the aim of identifying conditions suitable as a starting point for the production of diffraction quality crystals. Generally, condition screening is performed in 96-well plates. While previous studies have examined the effects of protein construct, protein purity, or crystallisation condition ingredients on protein crystallisation, few have examined the effect of the crystallisation plate. METHODOLOGY/PRINCIPAL FINDINGS: We performed a statistically rigorous examination of protein crystallisation, and evaluated interactions between crystallisation success and plate row/column, different plates of the same make, different plate makes and different proteins. From our analysis of protein crystallisation, we found a significant interaction between plate make and the specific protein being crystallised. CONCLUSIONS/SIGNIFICANCE: Protein crystal structure determination is the principal method for determining protein structure but is limited by the need to produce crystals of the protein under study. Many important proteins are difficult to crystallise, so identifying factors that assist crystallisation could open up the structure determination of these more challenging targets. Our findings suggest that protein crystallisation success may be improved by matching a protein with its optimal plate make.

  1. RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-05-01

    Full Text Available Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies that are capable of discovering large-scale PPIs. Although many high-throughput biological technologies have been proposed to detect PPIs, there are unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM)-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets, and achieved very high accuracies of 92.98% and 95.58% respectively, which is significantly better than previous works. In addition, we also obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets, C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli, for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method is clearly better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool. 
To facilitate extensive studies for future proteomics research, we developed
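
The Average Blocks representation mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state how many blocks are used or how block boundaries are chosen, so the function name `average_blocks`, the default of 20 blocks, and the integer-division boundaries are all assumptions made for illustration, not the authors' implementation. The idea is to split the rows of a PSSM (one row per residue) into contiguous blocks and average each column within each block, giving a fixed-length vector regardless of sequence length.

```python
def average_blocks(pssm, n_blocks=20):
    """Return a fixed-length feature vector from a PSSM.

    pssm     -- list of rows, one per residue; each row is a list of scores
    n_blocks -- number of contiguous blocks along the sequence (assumed value)
    """
    length = len(pssm)
    if length < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    n_cols = len(pssm[0])
    features = []
    for b in range(n_blocks):
        # Block boundaries by integer division; every block is non-empty
        # because length >= n_blocks.
        start = b * length // n_blocks
        end = (b + 1) * length // n_blocks
        block = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in block) / len(block))
    return features


# Toy example: 4 residues, 2 score columns, 2 blocks.
toy_pssm = [[1, 2], [3, 4], [5, 6], [7, 8]]
toy_features = average_blocks(toy_pssm, n_blocks=2)
```

With a 20-column PSSM and 20 blocks, this sketch would yield 400 features per protein, which could then be passed to a dimensionality-reduction step and a classifier.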

  2. An analysis pipeline for the inference of protein-protein interaction networks

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Singhal, Mudita; Daly, Don S.; Gilmore, Jason M.; Cannon, William R.; Domico, Kelly O.; White, Amanda M.; Auberry, Deanna L.; Auberry, Kenneth J.; Hooker, Brian S.; Hurst, G. B.; McDermott, Jason E.; McDonald, W. H.; Pelletier, Dale A.; Schmoyer, Denise A.; Wiley, H. S.

    2009-12-01

    An analysis pipeline has been created for deployment of a novel algorithm, the Bayesian Estimator of Protein-Protein Association Probabilities (BEPro), for use in the reconstruction of protein-protein interaction networks. We have combined the Software Environment for BIological Network Inference (SEBINI), an interactive environment for the deployment and testing of network inference algorithms that use high-throughput data, and the Collective Analysis of Biological Interaction Networks (CABIN), software that allows integration and analysis of protein-protein interaction and gene-to-gene regulatory evidence obtained from multiple sources, to allow interactions computed by BEPro to be stored, visualized, and further analyzed. Incorporating BEPro into SEBINI and automatically feeding the resulting inferred network into CABIN, we have created a structured workflow for protein-protein network inference and supplemental analysis from sets of mass spectrometry bait-prey experiment data. SEBINI demo site: https://www.emsl.pnl.gov/SEBINI/ Contact: ronald.taylor@pnl.gov. BEPro is available at http://www.pnl.gov/statistics/BEPro3/index.htm. Contact: ds.daly@pnl.gov. CABIN is available at http://www.sysbio.org/dataresources/cabin.stm. Contact: mudita.singhal@pnl.gov.

  3. STRING 8--a global view on proteins and their functional interactions in 630 organisms

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Kuhn, Michael; Stark, Manuel

    2008-01-01

    Functional partnerships between proteins are at the core of complex cellular phenotypes, and the networks formed by interacting proteins provide researchers with crucial scaffolds for modeling, data reduction and annotation. STRING is a database and web resource dedicated to protein-protein interactions, including both physical and functional interactions. It weights and integrates information from numerous sources, including experimental repositories, computational prediction methods and public text collections, thus acting as a meta-database that maps all interaction evidence onto a common set of genomes and proteins. The most important new developments in STRING 8 over previous releases include a URL-based programming interface, which can be used to query STRING from other resources, improved interaction prediction via genomic neighborhood in prokaryotes, and the inclusion of protein structures...

  4. Simplified Method for Predicting a Functional Class of Proteins in Transcription Factor Complexes

    KAUST Repository

    Piatek, Marek J.

    2013-07-12

    Background: Initiation of transcription is essential for most of the cellular responses to environmental conditions and for cell and tissue specificity. This process is regulated through numerous proteins, their ligands and mutual interactions, as well as interactions with DNA. The key such regulatory proteins are transcription factors (TFs) and transcription co-factors (TcoFs). TcoFs are important since they modulate the transcription initiation process through interaction with TFs. In eukaryotes, transcription requires that TFs form different protein complexes with various nuclear proteins. To better understand transcription regulation, it is important to know the functional class of proteins interacting with TFs during transcription initiation. Such information is not fully available, since not all proteins that act as TFs or TcoFs are yet annotated as such, due to generally partial functional annotation of proteins. In this study we have developed a method to predict, using only sequence composition of the interacting proteins, the functional class of human TF binding partners to be (i) TF, (ii) TcoF, or (iii) other nuclear protein. This allows for complementing the annotation of the currently known pool of nuclear proteins. Since only the knowledge of protein sequences is required in addition to protein interaction, the method should be easily applicable to many species. Results: Based on experimentally validated interactions between human TFs with different TFs, TcoFs and other nuclear proteins, our two classification systems (implemented as a web-based application) achieve high accuracies in distinguishing TFs and TcoFs from other nuclear proteins, and TFs from TcoFs respectively. Conclusion: As demonstrated, given the fact that two proteins are capable of forming direct physical interactions and using only information about their sequence composition, we have developed a completely new method for predicting a functional class of TF interacting protein partners.

  5. Targeting protein-protein interaction between MLL1 and reciprocal proteins for leukemia therapy.

    Science.gov (United States)

    Wang, Zhi-Hui; Li, Dong-Dong; Chen, Wei-Lin; You, Qi-Dong; Guo, Xiao-Ke

    2018-01-15

    The mixed lineage leukemia protein-1 (MLL1), as a lysine methyltransferase, predominantly regulates the methylation of histone H3 lysine 4 (H3K4) and functions in hematopoietic stem cell (HSC) self-renewal. The MLL1 gene fuses with partner genes, resulting in the generation of MLL1 fusion proteins (MLL1-FPs), which are frequently detected in acute leukemia. During leukemogenesis, many proteins cooperate with MLL1 to form multiprotein complexes that drive the dysregulation of H3K4 methylation, the overexpression of homeobox (HOX) cluster genes, and the consequent generation of leukemia. Hence, disrupting the interactions between MLL1 and the reciprocal proteins has been considered to be a new treatment strategy for leukemia. Here, we reviewed potential protein-protein interactions (PPIs) between MLL1 and its reciprocal proteins, and summarized the inhibitors that target MLL1 PPIs. The druggability of MLL1 PPIs for leukemia was also discussed. Copyright © 2017. Published by Elsevier Ltd.

  6. Can understanding the packing of side chains improve the design of protein-protein interactions?

    Science.gov (United States)

    Zhou, Alice; O'Hern, Corey; Regan, Lynne

    2011-03-01

    With the long-term goal of improving the design of protein-protein interactions, we have begun extensive computational studies to understand how side-chains of key residues of binding partners geometrically fit together at protein-peptide interfaces (e.g. the tetratricopeptide repeat protein and its cognate peptide). We describe simple atomic-scale models of hydrophobic dipeptides, which include hard-core repulsion, bond length and angle constraints, and Van der Waals attraction. By completely enumerating all minimal energy structures in these systems, we are able to reproduce important features of the probability distributions of side chain dihedral angles of hydrophobic residues in the protein data bank. These results are the crucial first step in developing computational models that can predict the side chain conformations of residues at protein-peptide interfaces. CSO acknowledges support from NSF grant no. CMMT-1006527.
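
As a rough illustration of the kind of non-bonded term such a model might contain, the sketch below combines a hard-core repulsion with an attractive tail. The abstract does not give the functional form or parameters, so the inverse-sixth-power attraction, the names `pair_energy` and `total_energy`, and the parameters `sigma` (contact distance) and `epsilon` (well depth) are assumptions for illustration only; bond length and angle constraints are not modeled here.

```python
import math
from itertools import combinations


def pair_energy(r, sigma, epsilon):
    """Energy of two non-bonded atoms separated by distance r.

    Overlap (r < sigma) is forbidden by the hard core; beyond contact
    an assumed -epsilon * (sigma / r)**6 attraction applies.
    """
    if r < sigma:
        return math.inf
    return -epsilon * (sigma / r) ** 6


def total_energy(coords, sigma, epsilon):
    """Sum the pair energy over all distinct atom pairs in coords."""
    return sum(
        pair_energy(math.dist(a, b), sigma, epsilon)
        for a, b in combinations(coords, 2)
    )


# Three atoms on a line, spaced exactly one contact distance apart.
example_energy = total_energy([(0, 0, 0), (1, 0, 0), (2, 0, 0)], sigma=1.0, epsilon=1.0)
```

Enumerating minimal-energy structures, as the abstract describes, would then amount to scanning the allowed dihedral angles and keeping the conformations with the lowest finite energy.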

  7. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation, which is essential to our understanding of how biological systems and processes operate. In this master's thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  8. Mass spectrometric analysis of protein interactions

    DEFF Research Database (Denmark)

    Borch, Jonas; Jørgensen, Thomas J. D.; Roepstorff, Peter

    2005-01-01

    Mass spectrometry is a powerful tool for identification of interaction partners and structural characterization of protein interactions because of its high sensitivity, mass accuracy and tolerance towards sample heterogeneity. Several tools that allow studies of protein interaction are now available and recent developments that increase the confidence of studies of protein interaction by mass spectrometry include quantification of affinity-purified proteins by stable isotope labeling and reagents for surface topology studies that can be identified by mass-contributing reporters (e.g. isotope labels, cleavable cross-linkers or fragment ions). The use of mass spectrometers to study protein interactions using deuterium exchange and for analysis of intact protein complexes recently has progressed considerably.

  9. A method for investigating protein-protein interactions related to Salmonella typhimurium pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Chowdhury, Saiful M. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Shi, Liang [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Yoon, Hyunjin [Dartmouth College, Hanover, NH (United States); Ansong, Charles [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Rommereim, Leah M. [Dartmouth College, Hanover, NH (United States); Norbeck, Angela D. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Auberry, Kenneth J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Moore, R. J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Adkins, Joshua N. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Heffron, Fred [Oregon Health and Science Univ., Portland, OR (United States); Smith, Richard D. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2009-02-10

    We successfully modified an existing method to investigate protein-protein interactions in the pathogenic bacterium Salmonella typhimurium (STM). This method includes i) addition of a histidine-biotin-histidine tag to the bait proteins via recombinant DNA techniques; ii) in vivo cross-linking with formaldehyde; iii) tandem affinity purification of bait proteins under fully denaturing conditions; and iv) identification of the proteins cross-linked to the bait proteins by liquid chromatography in conjunction with tandem mass spectrometry. In vivo cross-linking stabilized protein interactions and permitted the subsequent two-step purification to be conducted under denaturing conditions. The two-step purification greatly reduced nonspecific binding of non-cross-linked proteins to bait proteins. Two different negative controls were employed to reduce false-positive identification. In an initial demonstration of this approach, we tagged three selected STM proteins (HimD, PduB and PhoP) with known binding partners that ranged from stable (e.g., HimD) to transient (e.g., PhoP). Distinct sets of interacting proteins were identified with each bait protein, including the known binding partners such as HimA for HimD, as well as anticipated and unexpected binding partners. Our results suggest that novel protein-protein interactions may be critical to pathogenesis by Salmonella typhimurium.

  10. Molecular imaging of drug-modulated protein-protein interactions in living subjects.

    Science.gov (United States)

    Paulmurugan, Ramasamy; Massoud, Tarik F; Huang, Jing; Gambhir, Sanjiv S

    2004-03-15

    Networks of protein interactions mediate cellular responses to environmental stimuli and direct the execution of many different cellular functional pathways. Small molecules synthesized within cells or recruited from the external environment mediate many protein interactions. The study of small molecule-mediated interactions of proteins is important to understand abnormal signal transduction pathways in cancer and in drug development and validation. In this study, we used split synthetic renilla luciferase (hRLUC) protein fragment-assisted complementation to evaluate heterodimerization of the human proteins FRB and FKBP12 mediated by the small molecule rapamycin. The concentration of rapamycin required for efficient dimerization and that of its competitive binder ascomycin required for dimerization inhibition were studied in cell lines. The system was dually modulated in cell culture at the transcription level, by controlling nuclear factor kappaB promoter/enhancer elements using tumor necrosis factor alpha, and at the interaction level, by controlling the concentration of the dimerizer rapamycin. The rapamycin-mediated dimerization of FRB and FKBP12 also was studied in living mice by locating, quantifying, and timing the hRLUC complementation-based bioluminescence imaging signal using a cooled charge-coupled device camera. This split reporter system can be used to efficiently screen small molecule drugs that modulate protein-protein interactions and also to assess drugs in living animals. Both are essential steps in the preclinical evaluation of candidate pharmaceutical agents targeting protein-protein interactions, including signaling pathways in cancer cells.

  11. Arabidopsis mRNA polyadenylation machinery: comprehensive analysis of protein-protein interactions and gene expression profiling

    Directory of Open Access Journals (Sweden)

    Mo Min

    2008-05-01

    Full Text Available Abstract Background The polyadenylation of mRNA is one of the critical processing steps during expression of almost all eukaryotic genes. It is tightly integrated with transcription, particularly its termination, as well as other RNA processing events, i.e. capping and splicing. The poly(A) tail protects the mRNA from unregulated degradation, and it is required for nuclear export and translation initiation. In recent years, it has been demonstrated that the polyadenylation process is also involved in the regulation of gene expression. The polyadenylation process requires two components, the cis-elements on the mRNA and a group of protein factors that recognize the cis-elements and produce the poly(A) tail. Here we report a comprehensive pairwise protein-protein interaction mapping and gene expression profiling of the mRNA polyadenylation protein machinery in Arabidopsis. Results By protein sequence homology search using human and yeast polyadenylation factors, we identified 28 proteins that may be components of Arabidopsis polyadenylation machinery. To elucidate the protein network and their functions, we first tested their protein-protein interaction profiles. Out of 320 pair-wise protein-protein interaction assays done using the yeast two-hybrid system, 56 (~17%) showed positive interactions. 15 of these interactions were further tested, and all were confirmed by co-immunoprecipitation and/or in vitro co-purification. These interactions organize into three distinct hubs involving the Arabidopsis polyadenylation factors. These hubs are centered around AtCPSF100, AtCLPS, and AtFIPS. The first two are similar to complexes seen in mammals, while the third one stands out as unique to plants. When comparing the gene expression profiles extracted from publicly available microarray datasets, some of the polyadenylation related genes showed tissue-specific expression, suggestive of potential different polyadenylation complex configurations. 
Conclusion An

  12. Efficient extraction of protein-protein interactions from full-text articles.

    Science.gov (United States)

    Hakenberg, Jörg; Leaman, Robert; Vo, Nguyen Ha; Jonnalagadda, Siddhartha; Sullivan, Ryan; Miller, Christopher; Tari, Luis; Baral, Chitta; Gonzalez, Graciela

    2010-01-01

    Proteins and their interactions govern virtually all cellular processes, such as regulation, signaling, metabolism, and structure. Most experimental findings pertaining to such interactions are discussed in research papers, which, in turn, get curated by protein interaction databases. Authors, editors, and publishers benefit from efforts to alleviate the tasks of searching for relevant papers, evidence for physical interactions, and proper identifiers for each protein involved. The BioCreative II.5 community challenge addressed these tasks in a competition-style assessment to evaluate and compare different methodologies, to make aware of the increasing accuracy of automated methods, and to guide future implementations. In this paper, we present our approaches for protein-named entity recognition, including normalization, and for extraction of protein-protein interactions from full text. Our overall goal is to identify efficient individual components, and we compare various compositions to handle a single full-text article in between 10 seconds and 2 minutes. We propose strategies to transfer document-level annotations to the sentence-level, which allows for the creation of a more fine-grained training corpus; we use this corpus to automatically derive around 5,000 patterns. We rank sentences by relevance to the task of finding novel interactions with physical evidence, using a sentence classifier built from this training corpus. Heuristics for paraphrasing sentences help to further remove unnecessary information that might interfere with patterns, such as additional adjectives, clauses, or bracketed expressions. In BioCreative II.5, we achieved an f-score of 22 percent for finding protein interactions, and 43 percent for mapping proteins to UniProt IDs; disregarding species, f-scores are 30 percent and 55 percent, respectively. On average, our best-performing setup required around 2 minutes per full text. All data and pattern sets as well as Java classes that
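
For readers unfamiliar with the metric reported above, the f-score is commonly defined as the harmonic mean of precision and recall. The sketch below shows that standard definition; the function name `f_score` and the counts used in the example are hypothetical and are not taken from the BioCreative II.5 evaluation.

```python
def f_score(true_pos, false_pos, false_neg):
    """Balanced f-score (harmonic mean of precision and recall) from counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 22 correct interactions, 78 spurious, 78 missed.
example_f = f_score(true_pos=22, false_pos=78, false_neg=78)
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high f-score by maximizing only precision or only recall.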

  13. A selection that reports on protein-protein interactions within a thermophilic bacterium.

    Science.gov (United States)

    Nguyen, Peter Q; Silberg, Jonathan J

    2010-07-01

    Many proteins can be split into fragments that exhibit enhanced function upon fusion to interacting proteins. While this strategy has been widely used to create protein-fragment complementation assays (PCAs) for discovering protein-protein interactions within mesophilic organisms, similar assays have not yet been developed for studying natural and engineered protein complexes at the temperatures where thermophilic microbes grow. We describe the development of a selection for protein-protein interactions within Thermus thermophilus that is based upon growth complementation by fragments of Thermotoga neapolitana adenylate kinase (AK(Tn)). Complementation studies with an engineered thermophile (PQN1) that is not viable above 75 degrees C because its adk gene has been replaced by a Geobacillus stearothermophilus ortholog revealed that growth could be restored at 78 degrees C by a vector that coexpresses polypeptides corresponding to residues 1-79 and 80-220 of AK(Tn). In contrast, PQN1 growth was not complemented by AK(Tn) fragments harboring a C156A mutation within the zinc-binding tetracysteine motif unless these fragments were fused to Thermotoga maritima chemotaxis proteins that heterodimerize (CheA and CheY) or homodimerize (CheX). This enhanced complementation is interpreted as arising from chemotaxis protein-protein interactions, since AK(Tn)-C156A fragments having only one polypeptide fused to a chemotaxis protein did not complement PQN1 to the same extent. This selection increases the maximum temperature where a PCA can be used to engineer thermostable protein complexes and to map protein-protein interactions.

  14. Intercellular protein-protein interactions at synapses.

    Science.gov (United States)

    Yang, Xiaofei; Hou, Dongmei; Jiang, Wei; Zhang, Chen

    2014-06-01

    Chemical synapses are asymmetric intercellular junctions through which neurons send nerve impulses to communicate with other neurons or excitable cells. The appropriate formation of synapses, both spatially and temporally, is essential for brain function and depends on the intercellular protein-protein interactions of cell adhesion molecules (CAMs) at synaptic clefts. The CAM proteins link pre- and post-synaptic sites, and play essential roles in promoting synapse formation and maturation, maintaining synapse number and type, accumulating neurotransmitter receptors and ion channels, controlling neuronal differentiation, and even regulating synaptic plasticity directly. Alteration of the interactions of CAMs leads to structural and functional impairments, which result in many neurological disorders, such as autism, Alzheimer's disease and schizophrenia. Therefore, it is crucial to understand the functions of CAMs during development and in the mature neural system, as well as in the pathogenesis of some neurological disorders. Here, we review the function of the major classes of CAMs, and how dysfunction of CAMs relates to several neurological disorders.

  15. Supervised maximum-likelihood weighting of composite protein networks for complex prediction

    Directory of Open Access Journals (Sweden)

    Yong Chern Han

    2012-12-01

    Full Text Available Abstract Background Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI) data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions. Furthermore, many transient interactions are detected between proteins that are not from the same complex, while not all proteins from the same complex may actually interact. As a result, predicted complexes often do not match true complexes well, and many true complexes go undetected. Results We address these challenges by integrating PPI data with other heterogeneous data sources to construct a composite protein network, and using a supervised maximum-likelihood approach to weight each edge based on its posterior probability of belonging to a complex. We then use six different clustering algorithms, and an aggregative clustering strategy, to discover complexes in the weighted network. We test our method on Saccharomyces cerevisiae and Homo sapiens, and show that complex discovery is improved: compared to previously proposed supervised and unsupervised weighting approaches, our method recalls more known complexes, achieves higher precision at all recall levels, and generates novel complexes of greater functional similarity. Furthermore, our maximum-likelihood approach allows learned parameters to be used to visualize and evaluate the evidence of novel predictions, aiding human judgment of their credibility. Conclusions Our approach integrates multiple data sources with supervised learning to create a weighted composite protein network, and uses six clustering algorithms with an aggregative clustering strategy to
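
One way to picture supervised maximum-likelihood edge weighting is the minimal sketch below. The abstract does not specify the model, the data sources, or how their scores are discretized, so everything here is an assumption for illustration: a naive-Bayes-style model with conditionally independent, already-discretized sources, and hypothetical names `fit_likelihoods` and `edge_weight`. Likelihoods are estimated as simple relative frequencies (the maximum-likelihood estimate) from labeled training edges, and each edge is weighted by its posterior probability of being co-complex.

```python
from collections import Counter


def fit_likelihoods(training_edges):
    """Count feature values per class from (features, is_co_complex) pairs.

    features is a dict mapping a data-source name to a discretized value.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for feats, is_co_complex in training_edges:
        if is_co_complex:
            n_pos += 1
        else:
            n_neg += 1
        counts = pos if is_co_complex else neg
        for source, value in feats.items():
            counts[(source, value)] += 1
    if n_pos == 0 or n_neg == 0:
        raise ValueError("training data must contain both classes")
    return pos, neg, n_pos, n_neg


def edge_weight(feats, model):
    """Posterior probability that an edge is co-complex, given its features."""
    pos, neg, n_pos, n_neg = model
    total = n_pos + n_neg
    p, q = n_pos / total, n_neg / total  # class priors
    for source, value in feats.items():
        p *= pos[(source, value)] / n_pos  # ML likelihood under co-complex
        q *= neg[(source, value)] / n_neg  # ML likelihood under non-co-complex
    return p / (p + q) if p + q else 0.0


# Hypothetical training set with a single discretized source named "ppi".
train = [({"ppi": "high"}, True), ({"ppi": "high"}, True), ({"ppi": "high"}, False),
         ({"ppi": "low"}, False), ({"ppi": "low"}, False), ({"ppi": "low"}, False)]
model = fit_likelihoods(train)
```

The resulting weights could then be handed to any clustering algorithm that accepts a weighted network; inspecting the per-source likelihoods is one way the learned parameters might help a reader judge a novel prediction.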

  16. An Overview of the Prediction of Protein DNA-Binding Sites

    Directory of Open Access Journals (Sweden)

    Jingna Si

    2015-03-01

    Full Text Available Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  17. Integral UBL domain proteins: a family of proteasome interacting proteins

    DEFF Research Database (Denmark)

    Hartmann-Petersen, Rasmus; Gordon, Colin

    2004-01-01

    The family of ubiquitin-like (UBL) domain proteins (UDPs) comprises a conserved group of proteins involved in a multitude of different cellular activities. However, recent studies on UBL-domain proteins indicate that these proteins appear to share a common property in their ability to interact...

  18. Holistic Approach to Partial Covalent Interactions in Protein Structure Prediction and Design with Rosetta.

    Science.gov (United States)

    Combs, Steven A; Mueller, Benjamin K; Meiler, Jens

    2018-05-29

    Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identify and evaluate these PCIs through analyzing the geometry between participating atoms. However, we hypothesize that PCIs can be unified through a simplified electron orbital representation. To test this hypothesis, we have introduced orbital-based chemical descriptors for PCIs into Rosetta, called the PCI score function. Optimal geometries for the PCIs are derived from a statistical analysis of high-quality protein structures obtained from the Protein Data Bank (PDB), and the relative orientation of electron-deficient hydrogen atoms and electron-rich lone pair or π orbitals is evaluated. We demonstrate that native-like geometries of hydrogen bonds, salt bridges, cation-π, and π-π interactions are recapitulated during minimization of protein conformation. The packing density of tested protein structures increased from 0.62 with the standard score function to 0.64 with the PCI score function, closer to the native value of 0.70. Overall, rotamer recovery improved when using the PCI score function (75%) as compared to the standard Rosetta score function (74%). The PCI score function represents an improvement over the standard Rosetta score function for protein model scoring; in addition, it provides a platform for future directions in the analysis of small molecule to protein interactions, which depend on partial covalent interactions.

  19. Protein-x of hepatitis B virus in interaction with CCAAT/enhancer-binding protein α (C/EBPα) - an in silico analysis approach

    Directory of Open Access Journals (Sweden)

    Mohamadkhani Ashraf

    2011-10-01

    Full Text Available Abstract Background Even though many functions of protein-x from the Hepatitis B virus (HBV) have been revealed, the nature of protein-x is still unknown. This protein is well known for its transactivation activity through interaction with several cellular transcription factors; it is also known as an oncogene. In this work, we have presented computational approaches to design a model to show the structure of protein-x and its respective binding sites associated with the CCAAT/enhancer-binding protein α (C/EBPα). C/EBPα belongs to the bZip family of transcription factors, which activates transcription of several genes through its binding sites in liver and fat cells. C/EBPα has been shown to bind and modulate enhancer I and the enhancer II/core promoter of HBV. In this study, using bioinformatics tools, we tried to present a reliable model for the protein-x interaction with C/EBPα. Results The amino acid sequence of protein-x was extracted from UniProt [UniProt:Q80IU5] and the x-ray crystal structure of the partial CCAAT-enhancer α [PDB:1NWQ] was retrieved from the Protein Data Bank (PDB). Similarity search for protein-x was carried out by psi-blast and bl2seq using NCBI [GenBank: BAC65106.1], and Local Meta-Threading-Server (LOMETS) was used as a threading server for determining the maximum tertiary structure similarities. Advanced MODELLER was implemented to design a comparative model; however, due to the lack of a suitable template, QUARK was used for ab initio tertiary structure prediction. The PDB-blast search indicated a maximum of 23% sequence identity and 33% similarity with the crystal structure of the porcine reproductive and respiratory syndrome virus leader protease Nsp1α [PDB:3IFU]. This meant that protein-x does not have a suitable template to predict its tertiary structure using comparative modeling tools; therefore, we used QUARK as an ab initio 3D prediction approach. Docking results from the ab initio tertiary structure of

  20. On the role of electrostatics on protein-protein interactions

    Science.gov (United States)

    Zhang, Zhe; Witham, Shawn; Alexov, Emil

    2011-01-01

    The role of electrostatics on protein-protein interactions and binding is reviewed in this article. A brief outline of the computational modeling, in the framework of continuum electrostatics, is presented and basic electrostatic effects occurring upon the formation of the complex are discussed. The role of the salt concentration and pH of the water phase on protein-protein binding free energy is demonstrated and indicates that the increase of the salt concentration tends to weaken the binding, an observation that is attributed to the optimization of the charge-charge interactions across the interface. It is pointed out that the pH-optimum (pH of optimal binding affinity) varies among the protein-protein complexes, and perhaps is a result of their adaptation to a particular subcellular compartment. At the end, the similarities and differences between hetero- and homo-complexes are outlined and discussed with respect to the binding mode and charge complementarity. PMID:21572182
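
    The salt effect mentioned in this abstract can be illustrated with a screened Coulomb (Debye-Hückel) pair energy. This is only a minimal sketch and not the continuum-electrostatics modeling the review outlines: it assumes water near 25 °C (relative permittivity 78.5), a 1:1 salt, and point charges, and the function names are hypothetical.

```python
import math

COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)
EPS_WATER = 78.5       # assumed relative permittivity of water near 25 C


def debye_length_angstrom(ionic_strength_molar):
    """Approximate Debye length for a 1:1 salt in water at 25 C (3.04 A / sqrt(I))."""
    return 3.04 / math.sqrt(ionic_strength_molar)


def screened_coulomb(q1, q2, r_angstrom, ionic_strength_molar):
    """Debye-Hueckel pair energy (kcal/mol) between two point charges (units of e)."""
    lam = debye_length_angstrom(ionic_strength_molar)
    bare = COULOMB_KCAL * q1 * q2 / (EPS_WATER * r_angstrom)
    return bare * math.exp(-r_angstrom / lam)


# An opposite-charge pair 5 Angstrom apart at low and high salt (illustrative values).
e_low_salt = screened_coulomb(+1, -1, 5.0, 0.01)
e_high_salt = screened_coulomb(+1, -1, 5.0, 0.5)
```

    In this toy model the attraction between opposite charges is less favorable at 0.5 M than at 0.01 M, which matches the qualitative trend of weaker binding at higher salt.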

  1. Noise reduction in protein-protein interaction graphs by the implementation of a novel weighting scheme

    Directory of Open Access Journals (Sweden)

    Moschopoulos Charalampos

    2011-06-01

    Full Text Available Abstract Background Recent technological advances applied to biology such as yeast-two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of protein interaction networks. These interaction networks represent a rich, yet noisy, source of data that could be used to extract meaningful information, such as protein complexes. Several interaction network weighting schemes have been proposed so far in the literature in order to eliminate the noise inherent in interactome data. In this paper, we propose a novel weighting scheme and apply it to the S. cerevisiae interactome. Complex prediction rates are improved by up to 39%, depending on the clustering algorithm applied. Results We adopt a two-step procedure. During the first step, by applying both novel and well-established protein-protein interaction (PPI) weighting methods, weights are introduced to the original interactome graph based on the confidence level that a given interaction is a true-positive one. The second step applies clustering using established algorithms in the field of graph theory, as well as two variations of Spectral clustering. The clustered interactome networks are also cross-validated against the confirmed protein complexes present in the MIPS database. Conclusions The results of our experimental work demonstrate that interactome graph weighting methods clearly improve the clustering results of several clustering algorithms. Moreover, our proposed weighting scheme outperforms other approaches of PPI graph weighting.

  2. Identification of structural protein-protein interactions of herpes simplex virus type 1.

    Science.gov (United States)

    Lee, Jin H; Vittone, Valerio; Diefenbach, Eve; Cunningham, Anthony L; Diefenbach, Russell J

    2008-09-01

    In this study we have defined protein-protein interactions between the structural proteins of herpes simplex virus type 1 (HSV-1) using a LexA yeast two-hybrid system. The majority of the capsid, tegument and envelope proteins of HSV-1 were screened in a matrix approach. A total of 40 binary interactions were detected including 9 out of 10 previously identified tegument-tegument interactions (Vittone, V., Diefenbach, E., Triffett, D., Douglas, M.W., Cunningham, A.L., and Diefenbach, R.J., 2005. Determination of interactions between tegument proteins of herpes simplex virus type 1. J. Virol. 79, 9566-9571). A total of 12 interactions involving the capsid protein pUL35 (VP26) and 11 interactions involving the tegument protein pUL46 (VP11/12) were identified. The most significant novel interactions detected in this study, which are likely to play a role in viral assembly, include pUL35-pUL37 (capsid-tegument), pUL46-pUL37 (tegument-tegument) and pUL49 (VP22)-pUS9 (tegument-envelope). This information will provide further insights into the pathways of HSV-1 assembly and the identified interactions are potential targets for new antiviral drugs.

  3. Yeast Interacting Proteins Database: YPR103W, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available tein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors...gulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf

  4. Protein-protein interactions within late pre-40S ribosomes.

    Directory of Open Access Journals (Sweden)

    Melody G Campbell

    2011-01-01

    Full Text Available Ribosome assembly in eukaryotic organisms requires more than 200 assembly factors to facilitate and coordinate rRNA transcription, processing, and folding with the binding of the ribosomal proteins. Many of these assembly factors bind and dissociate at defined times giving rise to discrete assembly intermediates, some of which have been partially characterized with regards to their protein and RNA composition. Here, we have analyzed the protein-protein interactions between the seven assembly factors bound to late cytoplasmic pre-40S ribosomes using recombinant proteins in binding assays. Our data show that these factors form two modules: one comprising Enp1 and the export adaptor Ltv1 near the beak structure, and the second comprising the kinase Rio2, the nuclease Nob1, and a regulatory RNA binding protein Dim2/Pno1 on the front of the head. The GTPase-like Tsr1 and the universally conserved methylase Dim1 are also peripherally connected to this second module. Additionally, in an effort to further define the locations for these essential proteins, we have analyzed the interactions between these assembly factors and six ribosomal proteins: Rps0, Rps3, Rps5, Rps14, Rps15 and Rps29. Together, these results and previous RNA-protein crosslinking data allow us to propose a model for the binding sites of these seven assembly factors. Furthermore, our data show that the essential kinase Rio2 is located at the center of the pre-ribosomal particle and interacts, directly or indirectly, with every other assembly factor, as well as three ribosomal proteins required for cytoplasmic 40S maturation. These data suggest that Rio2 could play a central role in regulating cytoplasmic maturation steps.

  5. Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

    KAUST Repository

    Cannistraci, Carlo; Alanis Lobato, Gregorio; Ravasi, Timothy

    2013-01-01

    Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness

  6. Protein Charge and Mass Contribute to the Spatio-temporal Dynamics of Protein-Protein Interactions in a Minimal Proteome

    Science.gov (United States)

    Xu, Yu; Wang, Hong; Nussinov, Ruth; Ma, Buyong

    2013-01-01

    We constructed and simulated a ‘minimal proteome’ model using Langevin dynamics. It contains 206 essential protein types which were compiled from the literature. For comparison, we generated six proteomes with randomized concentrations. We found that the net charges and molecular weights of the proteins in the minimal genome are not random. The net charge of a protein decreases linearly with molecular weight, with small proteins being mostly positively charged and large proteins negatively charged. The protein copy numbers in the minimal genome have the tendency to maximize the number of protein-protein interactions in the network. Negatively charged proteins, which tend to have larger sizes, can provide a large collision cross-section allowing them to interact with other proteins; on the other hand, the smaller positively charged proteins could have higher diffusion speed and are more likely to collide with other proteins. Proteomes with random charge/mass populations form less stable clusters than those with experimental protein copy numbers. Our study suggests that ‘proper’ populations of negatively and positively charged proteins are important for maintaining a protein-protein interaction network in a proteome. It is interesting to note that the minimal genome model based on the charge and mass of E. coli may have a larger protein-protein interaction network than that based on the lower organism M. pneumoniae. PMID:23420643

  7. Direct measurements of protein-stabilized gold nanoparticle interactions.

    Science.gov (United States)

    Eichmann, Shannon L; Bevan, Michael A

    2010-09-21

    We report integrated video and total internal reflection microscopy measurements of protein-stabilized 110 nm Au nanoparticles confined in 280 nm gaps in physiological media. Measured potential energy profiles display quantitative agreement with Brownian dynamics simulations that include hydrodynamic interactions and camera exposure time and noise effects. Our results demonstrate agreement between measured nonspecific van der Waals and adsorbed protein interactions with theoretical potentials. Confined, lateral nanoparticle diffusivity measurements also display excellent agreement with predictions. These findings provide a basis to interrogate specific biomacromolecular interactions in similar experimental configurations and to design future improved measurement methods.

  8. A new protein-protein interaction sensor based on tripartite split-GFP association.

    Science.gov (United States)

    Cabantous, Stéphanie; Nguyen, Hau B; Pedelacq, Jean-Denis; Koraïchi, Faten; Chaudhary, Anu; Ganguly, Kumkum; Lockard, Meghan A; Favre, Gilles; Terwilliger, Thomas C; Waldo, Geoffrey S

    2013-10-04

    Monitoring protein-protein interactions in living cells is key to unraveling their roles in numerous cellular processes and various diseases. Previously described split-GFP based sensors suffer from poor folding and/or self-assembly background fluorescence. Here, we have engineered a micro-tagging system to monitor protein-protein interactions in vivo and in vitro. The assay is based on tripartite association between two 20-amino-acid-long GFP tags, GFP10 and GFP11, fused to interacting protein partners, and the complementary GFP1-9 detector. When proteins interact, GFP10 and GFP11 self-associate with GFP1-9 to reconstitute a functional GFP. Using coiled-coils and FRB/FKBP12 model systems we characterize the sensor in vitro and in Escherichia coli. We extend the studies to mammalian cells and examine the FK-506 inhibition of the rapamycin-induced association of FRB/FKBP12. The small size of these tags and their minimal effect on fusion protein behavior and solubility should enable new experiments for monitoring protein-protein association by fluorescence.

  9. Bioluminescence resonance energy transfer system for measuring dynamic protein-protein interactions in bacteria.

    Science.gov (United States)

    Cui, Boyu; Wang, Yao; Song, Yunhong; Wang, Tietao; Li, Changfu; Wei, Yahong; Luo, Zhao-Qing; Shen, Xihui

    2014-05-20

    Protein-protein interactions are important for virtually every biological process, and a number of elegant approaches have been designed to detect and evaluate such interactions. However, few of these methods allow the detection of dynamic and real-time protein-protein interactions in bacteria. Here we describe a bioluminescence resonance energy transfer (BRET) system based on the bacterial luciferase LuxAB. We found that enhanced yellow fluorescent protein (eYFP) accepts the emission from LuxAB and emits yellow fluorescence. Importantly, BRET occurred when LuxAB and eYFP were fused, respectively, to the interacting protein pair FlgM and FliA. Furthermore, we observed sirolimus (i.e., rapamycin)-inducible interactions between FRB and FKBP12 and a dose-dependent abolishment of such interactions by FK506, the ligand of FKBP12. Using this system, we showed that osmotic stress or low pH efficiently induced multimerization of the regulatory protein OmpR and that the multimerization induced by low pH can be reversed by a neutralizing agent, further indicating the usefulness of this system in the measurement of dynamic interactions. This method can be adapted to analyze dynamic protein-protein interactions and the importance of such interactions in bacterial processes such as development and pathogenicity. Real-time measurement of protein-protein interactions in prokaryotes is highly desirable for determining the roles of protein complex in the development or virulence of bacteria, but methods that allow such measurement are not available. Here we describe the development of a bioluminescence resonance energy transfer (BRET) technology that meets this need. The use of endogenous excitation light in this strategy circumvents the requirement for the sophisticated instrument demanded by standard fluorescence resonance energy transfer (FRET). Furthermore, because the LuxAB substrate decanal is membrane permeable, the assay can be performed without lysing the bacterial cells

  10. Effectively identifying compound-protein interactions by learning from positive and unlabeled examples.

    Science.gov (United States)

    Cheng, Zhanzhan; Zhou, Shuigeng; Wang, Yang; Liu, Hui; Guan, Jihong; Chen, Yi-Ping Phoebe

    2016-05-18

    Prediction of compound-protein interactions (CPIs) aims to find new compound-protein pairs where a protein is targeted by at least one compound, which is a crucial step in new drug design. Currently, a number of machine learning based methods have been developed to predict new CPIs in the literature. However, as there is not yet any publicly available set of validated negative CPIs, most existing machine learning based approaches use the unknown interactions (not validated CPIs) selected randomly as the negative examples to train classifiers for predicting new CPIs. Obviously, this is not quite reasonable and unavoidably impacts the CPI prediction performance. In this paper, we simply take the unknown CPIs as unlabeled examples, and propose a new method called PUCPI (the abbreviation of PU learning for Compound-Protein Interaction identification) that employs biased-SVM (Support Vector Machine) to predict CPIs using only positive and unlabeled examples. PU learning is a class of learning methods that learns from positive and unlabeled (PU) samples. To the best of our knowledge, this is the first work that identifies CPIs using only positive and unlabeled examples. We first collect known CPIs as positive examples and then randomly select compound-protein pairs not in the positive set as unlabeled examples. For each CPI/compound-protein pair, we extract protein domains as protein features and compound substructures as chemical features, then take the tensor product of the corresponding compound features and protein features as the feature vector of the CPI/compound-protein pair. After that, biased-SVM is employed to train classifiers on different datasets of CPIs and compound-protein pairs. Experiments over various datasets show that our method outperforms six typical classifiers, including random forest, L1- and L2-regularized logistic regression, naive Bayes, SVM and k-nearest neighbor (kNN), and three types of existing CPI prediction models. Source code, datasets and
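
    The feature construction described in this abstract, a tensor product of compound features and protein features, can be sketched as a flattened outer product. The example below is an illustration only, not the PUCPI code: the function name and the tiny binary vectors are hypothetical, and the biased-SVM training step is not shown.

```python
def tensor_product_features(compound_bits, protein_bits):
    """Flattened outer product: one entry per (substructure, domain) combination."""
    return [c * p for c in compound_bits for p in protein_bits]


# Hypothetical inputs: 3 compound substructure flags and 2 protein domain flags.
compound = [1, 0, 1]
protein = [0, 1]

# An entry is 1 only when both the substructure and the domain are present.
pair_vector = tensor_product_features(compound, protein)
```

    The resulting vector has len(compound) * len(protein) entries, so its size grows multiplicatively with the two feature sets; a sparse representation would likely be needed for realistic fingerprints and domain vocabularies.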

  11. PEPSI-Dock: a detailed data-driven protein-protein interaction potential accelerated by polar Fourier correlation.

    Science.gov (United States)

    Neveu, Emilie; Ritchie, David W; Popov, Petr; Grudinin, Sergei

    2016-09-01

    Docking prediction algorithms aim to find the native conformation of a complex of proteins from knowledge of their unbound structures. They rely on a combination of sampling and scoring methods, adapted to different scales. Polynomial Expansion of Protein Structures and Interactions for Docking (PEPSI-Dock) improves the accuracy of the first stage of the docking pipeline, which will sharpen up the final predictions. Indeed, PEPSI-Dock benefits from the precision of a very detailed data-driven model of the binding free energy used with a global and exhaustive rigid-body search space. As well as being accurate, our computations are among the fastest by virtue of the sparse representation of the pre-computed potentials and FFT-accelerated sampling techniques. Overall, this is the first demonstration of an FFT-accelerated docking method coupled with an arbitrary-shaped distance-dependent interaction potential. First, we present a novel learning process to compute data-driven distance-dependent pairwise potentials, adapted from our previous method used for rescoring of putative protein-protein binding poses. The potential coefficients are learned by combining machine-learning techniques with physically interpretable descriptors. Then, we describe the integration of the deduced potentials into an FFT-accelerated spherical sampling provided by the Hex library. Overall, on a training set of 163 heterodimers, PEPSI-Dock achieves a success rate of 91% mid-quality predictions in the top-10 solutions. On a subset of the protein docking benchmark v5, it achieves 44.4% mid-quality predictions in the top-10 solutions when starting from bound structures and 20.5% when starting from unbound structures. The method runs in 5-15 min on a modern laptop and can easily be extended to other types of interactions. https://team.inria.fr/nano-d/software/PEPSI-Dock sergei.grudinin@inria.fr. © The Author 2016. Published by Oxford University Press.

  12. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng

    2015-12-03

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is largely unavailable in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  13. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng; Hu, ShanShan; Zhang, Jun; Gao, Xin; Li, Jinyan; Xia, Junfeng; Wang, Bing

    2015-01-01

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is largely unavailable in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  14. Genome-wide analysis of protein-protein interactions and involvement of viral proteins in SARS-CoV replication.

    Directory of Open Access Journals (Sweden)

    Ji'an Pan

    Full Text Available Analyses of viral protein-protein interactions are an important step to understand viral protein functions and their underlying molecular mechanisms. In this study, we adopted a mammalian two-hybrid system to screen the genome-wide intraviral protein-protein interactions of SARS coronavirus (SARS-CoV) and thereby revealed a number of novel interactions which could be partly confirmed by in vitro biochemical assays. Three pairs of the interactions identified were detected in both directions: non-structural protein (nsp) 10 and nsp14, nsp10 and nsp16, and nsp7 and nsp8. The interactions between the multifunctional nsp10 and nsp14 or nsp16, which are the unique proteins found in the members of Nidovirales with large RNA genomes including coronaviruses and toroviruses, may have important implications for the mechanisms of replication/transcription complex assembly and functions of these viruses. Using a SARS-CoV replicon expressing a luciferase reporter under the control of a transcription regulating sequence, it has been shown that several viral proteins (N, X and SUD domains of nsp3, and nsp12) provided in trans stimulated the replicon reporter activity, indicating that these proteins may regulate coronavirus replication and transcription. Collectively, our findings provide a basis and platform for further characterization of the functions and mechanisms of coronavirus proteins.

  15. Strong Ligand-Protein Interactions Derived from Diffuse Ligand Interactions with Loose Binding Sites.

    Science.gov (United States)

    Marsh, Lorraine

    2015-01-01

    Many systems in biology rely on binding of ligands to target proteins in a single high-affinity conformation with a favorable ΔG. Alternatively, interactions of ligands with protein regions that allow diffuse binding, distributed over multiple sites and conformations, can exhibit favorable ΔG because of their higher entropy. Diffuse binding may be biologically important for multidrug transporters and carrier proteins. A fine-grained computational method for numerical integration of total binding ΔG arising from diffuse regional interaction of a ligand in multiple conformations using a Markov Chain Monte Carlo (MCMC) approach is presented. This method yields a metric that quantifies the influence on overall ligand affinity of ligand binding to multiple, distinct sites within a protein binding region. This metric is essentially a measure of dispersion in equilibrium ligand binding and depends on both the number of potential sites of interaction and the distribution of their individual predicted affinities. Analysis of test cases indicates that, for some ligand/protein pairs involving transporters and carrier proteins, diffuse binding contributes greatly to total affinity, whereas in other cases the influence is modest. This approach may be useful for studying situations where "nonspecific" interactions contribute to biological function.
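
    The idea that many weak sites can add up to a favorable total ΔG can be illustrated with the standard relation for mutually exclusive binding sites, ΔG_total = -RT ln Σ exp(-ΔG_i/RT). The sketch below sums over a short, made-up list of site free energies; it is not the MCMC integration described in the abstract, and the function name and values are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def total_binding_dg(site_dgs, temperature=298.15):
    """Combine per-site binding free energies (kcal/mol), assuming one ligand
    that can occupy any one of several mutually exclusive sites."""
    rt = R_KCAL * temperature
    return -rt * math.log(sum(math.exp(-dg / rt) for dg in site_dgs))


# One tight site versus ten equivalent weaker sites (illustrative numbers).
single_site = total_binding_dg([-6.0])
diffuse_sites = total_binding_dg([-5.0] * 10)
```

    With N equivalent sites the total is the single-site value minus RT ln N, so in this toy case ten sites of -5.0 kcal/mol bind slightly more favorably overall than one site of -6.0 kcal/mol.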

  16. A Mesoscopic Model for Protein-Protein Interactions in Solution

    OpenAIRE

    Lund, Mikael; Jönsson, Bo

    2003-01-01

    Protein self-association may be detrimental in biological systems, but can be utilized in a controlled fashion for protein crystallization. It is hence of considerable interest to understand how factors like solution conditions prevent or promote aggregation. Here we present a computational model describing interactions between protein molecules in solution. The calculations are based on a molecular description capturing the detailed structure of the protein molecule using x-ray or nuclear ma...

  17. Yeast Interacting Proteins Database: YMR280C, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available olved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensor... glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, an

  18. Yeast Interacting Proteins Database: YOR047C, YKL038W [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available racts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a...Bait description Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose senso...rs Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator of the tra

  19. Yeast Interacting Proteins Database: YFR049W, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator... (0) YOR047C STD1 Protein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sens...ors Snf3p and Rgt2p, and TATA-binding protein Spt15p; ac

  20. Development of a Model Protein Interaction Pair as a Benchmarking Tool for the Quantitative Analysis of 2-Site Protein-Protein Interactions.

    Science.gov (United States)

    Yamniuk, Aaron P; Newitt, John A; Doyle, Michael L; Arisaka, Fumio; Giannetti, Anthony M; Hensley, Preston; Myszka, David G; Schwarz, Fred P; Thomson, James A; Eisenstein, Edward

    2015-12-01

    A significant challenge in the molecular interaction field is to accurately determine the stoichiometry and stepwise binding affinity constants for macromolecules having >1 binding site. The mission of the Molecular Interactions Research Group (MIRG) of the Association of Biomolecular Resource Facilities (ABRF) is to show how biophysical technologies are used to quantitatively characterize molecular interactions, and to educate the ABRF members and scientific community on the utility and limitations of core technologies [such as biosensor, microcalorimetry, or analytic ultracentrifugation (AUC)]. In the present work, the MIRG has developed a robust model protein interaction pair consisting of a bivalent variant of the Bacillus amyloliquefaciens extracellular RNase barnase and a variant of its natural monovalent intracellular inhibitor protein barstar. It is demonstrated that this system can serve as a benchmarking tool for the quantitative analysis of 2-site protein-protein interactions. The protein interaction pair enables determination of precise binding constants for the barstar protein binding to 2 distinct sites on the bivalent barnase binding partner (termed binase), where the 2 binding sites were engineered to possess affinities that differed by 2 orders of magnitude. Multiple MIRG laboratories characterized the interaction using isothermal titration calorimetry (ITC), AUC, and surface plasmon resonance (SPR) methods to evaluate the feasibility of the system as a benchmarking model. Although general agreement was seen for the binding constants measured using solution-based ITC and AUC approaches, weaker affinity was seen for surface-based method SPR, with protein immobilization likely affecting affinity. An analysis of the results from multiple MIRG laboratories suggests that the bivalent barnase-barstar system is a suitable model for benchmarking new approaches for the quantitative characterization of complex biomolecular interactions.
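
    To see why two sites whose affinities differ by 2 orders of magnitude are useful as a benchmark, consider the textbook occupancy expression for independent sites, n = Σ L/(K_d,i + L), where L is the free ligand concentration. The sketch below assumes the two sites are independent and uses made-up dissociation constants; the abstract does not report the actual values, and the function name is hypothetical.

```python
def bound_per_molecule(free_ligand, kd_values):
    """Average number of ligands bound per receptor, assuming independent sites.

    free_ligand and kd_values must use the same concentration units.
    """
    return sum(free_ligand / (kd + free_ligand) for kd in kd_values)


# Hypothetical dissociation constants 100-fold apart (molar units).
KD_TIGHT = 1e-9
KD_WEAK = 1e-7

# Occupancy when the free ligand concentration equals the tight-site Kd.
occupancy_at_tight_kd = bound_per_molecule(1e-9, [KD_TIGHT, KD_WEAK])
```

    At this concentration the tight site is half occupied while the weak site is about 1% occupied, so the two binding steps appear as well-separated transitions in a titration.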

  1. In Silico Identification of Proteins Associated with Drug-induced Liver Injury Based on the Prediction of Drug-target Interactions.

    Science.gov (United States)

    Ivanov, Sergey; Semin, Maxim; Lagunin, Alexey; Filimonov, Dmitry; Poroikov, Vladimir

    2017-07-01

    Drug-induced liver injury (DILI) is the leading cause of acute liver failure as well as one of the major reasons for drug withdrawal from clinical trials and the market. Elucidation of molecular interactions associated with DILI may help to detect potentially hazardous pharmacological agents at the early stages of drug development. The purpose of our study is to investigate which interactions with specific human protein targets may cause DILI. Prediction of interactions with 1534 human proteins was performed for the dataset with information about 699 drugs, which were divided into three categories of DILI: severe (178 drugs), moderate (310 drugs) and without DILI (211 drugs). Based on the comparison of drug-target interactions predicted for the different drug categories and interpretation of those results using clustering, Gene Ontology, pathway and gene expression analysis, we identified 61 protein targets associated with DILI. Most of the revealed proteins were linked with hepatocyte death caused by disruption of vital cellular processes, as well as the emergence of inflammation in the liver. It was found that interaction of a drug with the identified targets is the essential molecular mechanism of severe DILI for most of the considered pharmaceuticals. Thus, pharmaceutical agents interacting with many of the identified targets may be considered as candidates for filtering out at the early stages of drug research. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Transcription Factor Functional Protein-Protein Interactions in Plant Defense Responses

    Directory of Open Access Journals (Sweden)

    Murilo S. Alves

    2014-03-01

    Full Text Available Responses to biotic stress in plants lead to dramatic reprogramming of gene expression, favoring stress responses at the expense of normal cellular functions. Transcription factors are master regulators of gene expression at the transcriptional level, and controlling the activity of these factors alters the transcriptome of the plant, leading to metabolic and phenotypic changes in response to stress. The functional analysis of interactions between transcription factors and other proteins is very important for elucidating the role of these transcriptional regulators in different signaling cascades. In this review, we present an overview of protein-protein interactions for the six major families of transcription factors involved in plant defense: basic leucine zipper containing domain proteins (bZIP), amino-acid sequence WRKYGQK (WRKY), myelocytomatosis related proteins (MYC), myeloblastosis related proteins (MYB), APETALA2/ETHYLENE-RESPONSIVE ELEMENT BINDING FACTORS (AP2/EREBP), and no apical meristem (NAM), Arabidopsis transcription activation factor (ATAF), and cup-shaped cotyledon (CUC) (NAC). We describe the interaction partners of these transcription factors as molecular responses during pathogen attack and the key components of signal transduction pathways that take place during plant defense responses. These interactions determine the activation or repression of response pathways and are crucial to understanding the regulatory networks that modulate plant defense responses.

  3. PPI finder: a mining tool for human protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Min He

    Full Text Available BACKGROUND: The exponential increase of published biomedical literature prompts the use of text mining tools to manage the information overload automatically. One of the most common applications is to mine protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools for mining PPIs from literature use co-occurrence-based approaches or rule-based approaches. Hybrid methods (frame-based approaches) that combine these two methods may have better performance in predicting PPIs. However, the predicted PPIs from these methods are rarely evaluated against known PPI databases and co-occurring terms in the Gene Ontology (GO) database. METHODOLOGY/PRINCIPAL FINDINGS: We here developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, followed by evidence in human PPI databases and shared terms in the GO database. Only 28% of the co-occurring pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms. CONCLUSIONS: PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.
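
    The co-occurrence step described in this record can be illustrated with a minimal sketch. It is not PPI Finder's implementation: the function names, the simple tokenizer, the toy abstracts and the one-entry "known" set are all assumptions made for illustration. It counts how often two protein names appear in the same abstract and then reports what fraction of the co-occurring pairs is present in a reference set.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurring_pairs(abstracts, protein_names):
    """Count, per unordered protein pair, the abstracts mentioning both names."""
    names = {name.upper() for name in protein_names}
    counts = Counter()
    for text in abstracts:
        # Crude tokenizer (assumption): split on anything that is not a letter or digit.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        present = sorted(names & tokens)
        counts.update(combinations(present, 2))
    return counts


def fraction_in_reference(pair_counts, reference_pairs):
    """Fraction of distinct co-occurring pairs that also appear in a reference set."""
    if not pair_counts:
        return 0.0
    hits = sum(1 for pair in pair_counts if pair in reference_pairs)
    return hits / len(pair_counts)


# Toy input, invented for illustration only.
toy_abstracts = [
    "FOS binds JUN to form AP-1.",
    "JUN and FOS were co-expressed with MYC.",
    "MYC levels were unchanged.",
]
pair_counts = cooccurring_pairs(toy_abstracts, ["FOS", "JUN", "MYC"])
known_pairs = {("FOS", "JUN")}  # hypothetical stand-in for a PPI database
coverage = fraction_in_reference(pair_counts, known_pairs)
```

    A real pipeline would need gene-name normalization and interaction-word detection, which this sketch leaves out.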

  4. Predicting the subcellular localization of viral proteins within a mammalian host cell

    Directory of Open Access Journals (Sweden)

    Thomas DY

    2006-04-01

    Full Text Available Abstract Background The bioinformatic prediction of protein subcellular localization has been extensively studied for prokaryotic and eukaryotic organisms. However, this is not the case for viruses whose proteins are often involved in extensive interactions at various subcellular localizations with host proteins. Results Here, we investigate the extent of utilization of human cellular localization mechanisms by viral proteins and we demonstrate that appropriate eukaryotic subcellular localization predictors can be used to predict viral protein localization within the host cell. Conclusion Such predictions provide a method to rapidly annotate viral proteomes with subcellular localization information. They are likely to have widespread applications both in the study of the functions of viral proteins in the host cell and in the design of antiviral drugs.

  5. Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

    Directory of Open Access Journals (Sweden)

    Huiying Zhao

    Full Text Available As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to the human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.
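
    The evaluation figures quoted in this record (sensitivity, precision, MCC) all derive from a binary confusion matrix. The sketch below shows the standard formulas. The confusion-matrix counts are not taken from the paper: they are illustrative values chosen so that 179 positives and 3797 negatives roughly reproduce the reported rates, and the function name is hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)    # positive predictive value (PPV)
    npv = tn / (tn + fn)          # negative predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "mcc": mcc,
    }


# Illustrative counts (assumption): 116 + 63 = 179 binders, 7 + 3790 = 3797 non-binders.
metrics = binary_metrics(tp=116, fp=7, tn=3790, fn=63)
```

    With these illustrative counts the function gives a sensitivity of about 0.65, a precision of about 0.94 and an MCC of about 0.77, which shows how a high MCC can coexist with moderate sensitivity when false positives are rare.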

  6. NatalieQ: A web server for protein-protein interaction network querying

    NARCIS (Netherlands)

    El-Kebir, M.; Brandt, B.W.; Heringa, J.; Klau, G.W.

    2014-01-01

    Background Molecular interactions need to be taken into account to adequately model the complex behavior of biological systems. These interactions are captured by various types of biological networks, such as metabolic, gene-regulatory, signal transduction and protein-protein interaction networks.

  7. Potato leafroll virus structural proteins manipulate overlapping, yet distinct protein interaction networks during infection.

    Science.gov (United States)

    DeBlasio, Stacy L; Johnson, Richard; Sweeney, Michelle M; Karasev, Alexander; Gray, Stewart M; MacCoss, Michael J; Cilia, Michelle

    2015-06-01

    Potato leafroll virus (PLRV) produces a readthrough protein (RTP) via translational readthrough of the coat protein amber stop codon. The RTP functions as a structural component of the virion and as a nonincorporated protein in concert with numerous insect and plant proteins to regulate virus movement/transmission and tissue tropism. Affinity purification coupled to quantitative MS was used to generate protein interaction networks for a PLRV mutant that is unable to produce the read through domain (RTD) and compared to the known wild-type PLRV protein interaction network. By quantifying differences in the protein interaction networks, we identified four distinct classes of PLRV-plant interactions: those plant and nonstructural viral proteins interacting with assembled coat protein (category I); plant proteins in complex with both coat protein and RTD (category II); plant proteins in complex with the RTD (category III); and plant proteins that had higher affinity for virions lacking the RTD (category IV). Proteins identified as interacting with the RTD are potential candidates for regulating viral processes that are mediated by the RTP such as phloem retention and systemic movement and can potentially be useful targets for the development of strategies to prevent infection and/or viral transmission of Luteoviridae species that infect important crop species. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Hepatitis C Virus Protein Interaction Network Analysis Based on Hepatocellular Carcinoma.

    Directory of Open Access Journals (Sweden)

    Yuewen Han

    Full Text Available Epidemiological studies have validated the association between hepatitis C virus (HCV) infection and hepatocellular carcinoma (HCC). An increasing number of studies show that protein-protein interactions (PPIs) between HCV proteins and host proteins play a vital role in infection and mediate HCC progression. In this work, we collected all published interactions between HCV and human proteins, which include 455 unique human proteins participating in 524 HCV-human interactions. Then, we constructed the HCV-human and HCV-HCC protein interaction networks, which display the biological knowledge regarding the mechanism of HCV pathogenesis, particularly with respect to pathogenesis of HCC. Through in-depth analysis of the HCV-HCC interaction network, we found that interactors are enriched in the JAK/STAT, p53, MAPK, TNF, Wnt, and cell cycle pathways. Using a random walk with restart algorithm, we predicted the importance of each protein in the HCV-HCC network and found that AKT1 may play a key role in HCC progression. Moreover, we found that NS5A promotes HCC cell proliferation and metastasis by activating the AKT/GSK3β/β-catenin pathway. This work provides a basis for a detailed map tracking new cellular interactions of HCV and identifying potential targets for HCV-related hepatocellular carcinoma treatment.
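
    The random walk with restart mentioned in this record can be sketched in a few lines. This is a generic version of the algorithm, not the authors' code: the restart probability of 0.3, the convergence tolerance, the function name and the four-node toy graph with placeholder labels are all assumptions. At each step the walker follows a random edge with probability 1 - restart or jumps back to a seed node with probability restart; the stationary probabilities rank nodes by proximity to the seeds.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Stationary visiting probabilities of a walker that restarts at the seed nodes.

    adjacency: dict mapping each node to the set of its neighbours (undirected,
               so every edge must be listed in both directions).
    seeds:     non-empty collection of nodes the walker restarts from.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * prob[u] * start[n]
                continue
            share = (1.0 - restart) * prob[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Toy undirected network with placeholder node labels (not real proteins).
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

    In this toy graph the seed ranks first and the leaf node D, reachable only through C, ranks last, which is the proximity behaviour the method relies on.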

  9. Analysis of protein-protein interaction networks by means of annotated graph mining algorithms

    NARCIS (Netherlands)

    Rahmani, Hossein

    2012-01-01

    This thesis discusses solutions to several open problems in Protein-Protein Interaction (PPI) networks with the aid of Knowledge Discovery. PPI networks are usually represented as undirected graphs, with nodes corresponding to proteins and edges representing interactions among protein pairs. A large

  10. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  11. Building blocks for protein interaction devices

    Science.gov (United States)

    Grünberg, Raik; Ferrar, Tony S.; van der Sloot, Almer M.; Constante, Marco; Serrano, Luis

    2010-01-01

    Here, we propose a framework for the design of synthetic protein networks from modular protein–protein or protein–peptide interactions and provide a starter toolkit of protein building blocks. Our proof of concept experiments outline a general work flow for part–based protein systems engineering. We streamlined the iterative BioBrick cloning protocol and assembled 25 synthetic multidomain proteins each from seven standardized DNA fragments. A systematic screen revealed two main factors controlling protein expression in Escherichia coli: obstruction of translation initiation by mRNA secondary structure or toxicity of individual domains. Eventually, 13 proteins were purified for further characterization. Starting from well-established biotechnological tools, two general–purpose interaction input and two readout devices were built and characterized in vitro. Constitutive interaction input was achieved with a pair of synthetic leucine zippers. The second interaction was drug-controlled utilizing the rapamycin-induced binding of FRB(T2098L) to FKBP12. The interaction kinetics of both devices were analyzed by surface plasmon resonance. Readout was based on Förster resonance energy transfer between fluorescent proteins and was quantified for various combinations of input and output devices. Our results demonstrate the feasibility of parts-based protein synthetic biology. Additionally, we identify future challenges and limitations of modular design along with approaches to address them. PMID:20215443

  12. Manipulating fatty acid biosynthesis in microalgae for biofuel through protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Jillian L Blatti

    Full Text Available Microalgae are a promising feedstock for renewable fuels, and algal metabolic engineering can lead to crop improvement, thus accelerating the development of commercially viable biodiesel production from algae biomass. We demonstrate that protein-protein interactions between the fatty acid acyl carrier protein (ACP) and thioesterase (TE) govern fatty acid hydrolysis within the algal chloroplast. Using green microalga Chlamydomonas reinhardtii (Cr) as a model, a structural simulation of docking CrACP to CrTE identifies a protein-protein recognition surface between the two domains. A virtual screen reveals plant TEs with similar in silico binding to CrACP. Employing an activity-based crosslinking probe designed to selectively trap transient protein-protein interactions between the TE and ACP, we demonstrate in vitro that CrTE must functionally interact with CrACP to release fatty acids, while TEs of vascular plants show no mechanistic crosslinking to CrACP. This is recapitulated in vivo, where overproduction of the endogenous CrTE increased levels of short-chain fatty acids and engineering plant TEs into the C. reinhardtii chloroplast did not alter the fatty acid profile. These findings highlight the critical role of protein-protein interactions in manipulating fatty acid biosynthesis for algae biofuel engineering as illuminated by activity-based probes.

  13. Uncovering Viral Protein-Protein Interactions and their Role in Arenavirus Life Cycle

    Directory of Open Access Journals (Sweden)

    Nora López

    2012-09-01

    Full Text Available The Arenaviridae family includes widely distributed pathogens that cause severe hemorrhagic fever in humans. Replication and packaging of their single-stranded RNA genome involve RNA recognition by viral proteins and a number of key protein-protein interactions. Viral RNA synthesis is directed by the virus-encoded RNA-dependent RNA polymerase (L protein) and requires viral RNA encapsidation by the Nucleoprotein. In addition to the role that the interaction between L and the Nucleoprotein may have in the replication process, polymerase activity appears to be modulated by the association between L and the small multifunctional Z protein. Z is also a structural component of the virions that plays an essential role in viral morphogenesis. Indeed, interaction of the Z protein with the Nucleoprotein is critical for genome packaging. Furthermore, current evidence suggests that binding between Z and the viral envelope glycoprotein complex is required for virion infectivity, and that Z homo-oligomerization is an essential step for particle assembly and budding. Efforts to understand the molecular basis of arenavirus life cycle have revealed important details on these viral protein-protein interactions that will be reviewed in this article.

  14. Protein-Protein Interaction Reagents | Office of Cancer Genomics

    Science.gov (United States)

    The CTD2 Center at Emory University has a library of genes used to study protein-protein interactions in mammalian cells. These genes are cloned in different mammalian expression vectors. A list of available cancer-associated genes can be accessed below. Emory_CTD^2_PPI_Reagents.xlsx Contact: Haian Fu

  15. Yeast Interacting Proteins Database: YGL127C, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ith protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regula...rotein involved in control of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors

  16. Relative quantification of protein-protein interactions using a dual luciferase reporter pull-down assay system.

    Directory of Open Access Journals (Sweden)

    Shuaizheng Jia

    Full Text Available The identification and quantitative analysis of protein-protein interactions are essential to the functional characterization of proteins in the post-proteomics era. The methods currently available are generally time-consuming, technically complicated, insensitive and/or semi-quantitative. The lack of simple, sensitive approaches to precisely quantify protein-protein interactions still prevents our understanding of the functions of many proteins. Here, we develop a novel dual luciferase reporter pull-down assay by combining a biotinylated Firefly luciferase pull-down assay with a dual luciferase reporter assay. The biotinylated Firefly luciferase-tagged protein enables rapid and efficient isolation of a putative Renilla luciferase-tagged binding protein from a relatively small amount of sample. Both of these proteins can be quantitatively detected using the dual luciferase reporter assay system. Protein-protein interactions, including Fos-Jun located in the nucleus; MAVS-TRAF3 in cytoplasm; inducible IRF3 dimerization; viral protein-regulated interactions, such as MAVS-MAVS and MAVS-TRAF3; IRF3 dimerization; and protein interaction domain mapping, are studied using this novel assay system. Herein, we demonstrate that this dual luciferase reporter pull-down assay enables the quantification of the relative amounts of interacting proteins that bind to streptavidin-coupled beads for protein purification. This study provides a simple, rapid, sensitive, and efficient approach to identify and quantify relative protein-protein interactions. Importantly, the dual luciferase reporter pull-down method will facilitate the functional determination of proteins.

  17. NMR Studies of Protein Hydration and Protein-Ligand Interactions

    Science.gov (United States)

    Chong, Yuan

    Water on the surface of a protein is called hydration water. Hydration water is known to play a crucial role in a variety of biological processes including protein folding, enzymatic activation, and drug binding. Although the significance of hydration water has been recognized, the underlying mechanism remains far from being understood. This dissertation employs a unique in-situ nuclear magnetic resonance (NMR) technique to study the mechanism of protein hydration and the role of hydration in alcohol-protein interactions. Water isotherms in proteins are measured at different temperatures via the in-situ NMR technique. Water is found to interact differently with hydrophilic and hydrophobic groups on the protein. Water adsorption on hydrophilic groups is hardly affected by the temperature, while water adsorption on hydrophobic groups strongly depends on the temperature around 10 °C, below which the adsorption is substantially reduced. This effect is induced by the dramatic decrease in the protein flexibility below 10 °C. Furthermore, nanosecond to microsecond protein dynamics and the free energy, enthalpy, and entropy of protein hydration are studied as a function of hydration level and temperature. A crossover at 10 °C in protein dynamics and thermodynamics is revealed. The effect of water at hydrophilic groups on protein dynamics and thermodynamics shows little temperature dependence, whereas water at hydrophobic groups has a stronger effect above 10 °C. In addition, I investigate the role of water in alcohol binding to the protein using the in-situ NMR detection. The isotherms of alcohols are first measured on dry proteins, then on proteins with a series of controlled hydration levels. The free energy, enthalpy, and entropy of alcohol binding are also determined. Two distinct types of alcohol binding are identified. On the one hand, alcohols can directly bind to a few specific sites on the protein. This type of binding is independent of temperature and can be

  18. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    Science.gov (United States)

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention, mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparative purposes, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45. In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in

  19. Prediction of the anti-inflammatory mechanisms of curcumin by module-based protein interaction network analysis

    Directory of Open Access Journals (Sweden)

    Yanxiong Gan

    2015-11-01

    Full Text Available Curcumin, the medically active component from Curcuma longa (Turmeric), is widely used to treat inflammatory diseases. Protein interaction network (PIN) analysis was used to predict its mechanisms of molecular action. Targets of curcumin were obtained based on ChEMBL and STITCH databases. Protein–protein interactions (PPIs) were extracted from the String database. The PIN of curcumin was constructed by Cytoscape and the function modules identified by gene ontology (GO) enrichment analysis based on molecular complex detection (MCODE). A PIN of curcumin with 482 nodes and 1688 interactions was constructed, which has scale-free, small world and modular properties. Based on analysis of these function modules, the mechanism of curcumin is proposed. Two modules were found to be intimately associated with inflammation. With function modules analysis, the anti-inflammatory effects of curcumin were related to SMAD, ERG and mediation by the TLR family. TLR9 may be a potential target of curcumin to treat inflammation.

  20. Parallel force assay for protein-protein interactions.

    Science.gov (United States)

    Aschenbrenner, Daniela; Pippig, Diana A; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay.

  1. Combined chemical shift changes and amino acid specific chemical shift mapping of protein-protein interactions

    Energy Technology Data Exchange (ETDEWEB)

    Schumann, Frank H.; Riepl, Hubert [University of Regensburg, Institute of Biophysics and Physical Biochemistry (Germany); Maurer, Till [Boehringer Ingelheim Pharma GmbH and Co. KG, Analytical Sciences Department (Germany); Gronwald, Wolfram [University of Regensburg, Institute of Biophysics and Physical Biochemistry (Germany); Neidig, Klaus-Peter [Bruker BioSpin GmbH, Software Department (Germany); Kalbitzer, Hans Robert [University of Regensburg, Institute of Biophysics and Physical Biochemistry (Germany)], E-mail: hans-robert.kalbitzer@biologie.uni-regensburg.de

    2007-12-15

    Protein-protein interactions are often studied by chemical shift mapping using solution NMR spectroscopy. When heteronuclear data are available the interaction interface is usually predicted by combining the chemical shift changes of different nuclei to a single quantity, the combined chemical shift perturbation Δδ_comb. In this paper different procedures (published and non-published) to calculate Δδ_comb are examined that include a variety of different functional forms and weighting factors for each nucleus. The predictive power of all shift mapping methods depends on the magnitude of the overlap of the chemical shift distributions of interacting and non-interacting residues and the cut-off criterion used. In general, the quality of the prediction on the basis of chemical shift changes alone is rather unsatisfactory but the combination of chemical shift changes on the basis of the Hamming or the Euclidean distance can improve the result. The corrected standard deviation to zero of the combined chemical shift changes can provide a reasonable cut-off criterion. As we show, combined chemical shifts can also be applied for a more reliable quantitative evaluation of titration data.
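
    The two distance-based combinations named in this record can be written down directly. The sketch below is an illustration under stated assumptions, not the paper's procedure: the weighting factors (1.0 for 1H, 0.1 for 15N), the per-residue shift changes and the function names are invented, and the cut-off uses a plain standard deviation to zero rather than the corrected variant the abstract refers to, whose exact definition is not given here.

```python
import math


def combined_shift(deltas, weights, method="euclidean"):
    """Combine per-nucleus chemical shift changes (ppm) into one value per residue."""
    weighted = [w * d for w, d in zip(weights, deltas)]
    if method == "euclidean":
        return math.sqrt(sum(x * x for x in weighted))
    if method == "hamming":
        return sum(abs(x) for x in weighted)
    raise ValueError(f"unknown method: {method}")


def std_to_zero(values):
    """Root mean square of the values, i.e. a standard deviation taken about zero."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical (1H, 15N) shift changes in ppm for four residues.
weights = (1.0, 0.1)  # illustrative weighting factors, not the paper's values
shift_changes = {
    "res10": (0.03, 0.4),
    "res11": (0.00, 0.1),
    "res12": (0.01, 0.0),
    "res13": (0.12, 0.9),
}
combined = {res: combined_shift(d, weights) for res, d in shift_changes.items()}
cutoff = std_to_zero(list(combined.values()))
flagged = sorted(res for res, value in combined.items() if value > cutoff)
```

    Residues whose combined value exceeds the cut-off are the candidates for the interaction interface; how sharply they separate from the rest depends on the overlap of the two shift distributions discussed in the abstract.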

  2. ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

    International Nuclear Information System (INIS)

    Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

    2007-01-01

    The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multiple user projects (saving the results on the server). Additionally the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors

  3. Kinome signaling through regulated protein-protein interactions in normal and cancer cells.

    Science.gov (United States)

    Pawson, Tony; Kofler, Michael

    2009-04-01

    The flow of molecular information through normal and oncogenic signaling pathways frequently depends on protein phosphorylation, mediated by specific kinases, and the selective binding of the resulting phosphorylation sites to interaction domains present on downstream targets. This physical and functional interplay of catalytic and interaction domains can be clearly seen in cytoplasmic tyrosine kinases such as Src, Abl, Fes, and ZAP-70. Although the kinase and SH2 domains of these proteins possess similar intrinsic properties of phosphorylating tyrosine residues or binding phosphotyrosine sites, they also undergo intramolecular interactions when linked together, in a fashion that varies from protein to protein. These cooperative interactions can have diverse effects on substrate recognition and kinase activity, and provide a variety of mechanisms to link the stimulation of catalytic activity to substrate recognition. Taken together, these data have suggested how protein kinases, and the signaling pathways in which they are embedded, can evolve complex properties through the stepwise linkage of domains within single polypeptides or multi-protein assemblies.

  4. Predicting protein structures with a multiplayer online game.

    Science.gov (United States)

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

    2010-08-05

    People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

  5. Cost Function Network-based Design of Protein-Protein Interactions: predicting changes in binding affinity.

    Science.gov (United States)

    Viricel, Clément; de Givry, Simon; Schiex, Thomas; Barbe, Sophie

    2018-02-20

    Accurate and economic methods to predict change in protein binding free energy upon mutation are imperative to accelerate the design of proteins for a wide range of applications. Free energy is defined by enthalpic and entropic contributions. Following recent progress in Artificial Intelligence-based algorithms for guaranteed NP-hard energy optimization and partition function computation, it becomes possible to quickly compute minimum energy conformations and to reliably estimate the entropic contribution of side-chains in the change of free energy of large protein interfaces. Using guaranteed Cost Function Network algorithms, Rosetta energy functions and Dunbrack's rotamer library, we developed and assessed EasyE and JayZ, two methods for binding affinity estimation that ignore or include conformational entropic contributions on a large benchmark of binding affinity experimental measures. While both approaches outperform most established tools, we observe that side-chain conformational entropy brings little or no improvement on most systems but becomes crucial in some rare cases. Availability: as open-source Python/C++ code at sourcesup.renater.fr/projects/easy-jayz. Contact: thomas.schiex@inra.fr and sophie.barbe@insa-toulouse.fr. Supplementary data are available at Bioinformatics online.

  6. Dendrimer-protein interactions versus dendrimer-based nanomedicine.

    Science.gov (United States)

    Shcharbin, Dzmitry; Shcharbina, Natallia; Dzmitruk, Volha; Pedziwiatr-Werbicka, Elzbieta; Ionov, Maksim; Mignani, Serge; de la Mata, F Javier; Gómez, Rafael; Muñoz-Fernández, Maria Angeles; Majoral, Jean-Pierre; Bryszewska, Maria

    2017-04-01

    Dendrimers are hyperbranched polymers belonging to the huge class of nanomedical devices. Their wide application in biology and medicine requires understanding of the fundamental mechanisms of their interactions with biological systems. Summarizing, electrostatic force plays the predominant role in dendrimer-protein interactions, especially with charged dendrimers. Other kinds of interactions have been proven, such as H-bonding, van der Waals forces, and even hydrophobic interactions. These interactions depend on the characteristics of both participants: flexibility and surface charge of a dendrimer, rigidity of protein structure and the localization of charged amino acids at its surface. pH and ionic strength of solutions can significantly modulate interactions. Ligands and cofactors attached to a protein can also change dendrimer-protein interactions. Binding of dendrimers to a protein can change its secondary structure, conformation, intramolecular mobility and functional activity. However, this strongly depends on rigidity versus flexibility of a protein's structure. In addition, the potential applications of dendrimers to nanomedicine are reviewed in relation to dendrimer-protein interactions. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Cell penetrating peptides to dissect host-pathogen protein-protein interactions in Theileria -transformed leukocytes

    KAUST Repository

    Haidar, Malak; de Laté, Perle Latré; Kennedy, Eileen J.; Langsley, Gordon

    2017-01-01

    One powerful application of cell penetrating peptides is the delivery into cells of molecules that function as specific competitors or inhibitors of protein-protein interactions. Ablating defined protein-protein interactions is a refined way

  8. Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2007-10-01

    Full Text Available Abstract Background In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three-color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
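
    The idea of connecting a seed list through intermediate nodes can be illustrated with a short sketch. This is not the Genes2Networks implementation: it assumes, for illustration only, that seeds are joined by breadth-first shortest paths of bounded length, and the names `shortest_path`, `connect_seeds`, `max_len` and the toy node labels are all hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, source, target, max_len):
    """Return one shortest path (list of nodes) with at most max_len edges, or None."""
    parent = {source: None}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        if dist == max_len:
            continue  # do not expand beyond the length limit
        for neighbor in sorted(adj.get(node, ())):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append((neighbor, dist + 1))
    return None


def connect_seeds(adj, seeds, max_len=2):
    """Collect non-seed nodes lying on bounded shortest paths between seed pairs."""
    intermediates = set()
    for a, b in combinations(seeds, 2):
        path = shortest_path(adj, a, b, max_len)
        if path:
            intermediates.update(set(path) - set(seeds))
    return intermediates


# Hypothetical toy network: S1 - X - S2 - Y - Z - S3
toy_adj = {
    "S1": {"X"}, "X": {"S1", "S2"}, "S2": {"X", "Y"},
    "Y": {"S2", "Z"}, "Z": {"Y", "S3"}, "S3": {"Z"},
}
toy_seeds = ["S1", "S2", "S3"]
short_links = connect_seeds(toy_adj, toy_seeds, max_len=2)
long_links = connect_seeds(toy_adj, toy_seeds, max_len=3)
```

    Raising `max_len` pulls in more intermediates at the cost of more speculative connections, which is why a statistical filter on the intermediates, as the abstract describes, would still be needed on real data.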

  9. Characterization of host proteins interacting with the lymphocytic choriomeningitis virus L protein.

    Science.gov (United States)

    Khamina, Kseniya; Lercher, Alexander; Caldera, Michael; Schliehe, Christopher; Vilagos, Bojan; Sahin, Mehmet; Kosack, Lindsay; Bhattacharya, Anannya; Májek, Peter; Stukalov, Alexey; Sacco, Roberto; James, Leo C; Pinschewer, Daniel D; Bennett, Keiryn L; Menche, Jörg; Bergthaler, Andreas

    2017-12-01

    RNA-dependent RNA polymerases (RdRps) play a key role in the life cycle of RNA viruses and impact their immunobiology. The arenavirus lymphocytic choriomeningitis virus (LCMV) strain Clone 13 provides a benchmark model for studying chronic infection. A major genetic determinant for its ability to persist maps to a single amino acid exchange in the viral L protein, which exhibits RdRp activity, yet its functional consequences remain elusive. To unravel the L protein interactions with the host proteome, we engineered infectious L protein-tagged LCMV virions by reverse genetics. A subsequent mass-spectrometric analysis of L protein pulldowns from infected human cells revealed a comprehensive network of interacting host proteins. The obtained LCMV L protein interactome was bioinformatically integrated with known host protein interactors of RdRps from other RNA viruses, emphasizing interconnected modules of human proteins. Functional characterization of selected interactors highlighted proviral (DDX3X) as well as antiviral (NKRF, TRIM21) host factors. To corroborate these findings, we infected Trim21-/- mice with LCMV and found impaired virus control in chronic infection. These results provide insights into the complex interactions of the arenavirus LCMV and other viral RdRps with the host proteome and contribute to a better molecular understanding of how chronic viruses interact with their host.

  10. Force spectroscopy studies on protein-ligand interactions: a single protein mechanics perspective.

    Science.gov (United States)

    Hu, Xiaotang; Li, Hongbin

    2014-10-01

    Protein-ligand interactions are ubiquitous and play important roles in almost every biological process. The direct elucidation of the thermodynamic, structural and functional consequences of protein-ligand interactions is thus of critical importance to decipher the mechanism underlying these biological processes. A toolbox containing a variety of powerful techniques has been developed to quantitatively study protein-ligand interactions in vitro as well as in living systems. The development of atomic force microscopy-based single molecule force spectroscopy techniques has expanded this toolbox and made it possible to directly probe the mechanical consequence of ligand binding on proteins. Many recent experiments have revealed how ligand binding affects the mechanical stability and mechanical unfolding dynamics of proteins, and provided mechanistic understanding on these effects. The enhancement effect of mechanical stability by ligand binding has been used to help tune the mechanical stability of proteins in a rational manner and develop novel functional binding assays for protein-ligand interactions. Single molecule force spectroscopy studies have started to shed new lights on the structural and functional consequence of ligand binding on proteins that bear force under their biological settings. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  11. Context-specific protein network miner - an online system for exploring context-specific protein interaction networks from the literature

    KAUST Repository

    Chowdhary, Rajesh

    2012-04-06

    Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/. © 2012 Chowdhary et al.

  12. Context-specific protein network miner - an online system for exploring context-specific protein interaction networks from the literature

    KAUST Repository

    Chowdhary, Rajesh; Tan, Sin Lam; Zhang, Jinfeng; Karnik, Shreyas; Bajic, Vladimir B.; Liu, Jun S.

    2012-01-01

    Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/. © 2012 Chowdhary et al.

  13. Yeast Interacting Proteins Database: YOR358W, YOR047C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; act...rotein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt15p; acts as a regulator o

  14. Prediction of protein post-translational modifications: main trends and methods

    Science.gov (United States)

    Sobolev, B. N.; Veselovsky, A. V.; Poroikov, V. V.

    2014-02-01

    The review summarizes the main trends in the development of methods for the prediction of protein post-translational modifications (PTMs) by considering the three most common types of PTMs: phosphorylation, acetylation and glycosylation. Considerable attention is given to general characteristics of regulatory interactions associated with PTMs. Different approaches to the prediction of PTMs are analyzed. Most of the methods are based only on the analysis of the neighbouring environment of modification sites. The related software is characterized by relatively low accuracy of PTM predictions, which may be due both to the incompleteness of training data and the features of PTM regulation. Advantages and limitations of the phylogenetic approach are considered. The prediction of PTMs using data on regulatory interactions, including the modular organization of interacting proteins, is a promising field, provided that more carefully selected training data are used. The bibliography includes 145 references.

  15. Interaction of Proteins Identified in Human Thyroid Cells

    Science.gov (United States)

    Pietsch, Jessica; Riwaldt, Stefan; Bauer, Johann; Sickmann, Albert; Weber, Gerhard; Grosse, Jirka; Infanger, Manfred; Eilles, Christoph; Grimm, Daniela

    2013-01-01

    Influence of gravity forces on the regulation of protein expression by healthy and malignant thyroid cells was studied with the aim to identify protein interactions. Western blot analyses of a limited number of proteins suggested a time-dependent regulation of protein expression by simulated microgravity. After applying free flow isoelectric focusing and mass spectrometry to search for differently expressed proteins by thyroid cells exposed to simulated microgravity for three days, a considerable number of candidates for gravi-sensitive proteins were detected. In order to show how proteins sensitive to microgravity could directly influence other proteins, we investigated all polypeptide chains identified with Mascot scores above 100, looking for groups of interacting proteins. Hence, UniProtKB entry numbers of all detected proteins were entered into the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and processed. The program indicated that we had detected various groups of interacting proteins in each of the three cell lines studied. The major groups of interacting proteins play a role in pathways of carbohydrate and protein metabolism, regulation of cell growth and cell membrane structuring. Analyzing these groups, networks of interaction could be established which show how a punctual influence of simulated microgravity may propagate via various members of interaction chains. PMID:23303277

  16. Interaction of Proteins Identified in Human Thyroid Cells

    Directory of Open Access Journals (Sweden)

    Jessica Pietsch

    2013-01-01

    Full Text Available Influence of gravity forces on the regulation of protein expression by healthy and malignant thyroid cells was studied with the aim to identify protein interactions. Western blot analyses of a limited number of proteins suggested a time-dependent regulation of protein expression by simulated microgravity. After applying free flow isoelectric focusing and mass spectrometry to search for differently expressed proteins by thyroid cells exposed to simulated microgravity for three days, a considerable number of candidates for gravi-sensitive proteins were detected. In order to show how proteins sensitive to microgravity could directly influence other proteins, we investigated all polypeptide chains identified with Mascot scores above 100, looking for groups of interacting proteins. Hence, UniProtKB entry numbers of all detected proteins were entered into the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and processed. The program indicated that we had detected various groups of interacting proteins in each of the three cell lines studied. The major groups of interacting proteins play a role in pathways of carbohydrate and protein metabolism, regulation of cell growth and cell membrane structuring. Analyzing these groups, networks of interaction could be established which show how a punctual influence of simulated microgravity may propagate via various members of interaction chains.

  17. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.; Dimitrakopoulos, Christos M.; Likothanassis, Spiridon D.; Kleftogiannis, Dimitrios A.; Moschopoulos, Charalampos N.; Alexakos, Christos; Papadimitriou, Stergios; Mavroudi, Seferina P.

    2013-01-01

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about

  18. HinT proteins and their putative interaction partners in Mollicutes and Chlamydiaceae

    Directory of Open Access Journals (Sweden)

    Hegemann Johannes H

    2005-05-01

    Full Text Available Background HinT proteins are found in prokaryotes and eukaryotes and belong to the superfamily of HIT proteins, which are characterized by a histidine-triad sequence motif. While the eukaryotic variants hydrolyze AMP derivatives and modulate transcription, the function of prokaryotic HinT proteins is less clearly defined. In Mycoplasma hominis, HinT is concomitantly expressed with the proteins P60 and P80, two domains of a surface-exposed membrane complex, and in addition interacts with the P80 moiety. Results A cluster of hitABL genes, similar to that of M. hominis, was found in M. pulmonis, M. mycoides subspecies mycoides SC, M. mobile and Mesoplasma florum. RT-PCR analyses provided evidence that the P80, P60 and HinT homologues of M. pulmonis were polycistronically organized, suggesting a genetic and physical interaction between the proteins encoded by these genes in these species. While the hit loci of M. pneumoniae and M. genitalium encoded, in addition to HinT, a protein with several transmembrane segments, the hit locus of Ureaplasma parvum encoded a pore-forming protein, UU270, a P60 homologue, UU271, HinT, UU272, and a membrane protein of unknown function, UU273. Although a full-length mRNA spanning the four genes was not detected, amplification of all intergenic regions from the center of UU270 to the end of UU273 by RT-PCR may be indicative of a common, but unstable mRNA. In Chlamydiaceae the hit gene is flanked upstream by a gene predicted to encode a metal-dependent hydrolase and downstream by a gene putatively encoding a protein with ARM-repeats, which are known to be involved in protein-protein interactions. In RT-PCR analyses of C. pneumoniae, only regions comprising two genes, Cp265/Cp266 and Cp266/Cp267, could be amplified. In contrast to this, in vivo interaction analysis using the yeast two-hybrid system and in vitro immune co-precipitation revealed an interaction between Cp267, which contains the ARM repeats, Cp265, the

  19. Protein complex prediction via dense subgraphs and false positive analysis.

    Directory of Open Access Journals (Sweden)

    Cecilia Hernandez

    Full Text Available Many proteins work together with others in groups called complexes in order to achieve a specific function. Discovering protein complexes is important for understanding biological processes and predicting protein functions in living organisms. Large-scale, high-throughput techniques have made it possible to compile protein-protein interaction networks (PPI networks), which have been used in several computational approaches for detecting protein complexes. Those predictions might guide future biological experimental research. Some approaches are topology-based, where highly connected proteins are predicted to be complexes; some propose different clustering algorithms using partitioning or overlaps among clusters for networks modeled with unweighted or weighted graphs; and others use density of clusters and information based on protein functionality. However, some schemes still require much processing time, or the quality of their results can be improved. Furthermore, most of the results obtained with computational tools are not accompanied by an analysis of false positives. We propose an effective and efficient mining algorithm for discovering highly connected subgraphs, which is our base for defining protein complexes. Our representation is based on transforming the PPI network into a directed acyclic graph that reduces the number of represented edges and the search space for discovering subgraphs. Our approach considers weighted and unweighted PPI networks. We compare our best alternative using PPI networks from Saccharomyces cerevisiae (yeast) and Homo sapiens (human) with state-of-the-art approaches in terms of clustering, biological metrics and execution times, as well as three gold standards for yeast and two for human. Furthermore, we analyze false positive predicted complexes by searching the PDBe (Protein Data Bank in Europe) database in order to identify matching protein complexes that have been purified and structurally characterized. Our analysis shows

  20. A Structural Perspective on the Modulation of Protein-Protein Interactions with Small Molecules.

    Science.gov (United States)

    Demirel, Habibe Cansu; Dogan, Tunca; Tuncbag, Nurcan

    2018-05-31

    Protein-protein interactions (PPIs) are key components in many cellular processes including signaling pathways, enzymatic reactions and epigenetic regulation. Abnormal interactions of some proteins may be pathogenic and cause various disorders including cancer and neurodegenerative diseases. Although inhibiting PPIs with small molecules is a challenging task, it has gained increasing interest because of its strong potential for drug discovery and design. Knowledge of the interface, as well as of the structural and chemical characteristics of PPIs and their roles in cellular pathways, is necessary for a rational design of small molecules to modulate PPIs. In this study, we review the recent progress in the field and detail the physicochemical properties of PPIs including binding hot spots with a focus on structural methods. Then, we review recent approaches for structural prediction of PPIs. Finally, we revisit the concept of targeting PPIs from a systems biology perspective and we refer to the non-structural approaches, usually employed when structural information is not available. Copyright© Bentham Science Publishers.

  1. Protein-Protein Interactions (PPI) reagents: | Office of Cancer Genomics

    Science.gov (United States)

    The CTD2 Center at Emory University has a library of genes used to study protein-protein interactions in mammalian cells. These genes are cloned in different mammalian expression vectors. A list of available cancer-associated genes can be accessed below.

  2. Protein-Protein Interactions of Viroporins in Coronaviruses and Paramyxoviruses: New Targets for Antivirals?

    Directory of Open Access Journals (Sweden)

    Jaume Torres

    2015-06-01

    Full Text Available Viroporins are members of a rapidly growing family of channel-forming small polypeptides found in viruses. The present review will be focused on recent structural and protein-protein interaction information involving two viroporins found in enveloped viruses that target the respiratory tract: (i) the envelope protein in coronaviruses and (ii) the small hydrophobic protein in paramyxoviruses. Deletion of these two viroporins leads to viral attenuation in vivo, whereas data from cell culture shows involvement in the regulation of stress and inflammation. The channel activity and structure of some representative members of these viroporins have been recently characterized in some detail. In addition, searches for protein-protein interactions using yeast-two hybrid techniques have shed light on possible functional roles for their exposed cytoplasmic domains. A deeper analysis of these interactions should not only provide a more complete overview of the multiple functions of these viroporins, but also suggest novel strategies that target protein-protein interactions as much needed antivirals. These should complement current efforts to block viroporin channel activity.

  3. Analysis of Protein-Membrane Interactions

    DEFF Research Database (Denmark)

    Kemmer, Gerdi Christine

    Cellular membranes are complex structures, consisting of hundreds of different lipids and proteins. These membranes act as barriers between distinct environments, constituting hot spots for many essential functions of the cell, including signaling, energy conversion, and transport. These functions....... Discovered interactions were then probed on the level of the membrane using liposome-based assays. In the second part, a transmembrane protein was investigated. Assays to probe activity of the plasma membrane ATPase (Arabidopsis thaliana H+ -ATPase isoform 2 (AHA2)) in single liposomes using both giant...... are implemented by soluble proteins reversibly binding to, as well as by integral membrane proteins embedded in, cellular membranes. The activity and interaction of these proteins is furthermore modulated by the lipids of the membrane. Here, liposomes were used as model membrane systems to investigate...

  4. Yeast Interacting Proteins Database: YGL145W, YNL258C [Yeast Interacting Proteins Database]

    Lifescience Database Archive (English)

    Full Text Available ripheral membrane protein required for Golgi-to-ER retrograde traffic; component ... membrane protein required for Golgi-to-ER retrograde traffic; component of the ER target site that interact

  5. The role of exon shuffling in shaping protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    França Gustavo S

    2010-12-01

    Full Text Available Abstract Background Physical protein-protein interaction (PPI) is a critical phenomenon for the function of most proteins in living organisms and a significant fraction of PPIs are the result of domain-domain interactions. Exon shuffling, intron-mediated recombination of exons from existing genes, is known to have been a major mechanism of domain shuffling in metazoans. Thus, we hypothesized that exon shuffling could have a significant influence in shaping the topology of PPI networks. Results We tested our hypothesis by compiling exon shuffling and PPI data from six eukaryotic species: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Cryptococcus neoformans and Arabidopsis thaliana. For all four metazoan species, genes enriched in exon shuffling events presented on average higher vertex degree (number of interacting partners) in PPI networks. Furthermore, we verified that a set of protein domains that are simultaneously promiscuous (known to interact with multiple types of other domains), self-interacting (able to interact with another copy of themselves) and abundant in the genomes presents a stronger signal for exon shuffling. Conclusions Exon shuffling appears to have been a recurrent mechanism for the emergence of new PPIs along metazoan evolution. In metazoan genomes, exon shuffling also promoted the expansion of some protein domains. We speculate that their promiscuous and self-interacting properties may have been decisive for that expansion.
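
    The vertex-degree comparison described in this abstract can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the edge list, the gene groupings and the helper names `vertex_degrees` and `mean_degree` are hypothetical, and a real analysis would add a statistical test on top of the averages.

```python
from collections import defaultdict


def vertex_degrees(edges):
    """Map each protein to its number of distinct interaction partners."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-interaction adds no new partner
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def mean_degree(degrees, genes):
    """Average degree over a gene set; genes absent from the network count as 0."""
    values = [degrees.get(gene, 0) for gene in genes]
    return sum(values) / len(values) if values else 0.0


# Hypothetical toy PPI network and gene groups (illustration only)
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "A")]
degrees = vertex_degrees(toy_edges)
shuffled_genes = ["A", "B"]      # stand-in for genes enriched in exon shuffling
other_genes = ["C", "D", "E"]    # stand-in for the remaining genes

mean_shuffled = mean_degree(degrees, shuffled_genes)
mean_other = mean_degree(degrees, other_genes)
```

    Storing partners as sets means duplicate edges reported by several source databases are counted once, which matters when PPI data are merged from multiple resources.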

  6. BioPlex Display: An Interactive Suite for Large-Scale AP-MS Protein-Protein Interaction Data.

    Science.gov (United States)

    Schweppe, Devin K; Huttlin, Edward L; Harper, J Wade; Gygi, Steven P

    2018-01-05

    The development of large-scale data sets requires a new means to display and disseminate research studies to large audiences. Knowledge of protein-protein interaction (PPI) networks has become a principal interest of many groups within the field of proteomics. At the confluence of technologies, such as cross-linking mass spectrometry, yeast two-hybrid, protein cofractionation, and affinity purification mass spectrometry (AP-MS), detection of PPIs can uncover novel biological inferences at high throughput. Thus new platforms to provide community access to large data sets are necessary. To this end, we have developed a web application that enables exploration and dissemination of the growing BioPlex interaction network. BioPlex is a large-scale interactome data set based on AP-MS of baits from the human ORFeome. The latest BioPlex data set release (BioPlex 2.0) contains 56 553 interactions from 5891 AP-MS experiments. To improve community access to this vast compendium of interactions, we developed BioPlex Display, which integrates individual protein querying, access to empirical data, and on-the-fly annotation of networks within an easy-to-use and mobile web application. BioPlex Display enables rapid acquisition of data from BioPlex and development of hypotheses based on protein interactions.

  7. Prediction of essential proteins based on subcellular localization and gene expression correlation.

    Science.gov (United States)

    Fan, Yetian; Tang, Xiwei; Hu, Xiaohua; Wu, Wei; Ping, Qing

    2017-12-01

    Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test out essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction. The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction. In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartments information, with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.
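
    The two ingredients named in this abstract, a Pearson correlation coefficient (PCC) from expression profiles and a PageRank-style ranking on the interaction network, can be sketched as follows. This is not the SCP algorithm: the abstract does not say how subcellular compartments modify PageRank or how the two rankings are merged, so the sketch simply uses non-negative PCC values as edge weights in an ordinary weighted PageRank. All names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x > 0 and norm_y > 0 else 0.0


def weighted_pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank; weights maps node -> {neighbor: weight >= 0}."""
    nodes = sorted(weights)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    total_out = {v: sum(weights[v].values()) for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            if total_out[u] == 0:
                # Node with no weighted links: spread its rank uniformly
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in weights[u].items():
                    new_rank[v] += damping * rank[u] * w / total_out[u]
        rank = new_rank
    return rank


# Hypothetical expression profiles and interactions (illustration only)
expression = {"P1": [1, 2, 3, 4], "P2": [2, 4, 6, 8], "P3": [1, 3, 2, 4]}
edges = [("P1", "P2"), ("P2", "P3")]

weights = {protein: {} for protein in expression}
for a, b in edges:
    w = max(pearson(expression[a], expression[b]), 0.0)  # drop negative correlations
    weights[a][b] = w
    weights[b][a] = w

ranks = weighted_pagerank(weights)
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

    Clipping negative correlations to zero is one simple choice made here so that edge weights stay valid transition weights; other transformations would be equally defensible.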

  8. In silico modeling of the yeast protein and protein family interaction network

    Science.gov (United States)

    Goh, K.-I.; Kahng, B.; Kim, D.

    2004-03-01

    Understanding how protein interaction networks of living organisms have evolved or are organized can be the first stepping stone in unveiling how life works at a fundamental level. Here we introduce an in silico "coevolutionary" model for the protein interaction network and the protein family network. The essential ingredient of the model includes the protein family identity and its robustness under evolution, as well as the three previously proposed: gene duplication, divergence, and mutation. This model produces a prototypical feature of complex networks in a wide range of parameter space, following the generalized Pareto distribution in connectivity. Moreover, we investigate other structural properties of our model in detail with some specific values of parameters relevant to the yeast Saccharomyces cerevisiae, showing excellent agreement with the empirical data. Our model indicates that the physical constraints encoded via the domain structure of proteins play a crucial role in protein interactions.
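
    The three previously proposed ingredients mentioned here (gene duplication, divergence, and mutation) can be illustrated with a generic network growth sketch. This is not the authors' coevolutionary model, which additionally tracks protein family identity; the function name `grow_network` and the parameters `delta` (probability of losing an inherited link) and `beta` (probability of gaining a new link) are assumptions chosen for illustration.

```python
import random
from collections import Counter


def grow_network(n_final, delta=0.6, beta=0.02, seed=0):
    """Grow an undirected network by duplication, divergence and mutation."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # start from a single interacting pair
    while len(adj) < n_final:
        new = len(adj)
        parent = rng.randrange(new)
        # Duplication: the new protein starts with its parent's partners.
        inherited = sorted(adj[parent])
        # Divergence: each inherited link is lost with probability delta.
        kept = {v for v in inherited if rng.random() >= delta}
        # Mutation: a link to any other existing protein appears with probability beta.
        gained = {v for v in adj if v not in kept and rng.random() < beta}
        adj[new] = set()
        for v in kept | gained:
            adj[new].add(v)
            adj[v].add(new)
    return adj


network = grow_network(200)
# Degree distribution: how many proteins have k interaction partners
degree_counts = Counter(len(neighbors) for neighbors in network.values())
```

    Plotting `degree_counts` on logarithmic axes is the usual way to inspect whether such a model yields the heavy-tailed connectivity the abstract refers to; the sketch makes no claim about which distribution these particular parameters produce.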

  9. Drosophila Protein interaction Map (DPiM)

    OpenAIRE

    Guruharsha, K.G.; Obar, Robert A.; Mintseris, Julian; Aishwarya, K.; Krishnan, R.T.; VijayRaghavan, K.; Artavanis-Tsakonas, Spyros

    2012-01-01

    Proteins perform essential cellular functions as part of protein complexes, often in conjunction with RNA, DNA, metabolites and other small molecules. The genome encodes thousands of proteins but not all of them are expressed in every cell type; and expressed proteins are not active at all times. Such diversity of protein expression and function accounts for the level of biological intricacy seen in nature. Defining protein-protein interactions in protein complexes, and establishing the when,...

  10. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    Science.gov (United States)

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.
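
    The HP model mentioned above scores a conformation by its hydrophobic contacts. The sketch below evaluates that score under two stated assumptions that the abstract does not specify: a 2D square lattice, and the conventional energy of -1 for every pair of H residues that are lattice neighbours but not adjacent in the chain. The function name `hp_energy` is hypothetical, and this is only the objective a search method such as PSO would minimise, not the HE-L-PSO algorithm itself.

```python
def hp_energy(sequence, coords):
    """Energy of an HP-model conformation on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice positions, one per residue, in chain order.
    Assumption: energy is -1 per non-consecutive H-H lattice contact.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

    Folding the four-residue chain `HPPH` into a unit square brings its two H residues into contact, so that conformation scores lower than the fully extended chain.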

  11. 3dRPC: a web server for 3D RNA-protein structure prediction.

    Science.gov (United States)

    Huang, Yangyu; Li, Haotian; Xiao, Yi

    2018-04-01

    RNA-protein interactions occur in many biological processes. To understand the mechanism of these interactions one needs to know three-dimensional (3D) structures of RNA-protein complexes. 3dRPC is an algorithm for prediction of 3D RNA-protein complex structures and consists of a docking algorithm RPDOCK and a scoring function 3dRPC-Score. RPDOCK is used to sample possible complex conformations of an RNA and a protein by calculating the geometric and electrostatic complementarities and stacking interactions at the RNA-protein interface according to the features of atom packing of the interface. 3dRPC-Score is a knowledge-based potential that uses the conformations of nucleotide-amino-acid pairs as statistical variables and that is used to choose the near-native complex-conformations obtained from the docking method above. Recently, we built a web server for 3dRPC. The users can easily use 3dRPC without installing it locally. RNA and protein structures in PDB (Protein Data Bank) format are the only needed input files. It can also incorporate the information of interface residues or residue-pairs obtained from experiments or theoretical predictions to improve the prediction. The address of the 3dRPC web server is http://biophy.hust.edu.cn/3dRPC. Contact: yxiao@hust.edu.cn.

  12. Parallel force assay for protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Daniela Aschenbrenner

    Full Text Available Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay.

  13. InSilico Proteomics System: Integration and Application of Protein and Protein-Protein Interaction Data using Microsoft .NET

    Directory of Open Access Journals (Sweden)

    Straßer Wolfgang

    2006-12-01

    Full Text Available In the last decades, biological databases became the major knowledge resource for researchers in the field of molecular biology. The distribution of information among these databases is one of the major problems. An overview of data access and representation of protein and protein-protein interaction data within public biological databases is given. For a comprehensive and consistent way of searching and analysing integrated protein and protein-protein interaction data, the InSilico Proteomics (ISP) project has been initiated. Its three main objectives are (1) to provide an integrated knowledge pool for data investigation and global network analysis functions for a better understanding of a cell’s interactome, (2) to employ public data for plausibility analysis and validation of in-house experimental data and (3) to test the applicability of Microsoft’s .NET architecture for bioinformatics applications. Data integrated into the ISP database can be queried through the Web portal PRIMOS (PRotein Interaction and MOlecule Search), which is freely available at http://biomis.fh-hagenberg.at/isp/primos.

  14. Protein-Protein Docking in Drug Design and Discovery.

    Science.gov (United States)

    Kaczor, Agnieszka A; Bartuzi, Damian; Stępniewski, Tomasz Maciej; Matosiuk, Dariusz; Selent, Jana

    2018-01-01

    Protein-protein interactions (PPIs) are responsible for a number of key physiological processes in the living cells and underlie the pathomechanism of many diseases. Nowadays, along with the concept of so-called "hot spots" in protein-protein interactions, which are well-defined interface regions responsible for most of the binding energy, these interfaces can be targeted with modulators. In order to apply structure-based design techniques to design PPIs modulators, a three-dimensional structure of protein complex has to be available. In this context in silico approaches, in particular protein-protein docking, are a valuable complement to experimental methods for elucidating 3D structure of protein complexes. Protein-protein docking is easy to use and does not require significant computer resources and time (in contrast to molecular dynamics) and it results in 3D structure of a protein complex (in contrast to sequence-based methods of predicting binding interfaces). However, protein-protein docking cannot address all the aspects of protein dynamics, in particular the global conformational changes during protein complex formation. In spite of this fact, protein-protein docking is widely used to model complexes of water-soluble proteins and less commonly to predict structures of transmembrane protein assemblies, including dimers and oligomers of G protein-coupled receptors (GPCRs). In this chapter we review the principles of protein-protein docking, available algorithms and software and discuss the recent examples, benefits, and drawbacks of protein-protein docking application to water-soluble proteins, membrane anchoring and transmembrane proteins, including GPCRs.

  15. Evidence of probabilistic behaviour in protein interaction networks

    Directory of Open Access Journals (Sweden)

    Reifman Jaques

    2008-01-01

    Full Text Available Abstract Background Data from high-throughput experiments of protein-protein interactions are commonly used to probe the nature of biological organization and extract functional relationships between sets of proteins. What has not been appreciated is that the underlying mechanisms involved in assembling these networks may exhibit considerable probabilistic behaviour. Results We find that the probability of an interaction between two proteins is generally proportional to the numerical product of their individual interacting partners, or degrees. The degree-weighted behaviour is manifested throughout the protein-protein interaction networks studied here, except for the high-degree, or hub, interaction areas. However, we find that the probabilities of interaction between the hubs are still high. Further evidence is provided by path length analyses, which show that these hubs are separated by very few links. Conclusion The results suggest that protein-protein interaction networks incorporate probabilistic elements that lead to scale-rich hierarchical architectures. These observations seem to be at odds with a biologically-guided organization. One interpretation of the findings is that we are witnessing the ability of proteins to indiscriminately bind rather than the protein-protein interactions that are actually utilized by the cell in biological processes. Therefore, the topological study of a degree-weighted network requires a more refined methodology to extract biological information about pathways, modules, or other inferred relationships among proteins.
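
    The central observation above is that the probability of an interaction scales with the product of the two proteins' degrees. A minimal sketch of that degree-weighted score follows. Dividing by twice the number of edges is an assumption borrowed from the standard configuration-model null expectation and is not stated in the abstract; the function name and the toy edge list are hypothetical.

```python
from collections import Counter


def degree_product_scores(edges):
    """Degree-product score k_i * k_j / (2m) for every unordered protein pair.

    edges: iterable of (protein_a, protein_b) pairs, assumed unique and
    without self-interactions. m is the number of edges.
    """
    edges = list(edges)
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    two_m = 2 * len(edges)
    proteins = sorted(degree)
    scores = {}
    for idx, a in enumerate(proteins):
        for b in proteins[idx + 1:]:
            # Pairs of high-degree proteins get proportionally larger scores.
            scores[(a, b)] = degree[a] * degree[b] / two_m
    return scores


# Toy network: A is a small hub, D has a single partner.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_scores = degree_product_scores(toy_edges)
```

    Comparing such scores with the observed interaction frequencies is one way to check how far a network follows the degree-weighted behaviour; the abstract reports that hub-hub regions deviate from it.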

  16. Mapping functional prion-prion protein interaction sites using prion protein based peptide-arrays

    NARCIS (Netherlands)

    Rigter, A.; Priem, J.; Timmers-Parohi, D.; Langeveld, J.; Bossers, A.

    2009-01-01

    Protein-protein interactions are at the basis of most if not all biological processes in living cells. Therefore, adapting existing techniques or developing new techniques to study interactions between proteins are of importance in elucidating which amino acid sequences contribute to these

  17. Sequence- and interactome-based prediction of viral protein hotspots targeting host proteins: a case study for HIV Nef.

    Directory of Open Access Journals (Sweden)

    Mahdi Sarmady

    Full Text Available Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with the literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.

  18. Affinity purification combined with mass spectrometry to identify herpes simplex virus protein-protein interactions.

    Science.gov (United States)

    Meckes, David G

    2014-01-01

    The identification and characterization of herpes simplex virus protein interaction complexes are fundamental to understanding the molecular mechanisms governing the replication and pathogenesis of the virus. Recent advances in affinity-based methods, mass spectrometry configurations, and bioinformatics tools have greatly increased the quantity and quality of protein-protein interaction datasets. In this chapter, detailed and reliable methods that can easily be implemented are presented for the identification of protein-protein interactions using cryogenic cell lysis, affinity purification, trypsin digestion, and mass spectrometry.

  19. Feature generation and representations for protein-protein interaction classification.

    Science.gov (United States)

    Lan, Man; Tan, Chew Lim; Su, Jian

    2009-10-01

    Automatically detecting articles relevant to protein-protein interaction (PPI) is a crucial step for large-scale biological database curation. Previous work adopted POS tagging, shallow parsing and sentence splitting techniques, but these achieved worse performance than the simple bag-of-words representation. In this paper, we generated and investigated multiple types of feature representations in order to further improve the performance of the PPI text classification task. Besides the traditional domain-independent bag-of-words approach and term weighting methods, we also explored other domain-dependent features, i.e. protein-protein interaction trigger keywords, protein named entities and advanced ways of incorporating Natural Language Processing (NLP) output. The integration of these multiple features has been evaluated on the BioCreAtIvE II corpus. The experimental results showed that both the advanced way of using NLP output and the integration of bag-of-words and NLP output improved the performance of text classification. Specifically, in comparison with the best performance achieved in the BioCreAtIvE II IAS, the feature-level and classifier-level integration of multiple features improved classification performance by 2.71% and 3.95%, respectively.

  20. Evidence for the interaction of the regulatory protein Ki-1/57 with p53 and its interacting proteins

    International Nuclear Information System (INIS)

    Nery, Flavia C.; Rui, Edmilson; Kuniyoshi, Tais M.; Kobarg, Joerg

    2006-01-01

    Ki-1/57 is a cytoplasmic and nuclear phospho-protein of 57 kDa and interacts with the adaptor protein RACK1, the transcription factor MEF2C, and the chromatin remodeling factor CHD3, suggesting that it might be involved in the regulation of transcription. Here, we describe yeast two-hybrid studies that identified a total of 11 proteins interacting with Ki-1/57, all of which interact or are functionally associated with p53 or other members of the p53 family of proteins. We further found that Ki-1/57 is able to interact with p53 itself in the yeast two-hybrid system when the interaction was tested directly. This interaction could be confirmed by pull down assays with purified proteins in vitro and by reciprocal co-immunoprecipitation assays from the human Hodgkin analogous lymphoma cell line L540. Furthermore, we found that the phosphorylation of p53 by PKC abolishes its interaction with Ki-1/57 in vitro.

  1. The effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein-protein interactions.

    Science.gov (United States)

    Yates, Christopher M; Sternberg, Michael J E

    2013-11-01

    Non-synonymous single nucleotide polymorphisms (nsSNPs) are single base changes leading to a change to the amino acid sequence of the encoded protein. Many of these variants are associated with disease, so nsSNPs have been well studied, with studies looking at the effects of nsSNPs on individual proteins, for example, on stability and enzyme active sites. In recent years, the impact of nsSNPs upon protein-protein interactions has also been investigated, giving a greater insight into the mechanisms by which nsSNPs can lead to disease. In this review, we summarize these studies, looking at the various mechanisms by which nsSNPs can affect protein-protein interactions. We focus on structural changes that can impair interaction, changes to disorder, gain of interaction, and post-translational modifications before looking at some examples of nsSNPs at human-pathogen protein-protein interfaces and the analysis of nsSNPs from a network perspective. © 2013.

  2. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-01-01

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.
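
    The abstract above refers to co-evolution analysis on paired multiple sequence alignments. As a deliberately simple illustration of that idea, the sketch below computes the mutual information between one alignment column from each protein. Mutual information is swapped in here as a classic textbook co-evolution score; it is not ComplexContact's co-evolution method or its deep learning model, and the function name and toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a, col_b: equal-length sequences of residue letters, where position r
    in both columns comes from the same paired row (e.g. the same species).
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # Compare the joint frequency with what independence would predict.
        mi += p_ab * log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Toy columns: residues in the first pair change together; the second pair is independent.
coupled = column_mutual_information("AACC", "DDEE")
independent = column_mutual_information("AACC", "DEDE")
```

    High-scoring column pairs are candidate inter-protein contacts in this simplified picture; practical methods also correct for phylogenetic bias and indirect couplings, which this sketch ignores.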

  3. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong

    2018-05-20

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  4. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    Science.gov (United States)

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  5. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    Science.gov (United States)

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has a particularly high propensity to be found at interaction interfaces in general, indicating that any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have a high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit a somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.

  6. Structure homology and interaction redundancy for discovering virus–host protein interactions

    Science.gov (United States)

    de Chassey, Benoît; Meyniel-Schicklin, Laurène; Aublin-Gex, Anne; Navratil, Vincent; Chantier, Thibaut; André, Patrice; Lotteau, Vincent

    2013-01-01

    Virus–host interactomes are instrumental to understand global perturbations of cellular functions induced by infection and discover new therapies. The construction of such interactomes is, however, technically challenging and time consuming. Here we describe an original method for the prediction of high-confidence interactions between viral and human proteins through a combination of structure and high-quality interactome data. Validation was performed for the NS1 protein of the influenza virus, which led to the identification of new host factors that control viral replication. PMID:24008843

  7. Structure homology and interaction redundancy for discovering virus-host protein interactions.

    Science.gov (United States)

    de Chassey, Benoît; Meyniel-Schicklin, Laurène; Aublin-Gex, Anne; Navratil, Vincent; Chantier, Thibaut; André, Patrice; Lotteau, Vincent

    2013-10-01

    Virus-host interactomes are instrumental to understand global perturbations of cellular functions induced by infection and discover new therapies. The construction of such interactomes is, however, technically challenging and time consuming. Here we describe an original method for the prediction of high-confidence interactions between viral and human proteins through a combination of structure and high-quality interactome data. Validation was performed for the NS1 protein of the influenza virus, which led to the identification of new host factors that control viral replication.

  8. Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

    Directory of Open Access Journals (Sweden)

    Fabrizio Pucci

    Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, that distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm's is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.

  9. Data management of protein interaction networks

    CERN Document Server

    Cannataro, Mario

    2012-01-01

    Interactomics: a complete survey from data generation to knowledge extraction With the increasing use of high-throughput experimental assays, more and more protein interaction databases are becoming available. As a result, computational analysis of protein-to-protein interaction (PPI) data and networks, now known as interactomics, has become an essential tool to determine functionally associated proteins. From wet lab technologies to data management to knowledge extraction, this timely book guides readers through the new science of interactomics, giving them the tools needed to: Generate

  10. BLAST-based structural annotation of protein residues using Protein Data Bank.

    Science.gov (United States)

    Singh, Harinder; Raghava, Gajendra P S

    2016-01-25

    In the era of next-generation sequencing, in which thousands of genomes have already been sequenced, the size of protein databases is growing at an exponential rate. Structural annotation of these proteins is one of the biggest challenges for the computational biologist. Although it is easy to perform a BLAST search against the Protein Data Bank (PDB), it is difficult for a biologist to annotate protein residues from a BLAST search. A web server, StarPDB, has been developed for structural annotation of a protein based on its similarity with known protein structures. It uses standard BLAST software to perform a similarity search of a query protein against protein structures in the PDB. The server integrates a wide range of modules for assigning different types of annotation, including Secondary-structure, Accessible surface area, Tight-turns, DNA-RNA and Ligand modules. The Secondary-structure module allows users to predict the regular secondary structure state of each residue in a protein. The Accessible surface area module predicts the exposed or buried residues in a protein. The Tight-turns module is designed to predict tight turns such as beta-turns in a protein. The DNA-RNA module was developed for predicting DNA- and RNA-interacting residues in a protein. Similarly, the Ligand module allows one to predict ligand-, metal- and nucleotide-interacting residues in a protein. In summary, this manuscript presents a web server for comprehensive annotation of a protein based on similarity search. It integrates a number of visualization tools that help users understand the structure and function of protein residues. The web server is freely available to the scientific community at http://crdd.osdd.net/raghava/starpdb.

  11. The role of hydrophobic interactions in positioning of peripheral proteins in membranes

    Directory of Open Access Journals (Sweden)

    Lomize Mikhail A

    2007-06-01

    Full Text Available Abstract Background Three-dimensional (3D) structures of numerous peripheral membrane proteins have been determined. Biological activity, stability, and conformations of these proteins depend on their spatial positions with respect to the lipid bilayer. However, these positions are usually undetermined. Results We report the first large-scale computational study of monotopic/peripheral proteins with known 3D structures. The optimal translational and rotational positions of 476 proteins are determined by minimizing energy of protein transfer from water to the lipid bilayer, which is approximated by a hydrocarbon slab with a decadiene-like polarity and interfacial regions characterized by water-permeation profiles. Predicted membrane-binding sites, protein tilt angles and membrane penetration depths are consistent with spin-labeling, chemical modification, fluorescence, NMR, mutagenesis, and other experimental studies of 53 peripheral proteins and peptides. Experimental membrane binding affinities of peripheral proteins were reproduced in cases that did not involve a helix-coil transition, specific binding of lipids, or a predominantly electrostatic association. Coordinates of all examined peripheral proteins and peptides with the calculated hydrophobic membrane boundaries, subcellular localization, topology, structural classification, and experimental references are available through the Orientations of Proteins in Membranes (OPM) database. Conclusion Positions of diverse peripheral proteins and peptides in the lipid bilayer can be accurately predicted using their 3D structures that represent a proper membrane-bound conformation and oligomeric state, and have membrane binding elements present. The success of the implicit solvation model suggests that hydrophobic interactions are usually sufficient to determine the spatial position of a protein in the membrane, even when electrostatic interactions or specific binding of lipids are substantial.

  12. The simulation approach to lipid-protein interactions.

    Science.gov (United States)

    Paramo, Teresa; Garzón, Diana; Holdbrook, Daniel A; Khalid, Syma; Bond, Peter J

    2013-01-01

    The interactions between lipids and proteins are crucial for a range of biological processes, from the folding and stability of membrane proteins to signaling and metabolism facilitated by lipid-binding proteins. However, high-resolution structural details concerning functional lipid/protein interactions are scarce due to barriers in both experimental isolation of native lipid-bound complexes and subsequent biophysical characterization. The molecular dynamics (MD) simulation approach provides a means to complement available structural data, yielding dynamic, structural, and thermodynamic data for a protein embedded within a physiologically realistic, modelled lipid environment. In this chapter, we provide a guide to current methods for setting up and running simulations of membrane proteins and soluble, lipid-binding proteins, using standard atomistically detailed representations, as well as simplified, coarse-grained models. In addition, we outline recent studies that illustrate the power of the simulation approach in the context of biologically relevant lipid/protein interactions.

  13. Protein-Ligand Empirical Interaction Components for Virtual Screening.

    Science.gov (United States)

    Yan, Yuna; Wang, Weijun; Sun, Zhaoxi; Zhang, John Z H; Ji, Changge

    2017-08-28

    A major shortcoming of empirical scoring functions is that they often fail to predict binding affinity properly. Removing false positives of docking results is one of the most challenging works in structure-based virtual screening. Postdocking filters, making use of all kinds of experimental structure and activity information, may help in solving the issue. We describe a new method based on detailed protein-ligand interaction decomposition and machine learning. Protein-ligand empirical interaction components (PLEIC) are used as descriptors for support vector machine learning to develop a classification model (PLEIC-SVM) to discriminate false positives from true positives. Experimentally derived activity information is used for model training. An extensive benchmark study on 36 diverse data sets from the DUD-E database has been performed to evaluate the performance of the new method. The results show that the new method performs much better than standard empirical scoring functions in structure-based virtual screening. The trained PLEIC-SVM model is able to capture important interaction patterns between ligand and protein residues for one specific target, which is helpful in discarding false positives in postdocking filtering.

  14. Human-Chromatin-Related Protein Interactions Identify a Demethylase Complex Required for Chromosome Segregation

    Directory of Open Access Journals (Sweden)

    Edyta Marcon

    2014-07-01

    Full Text Available Chromatin regulation is driven by multicomponent protein complexes, which form functional modules. Deciphering the components of these modules and their interactions is central to understanding the molecular pathways these proteins are regulating, their functions, and their relation to both normal development and disease. We describe the use of affinity purifications of tagged human proteins coupled with mass spectrometry to generate a protein-protein interaction map encompassing known and predicted chromatin-related proteins. On the basis of 1,394 successful purifications of 293 proteins, we report a high-confidence (85% precision) network involving 11,464 protein-protein interactions among 1,738 different human proteins, grouped into 164 often overlapping protein complexes with a particular focus on the family of JmjC-containing lysine demethylases, their partners, and their roles in chromatin remodeling. We show that RCCD1 is a partner of histone H3K36 demethylase KDM8 and demonstrate that both are important for cell-cycle-regulated transcriptional repression in centromeric regions and accurate mitotic division.

  15. Energy landscape of all-atom protein-protein interactions revealed by multiscale enhanced sampling.

    Directory of Open Access Journals (Sweden)

    Kei Moritsugu

    2014-10-01

    Full Text Available Protein-protein interactions are regulated by a subtle balance of complicated atomic interactions and solvation at the interface. To understand such an elusive phenomenon, it is necessary to thoroughly survey the large configurational space from the stable complex structure to the dissociated states using the all-atom model in explicit solvent and to delineate the energy landscape of protein-protein interactions. In this study, we carried out a multiscale enhanced sampling (MSES) simulation of the formation of a barnase-barstar complex, which is a protein complex characterized by extraordinarily tight and fast binding, to determine the energy landscape of atomistic protein-protein interactions. The MSES adopts a multicopy and multiscale scheme to enable enhanced sampling of the all-atom model of large proteins including explicit solvent. During the 100-ns MSES simulation of the barnase-barstar system, we observed the association-dissociation processes of the atomistic protein complex in solution several times, which contained not only the native complex structure but also fully non-native configurations. The sampled distributions suggest that a large variety of non-native states went downhill to the stable complex structure, like fast folding on a funnel-like potential. This funnel landscape is attributed to dominant configurations in the early stage of the association process characterized by near-native orientations, which will accelerate the native inter-molecular interactions. These configurations are guided mostly by the shape complementarity between barnase and barstar, and lead to the fast formation of the final complex structure along the downhill energy landscape.

  16. Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels

    Directory of Open Access Journals (Sweden)

    McDermott Drew

    2009-08-01

    Full Text Available Abstract Background Proteins interact through specific binding interfaces that contain many residues in domains. Protein interactions thus occur on three different levels of a concept hierarchy: whole-proteins, domains, and residues. Each level offers a distinct and complementary set of features for computationally predicting interactions, including functional genomic features of whole proteins, evolutionary features of domain families and physical-chemical features of individual residues. The predictions at each level could benefit from using the features at all three levels. However, this is not trivial, as the features are provided at different granularities. Results To link up the predictions at the three levels, we propose a multi-level machine-learning framework that allows for explicit information flow between the levels. We demonstrate, using representative yeast interaction networks, that our algorithm is able to utilize complementary feature sets to make more accurate predictions at the three levels than when the three problems are approached independently. To facilitate application of our multi-level learning framework, we discuss three key aspects of multi-level learning and the corresponding design choices that we have made in the implementation of a concrete learning algorithm. (1) Architecture of information flow: we show the greater flexibility of bidirectional flow over independent levels and unidirectional flow; (2) Coupling mechanism of the different levels: we show how this can be accomplished via augmenting the training sets at each level, and discuss the prevention of error propagation between different levels by means of soft coupling; (3) Sparseness of data: we show that the multi-level framework compounds data sparsity issues, and discuss how this can be dealt with by building local models in information-rich parts of the data. Our proof-of-concept learning algorithm demonstrates the advantage of combining levels, and opens up

  17. A membrane protein / signaling protein interaction network for Arabidopsis version AMPv2

    Directory of Open Access Journals (Sweden)

    Sylvie Lalonde

    2010-09-01

    Full Text Available Interactions between membrane proteins and the soluble fraction are essential for signal transduction and for regulating nutrient transport. To gain insights into the membrane-based interactome, 3,852 open reading frames (ORFs) out of a target list of 8,383 representing membrane and signaling proteins from Arabidopsis thaliana were cloned into a Gateway compatible vector. The mating-based split-ubiquitin system was used to screen for potential protein-protein interactions (pPPIs) among 490 Arabidopsis ORFs. A binary robotic screen between 142 receptor-like kinases, 72 transporters, 57 soluble protein kinases and phosphatases, 40 glycosyltransferases, 95 proteins of various functions and 89 proteins with unknown function detected 387 out of 90,370 possible PPIs. A secondary screen confirmed 343 (of 387) pPPIs between 179 proteins, yielding a scale-free network (r² = 0.863). Eighty of 142 transmembrane receptor-like kinases (RLKs) tested positive, identifying three homomers, 63 heteromers and 80 pPPIs with other proteins. Thirty-one out of 142 RLK interactors (including RLKs) had previously been found to be phosphorylated; thus interactors may be substrates for respective RLKs. None of the pPPIs described here had been reported in the major interactome databases, including potential interactors of G protein-coupled receptors, phospholipase C, and AMT ammonium transporters. Two RLKs found as putative interactors of AMT1;1 were independently confirmed using a split luciferase assay in Arabidopsis protoplasts. These RLKs may be involved in ammonium-dependent phosphorylation of the C-terminus and regulation of ammonium uptake activity. The robotic screening method established here will enable a systematic analysis of membrane protein interactions in fungi, plants and metazoa.

  18. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Full Text Available Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than the number of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, seldom from whole protein sequences. Therefore, a way to determine hotspots from whole protein sequences using sequence information alone is urgently needed. To address the issue of hotspot prediction from the whole sequences of proteins, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset was developed. Then, the random projection technique was adopted to project the encoding instances into a reduced space. Next, several better random projections were obtained by training an IBk classifier on the training dataset, and these were then applied to the test dataset. The ensemble of random projection classifiers was thereby obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.

  19. Fast dynamics perturbation analysis for prediction of protein functional sites

    Directory of Open Access Journals (Sweden)

    Cohn Judith D

    2008-01-01

    Full Text Available Abstract Background We present a fast version of the dynamics perturbation analysis (DPA) algorithm to predict functional sites in protein structures. The original DPA algorithm finds regions in proteins where interactions cause a large change in the protein conformational distribution, as measured using the relative entropy Dx. Such regions are associated with functional sites. Results The Fast DPA algorithm, which accelerates DPA calculations, is motivated by an empirical observation that Dx in a normal-modes model is highly correlated with an entropic term that only depends on the eigenvalues of the normal modes. The eigenvalues are accurately estimated using first-order perturbation theory, resulting in an N-fold reduction in the overall computational requirements of the algorithm, where N is the number of residues in the protein. The performance of the original and Fast DPA algorithms was compared using protein structures from a standard small-molecule docking test set. For nominal implementations of each algorithm, top-ranked Fast DPA predictions overlapped the true binding site 94% of the time, compared to 87% of the time for original DPA. In addition, per-protein recall statistics (fraction of binding-site residues that are among predicted residues) were slightly better for Fast DPA. On the other hand, per-protein precision statistics (fraction of predicted residues that are among binding-site residues) were slightly better using original DPA. Overall, the performance of Fast DPA in predicting ligand-binding-site residues was comparable to that of the original DPA algorithm. Conclusion Compared to the original DPA algorithm, the decreased run time with comparable performance makes Fast DPA well-suited for implementation on a web server and for high-throughput analysis.
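The per-protein recall and precision statistics defined in this abstract can be sketched in a few lines of Python. This is only an illustration of the two definitions, not the authors' evaluation code; the function name `recall_precision` and the residue numbers are hypothetical.

```python
def recall_precision(predicted, binding_site):
    """Return (recall, precision) for one protein.

    recall    = fraction of binding-site residues that are among predicted residues
    precision = fraction of predicted residues that are among binding-site residues
    """
    predicted = set(predicted)
    binding_site = set(binding_site)
    overlap = predicted & binding_site
    # Guard against empty sets so the ratios are always defined.
    recall = len(overlap) / len(binding_site) if binding_site else 0.0
    precision = len(overlap) / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical residue numbers, for illustration only.
predicted = {10, 11, 12, 13, 40}
binding_site = {11, 12, 13, 14}
recall, precision = recall_precision(predicted, binding_site)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

A method that predicts larger patches tends to gain recall and lose precision, which is one plausible reading of why the two algorithms trade places on these two statistics.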

  20. Yeast Interacting Proteins Database: YOR302W, YOR047C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available rol of glucose-regulated gene expression; interacts with protein kinase Snf1p, glucose sensors Snf3p and Rgt...tein kinase Snf1p, glucose sensors Snf3p and Rgt2p, and TATA-binding protein Spt1

  1. Interaction between Na-K-ATPase and Bcl-2 proteins BclXL and Bak.

    Science.gov (United States)

    Lauf, Peter K; Alqahtani, Tariq; Flues, Karin; Meller, Jaroslaw; Adragna, Norma C

    2015-01-01

    In silico analysis predicts interaction between Na-K-ATPase (NKA) and Bcl-2 protein canonical BH3- and BH1-like motifs, consistent with NKA inhibition by the benzo-phenanthridine alkaloid chelerythrine, a BH3 mimetic, in fetal human lens epithelial cells (FHLCs) (Lauf PK, Heiny J, Meller J, Lepera MA, Koikov L, Alter GM, Brown TL, Adragna NC. Cell Physiol Biochem 31: 257-276, 2013). This report establishes proof of concept: coimmunoprecipitation and immunocolocalization showed unequivocal and direct physical interaction between NKA and Bcl-2 proteins. Specifically, NKA antibodies (ABs) coimmunoprecipitated BclXL (B-cell lymphoma extra large) and BAK (Bcl-2 antagonist killer) proteins in FHLCs and A549 lung cancer cells. In contrast, both anti-Bcl-2 ABs failed to pull down NKA. Notably, the molecular mass of BAK1 proteins pulled down by NKA and BclXL ABs appeared to be some 4-kDa larger than found in input monomers. In silico analysis predicts these higher molecular mass BAK1 proteins as alternative splicing variants, encoding 42 amino acid (aa) larger proteins than the known 211-aa long canonical BAK1 protein. These BAK1 variants may constitute a pool separate from that forming mitochondrial pores by specifically interacting with NKA and BclXL proteins. We propose a NKA-Bcl-2 protein ternary complex supporting our hypothesis for a special sensor role of NKA in Bcl-2 protein control of cell survival and apoptosis. Copyright © 2015 the American Physiological Society.

  2. Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Yeast Interacting Proteins Database Full Data of Yeast Interacting Proteins Database (Origin...al Version) Data detail Data name Full Data of Yeast Interacting Proteins Database (Original Version) DOI 10....18908/lsdba.nbdc00742-004 Description of data contents The entire data in the Yeast Interacting Proteins Database...eir interactions are required. Several sources including YPD (Yeast Proteome Database, Costanzo, M. C., Hoga...ematic name in the SGD (Saccharomyces Genome Database; http://www.yeastgenome.org /). Bait gene name The gen

  3. A conserved NAD+ binding pocket that regulates protein-protein interactions during aging.

    Science.gov (United States)

    Li, Jun; Bonkowski, Michael S; Moniot, Sébastien; Zhang, Dapeng; Hubbard, Basil P; Ling, Alvin J Y; Rajman, Luis A; Qin, Bo; Lou, Zhenkun; Gorbunova, Vera; Aravind, L; Steegborn, Clemens; Sinclair, David A

    2017-03-24

    DNA repair is essential for life, yet its efficiency declines with age for reasons that are unclear. Numerous proteins possess Nudix homology domains (NHDs) that have no known function. We show that NHDs are NAD+ (oxidized form of nicotinamide adenine dinucleotide) binding domains that regulate protein-protein interactions. The binding of NAD+ to the NHD domain of DBC1 (deleted in breast cancer 1) prevents it from inhibiting PARP1 [poly(adenosine diphosphate-ribose) polymerase], a critical DNA repair protein. As mice age and NAD+ concentrations decline, DBC1 is increasingly bound to PARP1, causing DNA damage to accumulate, a process rapidly reversed by restoring the abundance of NAD+. Thus, NAD+ directly regulates protein-protein interactions, the modulation of which may protect against cancer, radiation, and aging. Copyright © 2017, American Association for the Advancement of Science.

  4. Predicting turns in proteins with a unified model.

    Directory of Open Access Journals (Sweden)

    Qi Song

    Full Text Available MOTIVATION: Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. RESULTS: In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, namely sequence profiles, secondary-structure profiles and shape-string profiles, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded most state-of-the-art predictors for particular turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.
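For readers unfamiliar with five-fold cross-validation, the partitioning step can be sketched as follows. This is a minimal sketch assuming a plain random partition of the 4,107 entries; the abstract does not say how the folds were actually built, and the function names `k_fold_indices` and `train_test_splits` are hypothetical.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Dealing every k-th index keeps fold sizes within one item of each other.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k=5, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_items, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 4,107 is the dataset size quoted in the abstract; the split itself is an assumption.
splits = list(train_test_splits(4107, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Each entry is used for testing exactly once and for training four times, so the reported accuracy and sensitivity would be aggregated over the five held-out folds.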

  5. Cascaded bidirectional recurrent neural networks for protein secondary structure prediction.

    Science.gov (United States)

    Chen, Jinmiao; Chaudhari, Narendra

    2007-01-01

    Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. We also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on a cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against two other BRNN architectures, namely the original BRNN architecture used for speech recognition and Pollastri's BRNN, which was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38% and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
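The three-state accuracy Q3 quoted above is the percentage of residues whose state (helix H, strand E, or coil C) is predicted correctly. A minimal sketch of that calculation follows; the two strings are made-up examples rather than data from the study, and the SOV score, which compares whole segments, is not covered here.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical per-residue state strings, for illustration only.
observed = "CCHHHHHHCCEEEEECCC"
predicted = "CCHHHHHCCCEEEECCCC"
q3 = q3_accuracy(predicted, observed)
print(f"Q3 = {q3:.2f}%")
```

In this toy case both errors fall at segment ends, which illustrates why a residue-level score such as Q3 is usually reported alongside a segment-level score.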

  6. Single-well monitoring of protein-protein interaction and phosphorylation-dephosphorylation events.

    Science.gov (United States)

    Arcand, Mathieu; Roby, Philippe; Bossé, Roger; Lipari, Francesco; Padrós, Jaime; Beaudet, Lucille; Marcil, Alexandre; Dahan, Sophie

    2010-04-20

    We combined oxygen channeling assays with two distinct chemiluminescent beads to detect simultaneously protein phosphorylation and interaction events that are usually monitored separately. This novel method was tested in the ERK1/2 MAP kinase pathway. It was first used to directly monitor dissociation of MAP kinase ERK2 from MEK1 upon phosphorylation and to evaluate MAP kinase phosphatase (MKP) selectivity and mechanism of action. In addition, MEK1 and ERK2 were probed with an ATP competitor and an allosteric MEK1 inhibitor, which generated distinct phosphorylation-interaction patterns. Simultaneous monitoring of protein-protein interactions and substrate phosphorylation can provide significant mechanistic insight into enzyme activity and small molecule action.

  7. Protein-material interactions: From micro-to-nano scale

    International Nuclear Information System (INIS)

    Tsapikouni, Theodora S.; Missirlis, Yannis F.

    2008-01-01

    The article presents a survey on the significance of protein-material interactions, the mechanisms which control them and the techniques used for their study. Protein-surface interactions play a key role in regenerative medicine, drug delivery, biosensor technology and chromatography, while they are also related to various undesired effects such as biofouling and bio-prosthetic malfunction. Although the effects of protein-surface interaction concern the micro-scale, being sometimes visible even to the naked eye, they derive from biophysical events at the nano-scale. The sequential steps for protein adsorption involve events at the single biomolecule level and the forces driving or inhibiting protein adsorption act at the molecular level too. Following the scaling of protein-surface interactions, various techniques have been developed for their study both in the micro- and nano-scale. Protein labelling with radioisotopes or fluorescent probes, colorimetric assays and the quartz crystal microbalance were the first techniques used to monitor protein adsorption isotherms, while the surface force apparatus was used to measure the interaction forces between protein layers at the micro-scale. Recently, more elaborate techniques like total internal reflection fluorescence (TIRF), Fourier transform infrared spectroscopy (FTIR), surface plasmon resonance, Raman spectroscopy, ellipsometry and time of flight secondary ion mass spectrometry (ToF-SIMS) have been applied for the investigation of protein density, structure or orientation at the interfaces. However, a turning point in the study of protein interactions with surfaces was the invention and the wide-spread use of atomic force microscopy (AFM) which can both image single protein molecules on surfaces and directly measure the interaction force

  8. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data.
To the best of
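The three scores reported in this abstract (accuracy, F-measure, and MCC) are all functions of the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the counts passed in are hypothetical and are not taken from the study, and the function name `binary_metrics` is an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and Matthews correlation coefficient from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}


# Hypothetical, perfectly balanced counts chosen only to illustrate the formulas.
metrics = binary_metrics(tp=67, fp=33, tn=67, fn=33)
print(metrics)
```

On a balanced set like this one, an accuracy near 67% corresponds to an MCC near 0.34, which shows how modest the correlation is even when two thirds of nucleotides are labelled correctly.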

  9. Novel fusion protein approach for efficient high-throughput screening of small molecule-mediating protein-protein interactions in cells and living animals.

    Science.gov (United States)

    Paulmurugan, Ramasamy; Gambhir, Sanjiv S

    2005-08-15

    Networks of protein interactions execute many different intracellular pathways. Small molecules either synthesized within the cell or obtained from the external environment mediate many of these protein-protein interactions. The study of these small molecule-mediated protein-protein interactions is important in understanding abnormal signal transduction pathways in a variety of disorders, as well as in optimizing the process of drug development and validation. In this study, we evaluated the rapamycin-mediated interaction of the human proteins FK506-binding protein (FKBP12) rapamycin-binding domain (FRB) and FKBP12 by constructing a fusion of these proteins with a split-Renilla luciferase or a split enhanced green fluorescent protein (split-EGFP) such that complementation of the reporter fragments occurs in the presence of rapamycin. Different linker peptides in the fusion protein were evaluated for the efficient maintenance of complemented reporter activity. This system was studied in both cell culture and xenografts in living animals. We found that peptide linkers with two or four EAAAR repeats showed higher protein-protein interaction-mediated signal with lower background signal compared with having no linker or linkers with amino acid sequences GGGGSGGGGS, ACGSLSCGSF, and ACGSLSCGSFACGSLSCGSF. A 9 ± 2-fold increase in signal intensity both in cell culture and in living mice was seen compared with a system that expresses both reporter fragments and the interacting proteins separately. In this fusion system, rapamycin-induced heterodimerization of the FRB and FKBP12 moieties occurred rapidly even at very low concentrations (0.00001 nmol/L) of rapamycin. For a similar fusion system employing split-EGFP, flow cytometry analysis showed a significant level of rapamycin-induced complementation.

  10. Study of protein-probe complexation equilibria and protein-surfactant interaction using charge transfer fluorescence probe methyl ester of N,N-dimethylamino naphthyl acrylic acid

    Energy Technology Data Exchange (ETDEWEB)

    Mahanta, Subrata; Balia Singh, Rupashree; Bagchi, Arnab [Department of Chemistry University of Calcutta 92, A.P.C. Road, Kolkata 700009 (India); Nath, Debnarayan [Department of Physical Chemistry, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700 032 (India); Guchhait, Nikhil, E-mail: nguchhait@yahoo.co [Department of Chemistry University of Calcutta 92, A.P.C. Road, Kolkata 700009 (India)

    2010-06-15

    In this paper, we demonstrate the interaction between the intramolecular charge transfer (ICT) probe methyl ester of N,N-dimethylamino naphthyl acrylic acid (MDMANA) and bovine serum albumin (BSA) using absorption and fluorescence emission spectroscopy. The nature of the probe-protein binding interaction, fluorescence resonance energy transfer from protein to probe and time-resolved fluorescence decay measurements indicate that the probe molecule binds strongly to the hydrophobic cavity of the protein. Furthermore, the interaction of the anionic surfactant sodium dodecyl sulphate (SDS) with the water-soluble protein BSA has been investigated using MDMANA as a fluorescence probe. The changes in the spectral characteristics of the charge transfer fluorescence probe MDMANA in the BSA-SDS environment reflect well the nature of the protein-surfactant binding interaction, such as specific binding, non-cooperative binding, cooperative binding and saturation binding.

  11. Reconstruction of the yeast protein-protein interaction network involved in nutrient sensing and global metabolic regulation

    DEFF Research Database (Denmark)

    Nandy, Subir Kumar; Jouhten, Paula; Nielsen, Jens

    2010-01-01

    proteins. Despite the value of BioGRID for studying protein-protein interactions, there is a need for manual curation of these interactions in order to remove false positives. RESULTS: Here we describe an annotated reconstruction of the protein-protein interactions around four key nutrient......) and for all the interactions between them (edges). The annotated information is readily available utilizing the functionalities of network modelling tools such as Cytoscape and CellDesigner. CONCLUSIONS: The reported fully annotated interaction model serves as a platform for integrated systems biology studies...

  12. Nanoparticles-cell association predicted by protein corona fingerprints

    Science.gov (United States)

    Palchetti, S.; Digiacomo, L.; Pozzi, D.; Peruzzi, G.; Micarelli, E.; Mahmoudi, M.; Caracciolo, G.

    2016-06-01

    chemistry (unmodified and PEGylated) to investigate the relationships between NP physicochemical properties (nanoparticle size, aggregation state and surface charge), protein corona fingerprints (PCFs), and NP-cell association. We found that none of the NPs' physicochemical properties alone was able to account for association with the human cervical cancer cell line (HeLa). For the entire library of NPs, a total of 436 distinct serum proteins were detected. We developed a predictive-validation modeling approach that provides a means of assessing the relative significance of the identified corona proteins. Interestingly, a minor fraction of the HC, consisting of only 8 PCFs, was identified as the main promoter of NP association with HeLa cells. Remarkably, the identified PCFs have several receptors with a high level of expression on the plasma membrane of HeLa cells. Electronic supplementary information (ESI) available: Table S1. Cell viability (%) and cell association of the different nanoparticles used. Table S2. Total number of identified proteins on the different nanoparticles used. Tables S3-S18. Top 25 most abundant corona proteins identified in the protein corona of nanoparticles NP2-NP16 following 1 hour incubation with HP. Table S19. List of descriptors used. Table S20. Potential targets of protein corona fingerprints with its own interaction score (mentha) and the expression median value in HeLa cells. Fig. S1 and S2. Effect of exposure to human plasma on size and zeta potential of NPs. Fig. S3. Predictive modeling of nanoparticle-cell association. See DOI: 10.1039/c6nr03898k

  13. A lanthipeptide library used to identify a protein-protein interaction inhibitor.

    Science.gov (United States)

    Yang, Xiao; Lennard, Katherine R; He, Chang; Walker, Mark C; Ball, Andrew T; Doigneaux, Cyrielle; Tavassoli, Ali; van der Donk, Wilfred A

    2018-04-01

    In this article we describe the production and screening of a genetically encoded library of 10^6 lanthipeptides in Escherichia coli using the substrate-tolerant lanthipeptide synthetase ProcM. This plasmid-encoded library was combined with a bacterial reverse two-hybrid system for the interaction of the HIV p6 protein with the UEV domain of the human TSG101 protein, which is a critical protein-protein interaction for HIV budding from infected cells. Using this approach, we identified an inhibitor of this interaction from the lanthipeptide library, whose activity was verified in vitro and in cell-based virus-like particle-budding assays. Given the variety of lanthipeptide backbone scaffolds that may be produced with ProcM, this method may be used for the generation of genetically encoded libraries of natural product-like lanthipeptides containing substantial structural diversity. Such libraries may be combined with any cell-based assay to identify lanthipeptides with new biological activities.

  14. RAIN: RNA-protein Association and Interaction Networks

    DEFF Research Database (Denmark)

    Junge, Alexander; Refsgaard, Jan Christian; Garde, Christian

    2017-01-01

    is challenging due to data heterogeneity. Here, we present a database of ncRNA-RNA and ncRNA-protein interactions and its integration with the STRING database of protein-protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data...

  15. Computational design of protein interactions: designing proteins that neutralize influenza by inhibiting its hemagglutinin surface protein

    Science.gov (United States)

    Fleishman, Sarel

    2012-02-01

    Molecular recognition underlies all life processes. Design of interactions not seen in nature is a test of our understanding of molecular recognition and could unlock the vast potential of subtle control over molecular interaction networks, allowing the design of novel diagnostics and therapeutics for basic and applied research. We developed the first general method for designing protein interactions. The method starts by computing a region of high affinity interactions between dismembered amino acid residues and the target surface and then identifying proteins that can harbor these residues. Designs are tested experimentally for binding the target surface and successful ones are affinity matured using yeast cell surface display. Applied to the conserved stem region of influenza hemagglutinin we designed two unrelated proteins that, following affinity maturation, bound hemagglutinin at subnanomolar dissociation constants. Co-crystal structures of hemagglutinin bound to the two designed binders were within 1 Å RMSD of their models, validating the accuracy of the design strategy. One of the designed proteins inhibits the conformational changes that underlie hemagglutinin's cell-invasion functions and blocks virus infectivity in cell culture, suggesting that such proteins may in future serve as diagnostics and antivirals against a wide range of pathogenic influenza strains. We have used this method to obtain experimentally validated binders of several other target proteins, demonstrating the generality of the approach. We discuss the combination of modeling and high-throughput characterization of design variants which has been key to the success of this approach, as well as how we have used the data obtained in this project to enhance our understanding of molecular recognition. References: Science 332:816; JMB, in press; Protein Sci 20:753

  16. Enhancing the Functional Content of Eukaryotic Protein Interaction Networks

    Science.gov (United States)

    Pandey, Gaurav; Arora, Sonali; Manocha, Sahil; Whalen, Sean

    2014-01-01

    Protein interaction networks are a promising type of data for studying complex biological systems. However, despite the rich information embedded in them, these networks face important data quality challenges of noise and incompleteness that adversely affect the results obtained from their analysis. Here, we apply a robust measure of local network structure called common neighborhood similarity (CNS) to address these challenges. Although several CNS measures have been proposed in the literature, an understanding of their relative efficacies for the analysis of interaction networks has been lacking. We follow the framework of graph transformation to convert the given interaction network into a transformed network corresponding to each of a variety of CNS measures evaluated. The effectiveness of each measure is then estimated by comparing the quality of protein function predictions obtained from its corresponding transformed network with those from the original network. Using a large set of human and fly protein interactions, and a set of over GO terms for both, we find that several of the transformed networks produce more accurate predictions than those obtained from the original network. In particular, the measure and other continuous CNS measures perform well on this task, especially for large networks. Further investigation reveals that the two major factors contributing to this improvement are the abilities of CNS measures to prune out noisy edges and enhance functional coherence in the transformed networks. PMID:25275489
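
    As an illustration of the common neighborhood similarity idea in this record, the following minimal sketch computes one simple CNS measure, the Jaccard coefficient of two proteins' neighbor sets, and uses it to re-weight the edges of a toy network. The choice of Jaccard, the function names, the threshold, and the toy edge list are assumptions made for illustration; the record does not state which measures the authors evaluated.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Return a dict mapping each protein to the set of its neighbors."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def jaccard_cns(adj, u, v):
    """Jaccard similarity of the neighborhoods of u and v (0.0 if both are empty)."""
    # Exclude u and v themselves so the direct edge does not count as shared.
    nu, nv = adj[u] - {v}, adj[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def transform_network(edges, threshold=0.0):
    """Re-weight each original edge by its CNS score; drop edges at or below threshold."""
    adj = build_adjacency(edges)
    weighted = {}
    for a, b in edges:
        score = jaccard_cns(adj, a, b)
        if score > threshold:
            weighted[(a, b)] = score
    return weighted

# Hypothetical toy network: A, B, C form a triangle; D hangs off C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
transformed = transform_network(toy_edges)
```

    In this toy case the pendant edge C-D has no shared neighbors and is dropped, while the triangle edges are kept with scores between 0.5 and 1.0, which mirrors the pruning of weakly supported edges described in the abstract.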

  17. Small sets of interacting proteins suggest functional linkage mechanisms via Bayesian analogical reasoning.

    Science.gov (United States)

    Airoldi, Edoardo M; Heller, Katherine A; Silva, Ricardo

    2011-07-01

    Proteins and protein complexes coordinate their activity to execute cellular functions. In a number of experimental settings, including synthetic genetic arrays, genetic perturbations and RNAi screens, scientists identify a small set of protein interactions of interest. A working hypothesis is often that these interactions are the observable phenotypes of some functional process, which is not directly observable. Confirmatory analysis requires finding other pairs of proteins whose interaction may be additional phenotypical evidence about the same functional process. Extant methods for finding additional protein interactions rely heavily on the information in the newly identified set of interactions. For instance, these methods leverage the attributes of the individual proteins directly, in a supervised setting, in order to find relevant protein pairs. A small set of protein interactions provides a small sample to train parameters of prediction methods, thus leading to low confidence. We develop RBSets, a computational approach to ranking protein interactions rooted in analogical reasoning; that is, the ability to learn and generalize relations between objects. Our approach is tailored to situations where the training set of protein interactions is small, and leverages the attributes of the individual proteins indirectly, in a Bayesian ranking setting that is perhaps closest to propensity scoring in mathematical psychology. We find that RBSets leads to good performance in identifying additional interactions starting from a small evidence set of interacting proteins, for which an underlying biological logic in terms of functional processes and signaling pathways can be established with some confidence. Our approach is scalable and can be applied to large databases with minimal computational overhead. Our results suggest that analogical reasoning within a Bayesian ranking problem is a promising new approach for real-time biological discovery. 
Java code is available at

  18. Hydrophobic Interaction Chromatography for Bottom-Up Proteomics Analysis of Single Proteins and Protein Complexes.

    Science.gov (United States)

    Rackiewicz, Michal; Große-Hovest, Ludger; Alpert, Andrew J; Zarei, Mostafa; Dengjel, Jörn

    2017-06-02

    Hydrophobic interaction chromatography (HIC) is a robust standard analytical method to purify proteins while preserving their biological activity. It is widely used to study post-translational modifications of proteins and drug-protein interactions. In the current manuscript we employed HIC to separate proteins, followed by bottom-up LC-MS/MS experiments. We used this approach to fractionate antibody species followed by comprehensive peptide mapping as well as to study protein complexes in human cells. HIC-reversed-phase chromatography (RPC)-mass spectrometry (MS) is a powerful alternative to fractionate proteins for bottom-up proteomics experiments making use of their distinct hydrophobic properties.

  19. Rechecking the Centrality-Lethality Rule in the Scope of Protein Subcellular Localization Interaction Networks.

    Directory of Open Access Journals (Sweden)

    Xiaoqing Peng

    Full Text Available Essential proteins are indispensable for living organisms to maintain life activities and play important roles in the studies of pathology, synthetic biology, and drug design. Therefore, besides experimental methods, many computational methods have been proposed to identify essential proteins. Based on the centrality-lethality rule, various centrality methods are employed to predict essential proteins in a Protein-protein Interaction Network (PIN). However, because they neglect the temporal and spatial features of protein-protein interactions, the centrality scores calculated by centrality methods are not effective enough for measuring the essentiality of proteins in a PIN. Moreover, many methods, which overfit the features of essential proteins for one species, may perform poorly for other species. In this paper, we demonstrate that the centrality-lethality rule also exists in Protein Subcellular Localization Interaction Networks (PSLINs). To do this, a method based on Localization Specificity for Essential protein Detection (LSED) was proposed, which can be combined with any centrality method for calculating the improved centrality scores by taking into consideration PSLINs in which proteins play their roles. In this study, LSED was combined with eight centrality methods separately to calculate Localization-specific Centrality Scores (LCSs) for proteins based on the PSLINs of four species (Saccharomyces cerevisiae, Homo sapiens, Mus musculus and Drosophila melanogaster). Compared to the proteins with high centrality scores measured from the global PINs, more proteins with high LCSs measured from PSLINs are essential. This indicates that proteins with high LCSs measured from PSLINs are more likely to be essential and that the performance of centrality methods can be improved by LSED. Furthermore, LSED provides a widely applicable prediction model to identify essential proteins for different species.
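
    The idea of scoring a protein within the subcellular-localization subnetworks in which it acts can be sketched as follows. This is a rough illustration under stated assumptions, not the LSED algorithm itself: it uses degree centrality only, builds each compartment subnetwork from interactions whose two proteins share that compartment, and takes a protein's highest per-compartment degree as its score. The function names, the compartment annotations, and the max-aggregation rule are all hypothetical.

```python
def compartment_subnetworks(edges, localization):
    """Group interactions by each compartment shared by both interacting proteins."""
    subnets = {}
    for a, b in edges:
        shared = localization.get(a, set()) & localization.get(b, set())
        for comp in shared:
            subnets.setdefault(comp, []).append((a, b))
    return subnets

def degree(edges):
    """Degree centrality (raw neighbor count) for every protein in an edge list."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {p: len(n) for p, n in neighbors.items()}

def localization_specific_scores(edges, localization):
    """Assumed aggregation: a protein's best degree over its compartment subnetworks."""
    scores = {}
    for comp_edges in compartment_subnetworks(edges, localization).values():
        for protein, deg in degree(comp_edges).items():
            scores[protein] = max(scores.get(protein, 0), deg)
    return scores

# Hypothetical toy data, not taken from the study.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
localization = {
    "P1": {"nucleus", "cytoplasm"},
    "P2": {"nucleus"},
    "P3": {"nucleus"},
    "P4": {"membrane"},
}
global_degree = degree(edges)
local_scores = localization_specific_scores(edges, localization)
```

    Here the P1-P4 interaction is ignored because the two proteins share no annotated compartment, so P1's score falls from a global degree of 3 to a localization-specific degree of 2.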

  20. Prediction of Effective Drug Combinations by Chemical Interaction, Protein Interaction and Target Enrichment of KEGG Pathways

    Directory of Open Access Journals (Sweden)

    Lei Chen

    2013-01-01

    Full Text Available Drug combinatorial therapy could be more effective in treating some complex diseases than single agents due to better efficacy and reduced side effects. Although some drug combinations are being used, their underlying molecular mechanisms are still poorly understood. Therefore, it is of great interest to deduce novel drug combinations from their molecular mechanisms in a robust and rigorous way. This paper attempts to predict effective drug combinations by a combined consideration of: (1) chemical interaction between drugs, (2) protein interactions between drugs’ targets, and (3) target enrichment of KEGG pathways. A benchmark dataset was constructed, consisting of 121 confirmed effective combinations and 605 random combinations. Each drug combination was represented by 465 features derived from the aforementioned three properties. Feature selection techniques, including Minimum Redundancy Maximum Relevance and Incremental Feature Selection, were adopted to extract the key features. A random forest model was built and its performance evaluated by 5-fold cross-validation. As a result, 55 key features providing the best prediction result were selected. These important features may help to gain insights into the mechanisms of drug combinations, and the proposed prediction model could become a useful tool for screening possible drug combinations.
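
    The evaluation set-up in this record, 5-fold cross-validation on 121 effective and 605 random combinations, can be sketched with a stratified split that keeps the 1:5 class ratio roughly constant across folds. The sketch below only produces the fold assignments; the helper name, the round-robin assignment, and the fixed seed are assumptions, and the record does not say whether the authors stratified their folds.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing every class across folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal shuffled indices round-robin so each fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Labels mirroring the benchmark sizes: 121 effective (1) and 605 random (0) combinations.
labels = [1] * 121 + [0] * 605
folds = stratified_folds(labels)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
fold_sizes = [len(fold) for fold in folds]
```

    With these sizes each fold holds 24 or 25 effective combinations and 121 random ones, so every held-out fold sees both classes in about the same proportion as the full benchmark.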

  1. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    Directory of Open Access Journals (Sweden)

    Fábio R de Moraes

    Full Text Available Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study

  2. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    Science.gov (United States)

    de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  3. Improving Predictions of Protein-Protein Interfaces by Combining Amino Acid-Specific Classifiers Based on Structural and Physicochemical Descriptors with Their Weighted Neighbor Averages

    Science.gov (United States)

    de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now
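
    The ROC comparison against a random classifier mentioned in this record can be made concrete with the area under the ROC curve (AUC), which equals the probability that a randomly chosen interface-forming residue scores higher than a randomly chosen free surface residue; a random classifier scores 0.5. The sketch below computes AUC by direct pair counting. The scores and labels are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores: 1 = interface-forming residue, 0 = free surface residue.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

    Pair counting is quadratic in the number of residues, which is fine for a small illustration; a rank-based formula gives the same value more efficiently on large sets.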

  4. The coat protein complex II, COPII, protein Sec13 directly interacts with presenilin-1

    International Nuclear Information System (INIS)

    Nielsen, Anders Lade

    2009-01-01

    Mutations in the human gene encoding presenilin-1, PS1, account for most cases of early-onset familial Alzheimer's disease. PS1 has nine transmembrane domains and a large loop orientated towards the cytoplasm. PS1 locates to cellular compartments as endoplasmic reticulum (ER), Golgi apparatus, vesicular structures, and plasma membrane, and is an integral member of γ-secretase, a protein protease complex with specificity for intra-membranous cleavage of substrates such as β-amyloid precursor protein. Here, an interaction between PS1 and the Sec13 protein is described. Sec13 takes part in coat protein complex II, COPII, vesicular trafficking, nuclear pore function, and ER directed protein sequestering and degradation control. The interaction maps to the N-terminal part of the large hydrophilic PS1 loop and the first of the six WD40-repeats present in Sec13. The identified Sec13 interaction to PS1 is a new candidate interaction for linking PS1 to secretory and protein degrading vesicular circuits.

  5. The coat protein complex II, COPII, protein Sec13 directly interacts with presenilin-1

    Energy Technology Data Exchange (ETDEWEB)

    Nielsen, Anders Lade, E-mail: aln@humgen.au.dk [Department of Human Genetics, The Bartholin Building, University of Aarhus, DK-8000 Aarhus C (Denmark)

    2009-10-23

    Mutations in the human gene encoding presenilin-1, PS1, account for most cases of early-onset familial Alzheimer's disease. PS1 has nine transmembrane domains and a large loop orientated towards the cytoplasm. PS1 locates to cellular compartments as endoplasmic reticulum (ER), Golgi apparatus, vesicular structures, and plasma membrane, and is an integral member of γ-secretase, a protein protease complex with specificity for intra-membranous cleavage of substrates such as β-amyloid precursor protein. Here, an interaction between PS1 and the Sec13 protein is described. Sec13 takes part in coat protein complex II, COPII, vesicular trafficking, nuclear pore function, and ER directed protein sequestering and degradation control. The interaction maps to the N-terminal part of the large hydrophilic PS1 loop and the first of the six WD40-repeats present in Sec13. The identified Sec13 interaction to PS1 is a new candidate interaction for linking PS1 to secretory and protein degrading vesicular circuits.

  6. Towards a better understanding of the specificity of protein-protein interaction

    Czech Academy of Sciences Publication Activity Database

    Kysilka, Jiří; Vondrášek, Jiří

    2012-01-01

    Vol. 25, No. 11 (2012), pp. 604-615 ISSN 0952-3499 R&D Projects: GA ČR GAP208/10/0725; GA ČR GAP302/10/0427; GA MŠk(CZ) LH11020 Institutional research plan: CEZ:AV0Z40550506; CEZ:AV0Z50520701 Keywords: protein-protein interaction * molecular recognition * x-ray structure analysis * empirical potentials * side chain-side chain interaction * interaction energy * bioinformatics Subject RIV: CE - Biochemistry Impact factor: 3.006, year: 2012

  7. Topology of membrane proteins-predictions, limitations and variations.

    Science.gov (United States)

    Tsirigos, Konstantinos D; Govindarajan, Sudha; Bassot, Claudio; Västermark, Åke; Lamb, John; Shu, Nanjiang; Elofsson, Arne

    2017-10-26

    Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Studying Protein-Protein Interactions by Biotin AP-Tagged Pulldown and LTQ-Orbitrap Mass Spectrometry.

    Science.gov (United States)

    Xie, Zhongqiu; Jia, Yuemeng; Li, Hui

    2017-01-01

    The study of protein-protein interactions represents a key aspect of biological research. Identifying unknown protein binding partners using mass spectrometry (MS)-based proteomics has evolved into an indispensable strategy in drug discovery. The classic approach of immunoprecipitation with specific antibodies against the proteins of interest has limitations, such as the need for immunoprecipitation-qualified antibody. The biotin AP-tag pull-down system has the advantage of high specificity, ease of use, and no requirement for antibody. It is based on the high specificity, high affinity interaction between biotin and streptavidin. After pulldown, in-gel tryptic digestion and tandem mass spectrometry (MS/MS) analysis of sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) protein bands can be performed. In this work, we provide protocols that can be used for the identification of proteins that interact with FOXM1, a protein that has recently emerged as a potential biomarker and drug target in oncotherapy, as an example. We focus on the pull-down procedure and assess the efficacy of the pulldown with known FOXM1 interactors such as β-catenin. We use a high performance LTQ Orbitrap MSn system that combines rapid LTQ ion trap data acquisition with high mass accuracy Orbitrap analysis to identify the interacting proteins.

  9. Wiki-pi: a web-server of annotated human protein-protein interactions to aid in discovery of protein function.

    Directory of Open Access Journals (Sweden)

    Naoki Orii

    Full Text Available Protein-protein interactions (PPIs) are the basis of biological functions. Knowledge of the interactions of a protein can help understand its molecular function and its association with different biological processes and pathways. Several publicly available databases provide comprehensive information about individual proteins, such as their sequence, structure, and function. There also exist databases that are built exclusively to provide PPIs by curating them from published literature. The information provided in these web resources is protein-centric, and not PPI-centric. The PPIs are typically provided as lists of interactions of a given gene with links to interacting partners; they do not present a comprehensive view of the nature of both the proteins involved in the interactions. A web database that allows search and retrieval based on biomedical characteristics of PPIs is lacking, and is needed. We present Wiki-Pi (read Wiki-π), a web-based interface to a database of human PPIs, which allows users to retrieve interactions by their biomedical attributes such as their association to diseases, pathways, drugs and biological functions. Each retrieved PPI is shown with annotations of both of the participant proteins side-by-side, creating a basis to hypothesize the biological function facilitated by the interaction. Conceptually, it is a search engine for PPIs analogous to PubMed for scientific literature. Its usefulness in generating novel scientific hypotheses is demonstrated through the study of IGSF21, a little-known gene that was recently identified to be associated with diabetic retinopathy. Using Wiki-Pi, we infer that its association to diabetic retinopathy may be mediated through its interactions with the genes HSPB1, KRAS, TMSB4X and DGKD, and that it may be involved in cellular response to external stimuli, cytoskeletal organization and regulation of molecular activity. The website also provides a wiki-like capability allowing users

  10. Yeast Interacting Proteins Database: YNL258C, YKR022C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YNL258C DSL1 Peripheral membrane protein required for Golgi-to-ER retrograde traffi...equired for Golgi-to-ER retrograde traffic; component of the ER target site that interacts with coatomer, th...it ORF YNL258C Bait gene name DSL1 Bait description Peripheral membrane protein r

  11. Determining effects of non-synonymous SNPs on protein-protein interactions using supervised and semi-supervised learning.

    Directory of Open Access Journals (Sweden)

    Nan Zhao

    2014-05-01

    Full Text Available Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. The accurate and balanced performance of SNP-IN tool makes it readily available to study the

  12. Determining Effects of Non-synonymous SNPs on Protein-Protein Interactions using Supervised and Semi-supervised Learning

    Science.gov (United States)

    Zhao, Nan; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. The accurate and balanced performance of SNP-IN tool makes it readily available to study the rewiring of
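
    The weighted f-measure reported in this record averages the per-class F1 scores, weighting each class by its number of true instances. A minimal sketch for the 3-class case follows; the label names echo the abstract, but the example predictions are invented for illustration, and the assumption that the authors used exactly this support weighting is not confirmed by the record.

```python
from collections import Counter

def weighted_f_measure(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for cls, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = count - tp
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit.
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        result += (count / total) * f1
    return result

# Hypothetical labels for the 3-class problem (not data from the study).
y_true = ["detrimental", "detrimental", "neutral", "neutral", "neutral", "beneficial"]
y_pred = ["detrimental", "neutral", "neutral", "neutral", "detrimental", "beneficial"]
score = weighted_f_measure(y_true, y_pred)
```

    Because the weights follow class frequency, a classifier that does well only on the majority class can still score highly, which is one reason the abstract's emphasis on balanced performance matters.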

  13. Cell penetrating peptides to dissect host-pathogen protein-protein interactions in Theileria-transformed leukocytes

    KAUST Repository

    Haidar, Malak

    2017-09-08

    One powerful application of cell penetrating peptides is the delivery into cells of molecules that function as specific competitors or inhibitors of protein-protein interactions. Ablating defined protein-protein interactions is a refined way to explore their contribution to a particular cellular phenotype in a given disease context. Cell-penetrating peptides can be synthetically constrained through various chemical modifications that stabilize a given structural fold with the potential to improve competitive binding to specific targets. Theileria-transformed leukocytes display high PKA activity, but PKA is an enzyme that plays key roles in multiple cellular processes; consequently genetic ablation of kinase activity gives rise to a myriad of confounding phenotypes. By contrast, ablation of a specific kinase-substrate interaction has the potential to give more refined information and we illustrate this here by describing how surgically ablating PKA interactions with BAD gives precise information on the type of glycolysis performed by Theileria-transformed leukocytes. In addition, we provide two other examples of how ablating specific protein-protein interactions in Theileria-infected leukocytes leads to precise phenotypes and argue that constrained penetrating peptides have great therapeutic potential to combat infectious diseases in general.

  14. Defining the protein interaction network of human malaria parasite Plasmodium falciparum

    KAUST Repository

    Ramaprasad, Abhinay

    2012-02-01

    Malaria, caused by the protozoan parasite Plasmodium falciparum, affects around 225 million people yearly, and a huge international effort is directed towards combating this grave threat to world health and economic development. Considerable advances have been made in malaria research, triggered by the sequencing of its genome in 2002, followed by several high-throughput studies defining the malaria transcriptome and proteome. A protein-protein interaction (PPI) network seeks to trace the dynamic interactions between proteins, thereby elucidating their local and global functional relationships. Experimentally derived PPI networks from high-throughput methods such as yeast two-hybrid (Y2H) screens are inherently noisy, but combining these independent datasets by computational methods tends to give greater accuracy and coverage. This review aims to discuss the computational approaches used to date to construct a malaria protein interaction network and to catalog the functional predictions and biological inferences made from analysis of the PPI network. © 2011 Elsevier Inc.

  15. CellMap visualizes protein-protein interactions and subcellular localization

    Science.gov (United States)

    Dallago, Christian; Goldberg, Tatyana; Andrade-Navarro, Miguel Angel; Alanis-Lobato, Gregorio; Rost, Burkhard

    2018-01-01

    Many tools visualize protein-protein interaction (PPI) networks. The tool introduced here, CellMap, adds one crucial novelty by visualizing PPI networks in the context of subcellular localization, i.e. the location in the cell or cellular component in which a PPI happens. Users can upload images of cells and define areas of interest against which PPIs for selected proteins are displayed (by default on a cartoon of a cell). Annotations of localization are provided by the user or through our in-house database. The visualizer and server are written in JavaScript, making CellMap easy to customize and to extend by researchers and developers. PMID:29497493

  16. A scalable double-barcode sequencing platform for characterization of dynamic protein-protein interactions.

    Science.gov (United States)

    Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R; St Onge, Robert P; Levy, Sasha F

    2017-05-25

    Several large-scale efforts have systematically catalogued protein-protein interactions (PPIs) of a cell in a single environment. However, little is known about how the protein interactome changes across environmental perturbations. Current technologies, which assay one PPI at a time, are too low throughput to make it practical to study protein interactome dynamics. Here, we develop a highly parallel protein-protein interaction sequencing (PPiSeq) platform that uses a novel double barcoding system in conjunction with the dihydrofolate reductase protein-fragment complementation assay in Saccharomyces cerevisiae. PPiSeq detects PPIs at a rate that is on par with current assays and, in contrast with current methods, quantitatively scores PPIs with enough accuracy and sensitivity to detect changes across environments. Both PPI scoring and the bulk of strain construction can be performed with cell pools, making the assay scalable and easily reproduced across environments. PPiSeq is therefore a powerful new tool for large-scale investigations of dynamic PPIs.

  17. Integrative approaches to the prediction of protein functions based on the feature selection

    Directory of Open Access Journals (Sweden)

    Lee Hyunju

    2009-12-01

    Full Text Available Abstract Background Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including protein domains, protein-protein interactions, gene expression, phenotype ontology, phylogenetic profiles and disease data sources, to predict protein functions that are labelled with Gene Ontology (GO) terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel-based logistic regression (KLR), and a kernel-based L1-norm regularized logistic regression (KL1LR). In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among

  18. Interrogating the architecture of protein assemblies and protein interaction networks by cross-linking mass spectrometry

    NARCIS (Netherlands)

    Liu, Fan; Heck, Albert J R

    2015-01-01

    Proteins are involved in almost all processes of the living cell. They are organized through extensive networks of interaction, by tightly bound macromolecular assemblies or more transiently via signaling nodes. Therefore, revealing the architecture of protein complexes and protein interaction

  19. StaRProtein, A Web Server for Prediction of the Stability of Repeat Proteins

    Science.gov (United States)

    Xu, Yongtao; Zhou, Xu; Huang, Meilan

    2015-01-01

    Repeat proteins have become increasingly important due to their capability to bind to almost any protein and their potential as an alternative therapy to monoclonal antibodies. In the past decade repeat proteins have been designed to mediate specific protein-protein interactions. The tetratricopeptide and ankyrin repeat proteins are two classes of helical repeat proteins that form different binding pockets to accommodate various partners. It is important to understand the factors that define folding and stability of repeat proteins in order to prioritize the most stable designed repeat proteins to further explore their potential binding affinities. Here we developed distance-dependent statistical potentials using two classes of alpha-helical repeat proteins, tetratricopeptide and ankyrin repeat proteins respectively, and evaluated their efficiency in predicting the stability of repeat proteins. We demonstrated that the repeat-specific statistical potentials based on these two classes of repeat proteins showed paramount accuracy compared with non-specific statistical potentials in 1) discriminating correct vs. incorrect models and 2) ranking the stability of designed repeat proteins. In particular, the statistical scores correlate closely with the equilibrium unfolding free energies of repeat proteins and therefore would serve as a novel tool in quickly prioritizing the designed repeat proteins with high stability. The StaRProtein web server was developed for predicting the stability of repeat proteins. PMID:25807112

  20. A Machine Learning Approach for Hot-Spot Detection at Protein-Protein Interfaces

    NARCIS (Netherlands)

    Melo, Rita; Fieldhouse, Robert; Melo, André; Correia, João D G; Cordeiro, Maria Natália D S; Gümüş, Zeynep H; Costa, Joaquim; Bonvin, Alexandre M J J; de Sousa Moreira, Irina

    2016-01-01

    Understanding protein-protein interactions is a key challenge in biochemistry. In this work, we describe a more accurate methodology to predict Hot-Spots (HS) in protein-protein interfaces from their native complex structure compared to previous published Machine Learning (ML) techniques. Our model

  1. Insight into bacterial virulence mechanisms against host immune response via the Yersinia pestis-human protein-protein interaction network.

    Science.gov (United States)

    Yang, Huiying; Ke, Yuehua; Wang, Jian; Tan, Yafang; Myeni, Sebenzile K; Li, Dong; Shi, Qinghai; Yan, Yanfeng; Chen, Hui; Guo, Zhaobiao; Yuan, Yanzhi; Yang, Xiaoming; Yang, Ruifu; Du, Zongmin

    2011-11-01

    A Yersinia pestis-human protein interaction network is reported here to improve our understanding of its pathogenesis. Up to 204 interactions between 66 Y. pestis bait proteins and 109 human proteins were identified by yeast two-hybrid assay and then combined with 23 previously published interactions to construct a protein-protein interaction network. Topological analysis of the interaction network revealed that human proteins targeted by Y. pestis were significantly enriched in the proteins that are central in the human protein-protein interaction network. Analysis of this network showed that signaling pathways important for host immune responses were preferentially targeted by Y. pestis, including the pathways involved in focal adhesion, regulation of cytoskeleton, leukocyte transendoepithelial migration, and Toll-like receptor (TLR) and mitogen-activated protein kinase (MAPK) signaling. Cellular pathways targeted by Y. pestis are highly relevant to its pathogenesis. Interactions with host proteins involved in focal adhesion and cytoskeleton regulation pathways could account for resistance of Y. pestis to phagocytosis. Interference with TLR and MAPK signaling pathways by Y. pestis reflects common characteristics of pathogen-host interaction that bacterial pathogens have evolved to evade host innate immune response by interacting with proteins in those signaling pathways. Interestingly, a large portion of human proteins interacting with Y. pestis (16/109) also interacted with viral proteins (Epstein-Barr virus [EBV] and hepatitis C virus [HCV]), suggesting that viral and bacterial pathogens attack common cellular functions to facilitate infections. In addition, we identified vasodilator-stimulated phosphoprotein (VASP) as a novel interaction partner of YpkA and showed that YpkA could inhibit in vitro actin assembly mediated by VASP.

  2. Towards a rigorous network of protein-protein interactions of the model sulfate reducer Desulfovibrio vulgaris Hildenborough.

    Directory of Open Access Journals (Sweden)

    Swapnil R Chhabra

    Full Text Available Protein-protein interactions offer an insight into cellular processes beyond what may be obtained by the quantitative functional genomics tools of proteomics and transcriptomics. The aforementioned tools have been extensively applied to study Escherichia coli and other aerobes and more recently to study the stress response behavior of Desulfovibrio vulgaris Hildenborough, a model obligate anaerobe and sulfate reducer and the subject of this study. Here we carried out affinity purification followed by mass spectrometry to reconstruct an interaction network among 12 chromosomally encoded bait and 90 prey proteins based on 134 bait-prey interactions identified to be of high confidence. Protein-protein interaction data are often plagued by the lack of adequate controls and replication analyses necessary to assess confidence in the results, including identification of potential false positives. We addressed these issues through the use of biological replication, exponentially modified protein abundance indices, results from an experimental negative control, and a statistical test to assign confidence to each putative interacting pair applicable to small interaction data studies. We discuss the biological significance of metabolic features of D. vulgaris revealed by these protein-protein interaction data and the observed protein modifications. These include the distinct role of the putative carbon monoxide-induced hydrogenase, unique electron transfer routes associated with different oxidoreductases, and the possible role of methylation in regulating sulfate reduction.

  3. Femtosecond UV-laser pulses to unveil protein-protein interactions in living cells.

    Science.gov (United States)

    Itri, Francesco; Monti, Daria M; Della Ventura, Bartolomeo; Vinciguerra, Roberto; Chino, Marco; Gesuele, Felice; Lombardi, Angelina; Velotta, Raffaele; Altucci, Carlo; Birolo, Leila; Piccoli, Renata; Arciello, Angela

    2016-02-01

    A hallmark to decipher bioprocesses is to characterize protein-protein interactions in living cells. To do this, the development of innovative methodologies, which do not alter proteins and their natural environment, is particularly needed. Here, we report a method (LUCK, Laser UV Cross-linKing) to in vivo cross-link proteins by UV-laser irradiation of living cells. Upon irradiation of HeLa cells under controlled conditions, cross-linked products of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) were detected, whose yield was found to be a linear function of the total irradiation energy. We demonstrated that stable dimers of GAPDH were formed through intersubunit cross-linking, as also observed when the pure protein was irradiated by UV-laser in vitro. We proposed a defined patch of aromatic residues located at the enzyme subunit interface as the cross-linking sites involved in dimer formation. Hence, by this technique, UV-laser is able to photofix protein surfaces that come in direct contact. Due to the ultra-short time scale of UV-laser-induced cross-linking, this technique could be extended to weld even transient protein interactions in their native context.

  4. Selection of organisms for the co-evolution-based study of protein interactions.

    Science.gov (United States)

    Herman, Dorota; Ochoa, David; Juan, David; Lopez, Daniel; Valencia, Alfonso; Pazos, Florencio

    2011-09-12

    The prediction and study of protein interactions and functional relationships based on similarity of phylogenetic trees, exemplified by the mirrortree and related methodologies, is being widely used. Although dependence between the performance of these methods and the set of organisms used to build the trees was suspected, so far nobody has assessed it in an exhaustive way, and, in general, previous works used as many organisms as possible. In this work we assess the effect of using different sets of organisms (chosen according to various phylogenetic criteria) on the performance of this methodology in detecting protein interactions of different nature. We show that the performance of three mirrortree-related methodologies depends on the set of organisms used for building the trees, and it is not always directly related to the number of organisms in a simple way. Certain subsets of organisms seem to be more suitable for the predictions of certain types of interactions. This relationship between type of interaction and optimal set of organisms for detecting them makes sense in the light of the phylogenetic distribution of the organisms and the nature of the interactions. In order to obtain an optimal performance when predicting protein interactions, it is recommended to use different sets of organisms depending on the available computational resources and data, as well as the type of interactions of interest.
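
    As a rough illustration of the idea behind mirrortree-style scoring, the sketch below correlates the pairwise evolutionary distances of two protein families over a chosen set of organisms, so that changing the organism set changes the score. It is a minimal sketch, not the authors' implementation: the function names, the organism labels, and the distance values are hypothetical, and a plain Pearson correlation is assumed as the similarity measure.

```python
import math
from itertools import combinations


def pair_vector(dist, organisms):
    """Flatten pairwise distances for the given organisms into a list.

    `dist` maps frozenset({org_a, org_b}) to a distance (hypothetical layout).
    """
    return [dist[frozenset(pair)] for pair in combinations(organisms, 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b, organisms):
    """Correlate the distance matrices of two protein families over `organisms`."""
    return pearson(pair_vector(dist_a, organisms), pair_vector(dist_b, organisms))


# Hypothetical distances for family A between four organisms.
family_a = {
    frozenset(("eco", "bsu")): 0.4, frozenset(("eco", "sce")): 1.2,
    frozenset(("eco", "hsa")): 1.5, frozenset(("bsu", "sce")): 1.3,
    frozenset(("bsu", "hsa")): 1.6, frozenset(("sce", "hsa")): 0.7,
}
# Family B evolves twice as fast but with the same tree shape.
family_b = {pair: 2 * d for pair, d in family_a.items()}

all_organisms = ["eco", "bsu", "sce", "hsa"]
subset = ["eco", "sce", "hsa"]
score_all = tree_similarity(family_a, family_b, all_organisms)
score_subset = tree_similarity(family_a, family_b, subset)
```

    Passing the organism list explicitly makes it easy to re-score the same pair of families on different organism subsets, which is the kind of comparison the abstract describes.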

  5. Refining intra-protein contact prediction by graph analysis

    Directory of Open Access Journals (Sweden)

    Eyal Eran

    2007-05-01

    Full Text Available Abstract Background Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence. Results We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-contact prediction method by means of pair-to-pair substitution matrix (P2PConPred) was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%. Conclusion Using a simple approach we increased the contact prediction accuracy of a basic method by 1.5 times. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.
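
    The edge-scoring step can be pictured with a small sketch. The abstract does not give the exact formula used by GARP, so the code below assumes one common form of a mutual clustering coefficient, the Jaccard index of the two end nodes' neighbourhoods; the function names and the toy contact list are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (residue_i, residue_j) contact pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def mutual_clustering(adjacency, u, v):
    """Jaccard-style mutual clustering coefficient of the edge (u, v).

    Assumption: shared neighbours divided by all neighbours of either end,
    with the two end nodes themselves excluded.
    """
    neighbours_u = adjacency.get(u, set()) - {v}
    neighbours_v = adjacency.get(v, set()) - {u}
    union = neighbours_u | neighbours_v
    if not union:
        return 0.0
    return len(neighbours_u & neighbours_v) / len(union)


# Toy predicted contacts: residues 1, 2, 3 form a triangle; 4 hangs off 3.
contacts = [(1, 2), (1, 3), (2, 3), (3, 4)]
adjacency = build_adjacency(contacts)
scores = {edge: mutual_clustering(adjacency, *edge) for edge in contacts}
```

    Edges inside the densely connected triangle score higher than the isolated edge (3, 4), which is the behaviour one would want when using the score to keep contacts supported by their neighbourhood.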

  6. Interaction of a non-histone chromatin protein (high-mobility group protein 2) with DNA

    International Nuclear Information System (INIS)

    Goodwin, G.H.; Shooter, K.V.; Johns, E.W.

    1975-01-01

    The interaction with DNA of the calf thymus chromatin non-histone protein termed the high-mobility group protein 2 has been studied by sedimentation analysis in the ultracentrifuge and by measuring the binding of the 125I-labelled protein to DNA. The results have been compared with those obtained previously by us [Eur. J. Biochem. (1974) 47, 263-270] for the interaction of high-mobility group protein 1 with DNA. Although the binding parameters are similar for these two proteins, high-mobility group protein 2 differs from high-mobility group protein 1 in that the former appears to change the shape of the DNA to a more compact form. The molecular weight of high-mobility group protein 2 has been determined by equilibrium sedimentation and a mean value of 26,000 was obtained. A low level of nuclease activity detected in one preparation of high-mobility group protein 2 has been investigated.

  7. Core Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available y are in the reverse direction. *1 A comprehensive two-hybrid analysis to explore the yeast protein interact...s. 2000 Jan 1;28(1):73-6. *2 The yeast proteome database (YPD) and Caenorhabditis elegans proteome database (WormPD): comprehensive...000 Jan 1;28(1):73-6. *3 A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisia

  8. Analysis of Protein-RNA and Protein-Peptide Interactions in Equine Infectious Anemia

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jae-Hyung [Iowa State Univ., Ames, IA (United States)

    2007-01-01

    Macromolecular interactions are essential for virtually all cellular functions including signal transduction processes, metabolic processes, regulation of gene expression and immune responses. This dissertation focuses on the characterization of two important macromolecular interactions involved in the relationship between Equine Infectious Anemia Virus (EIAV) and its host cell in horse: (1) the interaction between the EIAV Rev protein and its binding site, the Rev-responsive element (RRE) and (2) interactions between equine MHC class I molecules and epitope peptides derived from EIAV proteins. EIAV, one of the most divergent members of the lentivirus family, has a single-stranded RNA genome and carries several regulatory and structural proteins within its viral particle. Rev is an essential EIAV-encoded regulatory protein that interacts with the viral RRE, a specific binding site in the viral mRNA. Using a combination of experimental and computational methods, the interactions between EIAV Rev and RRE were characterized in detail. EIAV Rev was shown to have a bipartite RNA binding domain containing two arginine-rich motifs (ARMs). The RRE secondary structure was determined and specific structural motifs that act as cis-regulatory elements for EIAV Rev-RRE interaction were identified. Interestingly, a structural motif located in the high-affinity Rev binding site is well conserved in several diverse lentiviral genomes, including HIV-1. Macromolecular interactions involved in the immune response of the horse to EIAV infection were investigated by analyzing complexes between MHC class I proteins and epitope peptides derived from EIAV Rev, Env and Gag proteins. Computational modeling results provided a mechanistic explanation for the experimental finding that a single amino acid change in the peptide binding domain of the equine MHC class I molecule differentially affects the recognition of specific epitopes by EIAV-specific CTL. Together, the findings in this

  9. Integrative Identification of Arabidopsis Mitochondrial Proteome and Its Function Exploitation through Protein Interaction Network

    Science.gov (United States)

    Cui, Jian; Liu, Jinghua; Li, Yuhua; Shi, Tieliu

    2011-01-01

    Mitochondria are major players in the production of energy, and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown even for the model plant Arabidopsis. We reported a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false positive rate (FPR). Together with the experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored those proteins with unknown functions based on the protein-protein interaction network (PIN) and annotated novel functions for 26.65% of CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses, such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome. PMID:21297957

  10. Protein-surface interactions on stimuli-responsive polymeric biomaterials.

    Science.gov (United States)

    Cross, Michael C; Toomey, Ryan G; Gallant, Nathan D

    2016-03-04

    Responsive surfaces: a review of the dependence of protein adsorption on the reversible volume phase transition in stimuli-responsive polymers. Specifically addressed are a widely studied subset: thermoresponsive polymers. Findings are also generalizable to other materials which undergo a similarly reversible volume phase transition. As of 2015, over 100,000 articles have been published on stimuli-responsive polymers and many more on protein-biomaterial interactions. Significantly, fewer than 100 of these have focused specifically on protein interactions with stimuli-responsive polymers. These report a clear trend of increased protein adsorption in the collapsed state compared to the swollen state. This control over protein interactions makes stimuli-responsive polymers highly useful in biomedical applications such as wound repair scaffolds, on-demand drug delivery, and antifouling surfaces. Outstanding questions are whether the protein adsorption is reversible with the volume phase transition and whether there is a time-dependence. A clear understanding of protein interactions with stimuli-responsive polymers will advance theoretical models, experimental results, and biomedical applications.

  11. Distinct Mechanism Evolved for Mycobacterial RNA Polymerase and Topoisomerase I Protein-Protein Interaction.

    Science.gov (United States)

    Banda, Srikanth; Cao, Nan; Tse-Dinh, Yuk-Ching

    2017-09-15

    We report here a distinct mechanism of interaction between topoisomerase I and RNA polymerase in Mycobacterium tuberculosis and Mycobacterium smegmatis that has evolved independently from the previously characterized interaction between bacterial topoisomerase I and RNA polymerase. Bacterial DNA topoisomerase I is responsible for preventing the hyper-negative supercoiling of genomic DNA. The association of topoisomerase I with RNA polymerase during transcription elongation could efficiently relieve transcription-driven negative supercoiling. Our results demonstrate a direct physical interaction between the C-terminal domains of topoisomerase I (TopoI-CTDs) and the β' subunit of RNA polymerase of M. smegmatis in the absence of DNA. The TopoI-CTDs in mycobacteria are evolutionarily unrelated in amino acid sequence and three-dimensional structure to the TopoI-CTD found in the majority of bacterial species outside Actinobacteria, including Escherichia coli. The functional interaction between topoisomerase I and RNA polymerase has evolved independently in mycobacteria and E. coli, with distinctively different structural elements of TopoI-CTD utilized for this protein-protein interaction. Zinc ribbon motifs in E. coli TopoI-CTD are involved in the interaction with RNA polymerase. For M. smegmatis TopoI-CTD, a 27-amino-acid tail that is rich in basic residues at the C-terminal end is responsible for the interaction with RNA polymerase. Overexpression of recombinant TopoI-CTD in M. smegmatis competed with the endogenous topoisomerase I for protein-protein interactions with RNA polymerase. The TopoI-CTD overexpression resulted in decreased survival following treatment with antibiotics and hydrogen peroxide, supporting the importance of the protein-protein interaction between topoisomerase I and RNA polymerase during stress response of mycobacteria. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. In vivo interactions between the proteins of infectious bursal disease virus: capsid protein VP3 interacts with the RNA dependent polymerase VP1

    NARCIS (Netherlands)

    Tacken, M.G.J.; Rottier, P.J.M.; Gielkens, A.L.J.; Peeters, B.P.H.

    2000-01-01

    Little is known about the intermolecular interactions between the viral proteins of infectious bursal disease virus (IBDV). By using the yeast two-hybrid system, which allows the detection of protein-protein interactions in vivo, all possible interactions were tested by fusing the viral proteins to

  13. Interactions in vivo between the proteins of infectious bursal disease virus: capsid protein VP3 interacts with the RNA-dependent polymerase, VP1

    NARCIS (Netherlands)

    Tacken, M.G.J.; Rottier, P.J.M.; Gielkens, A.L.J.; Peeters, B.P.H.

    2000-01-01

    Little is known about the intermolecular interactions between the viral proteins of infectious bursal disease virus (IBDV). By using the yeast two-hybrid system, which allows the detection of protein-protein interactions in vivo, all possible interactions were tested by fusing the viral proteins to

  14. Protein interactions in genome maintenance as novel antibacterial targets.

    Directory of Open Access Journals (Sweden)

    Aimee H Marceau

    Full Text Available Antibacterial compounds typically act by directly inhibiting essential bacterial enzyme activities. Although this general mechanism of action has fueled traditional antibiotic discovery efforts for decades, new antibiotic development has not kept pace with the emergence of drug resistant bacterial strains. These limitations have severely restricted the therapeutic tools available for treating bacterial infections. Here we test an alternative antibacterial lead-compound identification strategy in which essential protein-protein interactions are targeted rather than enzymatic activities. Bacterial single-stranded DNA-binding proteins (SSBs) form conserved protein interaction "hubs" that are essential for recruiting many DNA replication, recombination, and repair proteins to SSB/DNA nucleoprotein substrates. Three small molecules that block SSB/protein interactions are shown to have antibacterial activity against diverse bacterial species. Consistent with a model in which the compounds target multiple SSB/protein interactions, treatment of Bacillus subtilis cultures with the compounds leads to rapid inhibition of DNA replication and recombination, and ultimately to cell death. The compounds also have unanticipated effects on protein synthesis that could be due to a previously unknown role for SSB/protein interactions in translation or to off-target effects. Our results highlight the potential of targeting protein-protein interactions, particularly those that mediate genome maintenance, as a powerful approach for identifying new antibacterial compounds.

  15. A proteomics strategy to elucidate functional protein-protein interactions applied to EGF signaling

    DEFF Research Database (Denmark)

    Blagoev, B.; Kratchmarova, I.; Ong, S.E.

    2003-01-01

    Mass spectrometry-based proteomics can reveal protein-protein interactions on a large scale, but it has been difficult to separate background binding from functionally important interactions and still preserve weak binders. To investigate the epidermal growth factor receptor (EGFR) pathway, we em...

  16. The development of an affinity evaluation and prediction system by using protein–protein docking simulations and parameter tuning

    Directory of Open Access Journals (Sweden)

    Koki Tsukamoto

    2009-01-01

    Full Text Available Koki Tsukamoto1, Tatsuya Yoshikawa1,2, Kiyonobu Yokota1, Yuichiro Hourai1, Kazuhiko Fukui1

    1 Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo, Japan; 2 Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Toyonaka, Osaka, Japan

    Abstract: A system was developed to evaluate and predict the interaction between protein pairs by using the widely used shape complementarity search method as the algorithm for docking simulations between the proteins. We used this system, which we call the affinity evaluation and prediction (AEP) system, to evaluate the interaction between 20 protein pairs. The system first executes a “round robin” shape complementarity search of the target protein group, and evaluates the interaction between the complex structures obtained by the search. These complex structures are selected by using a statistical procedure that we developed called ‘grouping’. At a prevalence of 5.0%, our AEP system predicted protein–protein interactions with a 50.0% recall, 55.6% precision, 95.5% accuracy, and an F-measure of 0.526. By optimizing the grouping process, our AEP system successfully predicted 10 protein pairs (among 20 pairs) that were biologically relevant combinations. Our ultimate goal is to construct an affinity database that will provide cell biologists and drug designers with crucial information obtained using our AEP system.

    Keywords: protein–protein interaction, affinity analysis, protein–protein docking, FFT, massive parallel computing
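
    The reported figures are related through the usual confusion-matrix definitions, and the sketch below shows how they fit together. The abstract does not state the underlying counts; the values used here (10 true positives, 8 false positives, 10 false negatives, 372 true negatives out of 400 candidate combinations) are an assumption, chosen only because they reproduce the reported percentages. The function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "prevalence": (tp + fn) / total,   # fraction of truly interacting pairs
        "recall": recall,
        "precision": precision,
        "accuracy": (tp + tn) / total,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Assumed counts (not given in the abstract), consistent with the reported values.
metrics = classification_metrics(tp=10, fp=8, fn=10, tn=372)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Under these assumed counts the high accuracy is driven mostly by the large number of true negatives, which is why recall, precision, and the F-measure are the more informative figures at a 5% prevalence.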

  17. Signatures of pleiotropy, economy and convergent evolution in a domain-resolved map of human-virus protein-protein interaction networks.

    Science.gov (United States)

    Garamszegi, Sara; Franzosa, Eric A; Xia, Yu

    2013-01-01

    A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral-host interactions that are

  18. Next-Generation Sequencing for Binary Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Bernhard Suter

    2015-12-01

    Full Text Available The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein-protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome-wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine.

  19. A protein domain interaction interface database: InterPare

    Directory of Open Access Journals (Sweden)

    Lee Jungsul

    2005-08-01

    Full Text Available Abstract Background Most proteins function by interacting with other molecules. Their interaction interfaces are highly conserved throughout evolution to avoid undesirable interactions that lead to fatal disorders in cells. Rational drug discovery includes computational methods to identify the interaction sites of lead compounds to the target molecules. Identifying and classifying protein interaction interfaces on a large scale can help researchers discover drug targets more efficiently. Description We introduce a large-scale protein domain interaction interface database called InterPare (http://interpare.net). It contains both inter-chain (between chains) interfaces and intra-chain (within chain) interfaces. InterPare uses three methods to detect interfaces: 1) the geometric distance method for checking the distance between atoms that belong to different domains, 2) Accessible Surface Area (ASA), a method for detecting the buried region of a protein that is detached from a solvent when forming multimers or complexes, and 3) the Voronoi diagram, a computational geometry method that uses a mathematical definition of interface regions. InterPare includes visualization tools to display protein interior, surface, and interaction interfaces. It also provides statistics such as the amino acid propensities of queried protein according to its interior, surface, and interface region. The atom coordinates that belong to interface, surface, and interior regions can be downloaded from the website. Conclusion InterPare is an open and public database server for protein interaction interface information. It contains the large-scale interface data for proteins whose 3D-structures are known. As of November 2004, there were 10,583 (geometric distance), 10,431 (ASA), and 11,010 (Voronoi diagram) entries in the Protein Data Bank (PDB) containing interfaces, according to the above three methods. In the case of the geometric distance method, there are 31,620 inter-chain domain
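
    The geometric distance idea in this record can be illustrated with a minimal sketch. It is not InterPare's implementation: the function name, the brute-force pairwise search and the 5.0 angstrom cutoff are all assumptions chosen for illustration, since the abstract does not state the threshold used.

```python
import math


def interface_atoms(domain_a, domain_b, cutoff=5.0):
    """Return indices of atoms in each domain lying within `cutoff` of the other.

    domain_a, domain_b: lists of (x, y, z) coordinates in angstroms.
    cutoff: hypothetical distance threshold; not taken from the abstract.
    """
    hits_a, hits_b = set(), set()
    for i, atom_a in enumerate(domain_a):
        for j, atom_b in enumerate(domain_b):
            if math.dist(atom_a, atom_b) <= cutoff:
                hits_a.add(i)
                hits_b.add(j)
    return sorted(hits_a), sorted(hits_b)


# Toy coordinates: only the first atom of each domain is close to the other domain.
domain_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
domain_b = [(3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(interface_atoms(domain_a, domain_b))
```

    For real structures with thousands of atoms, a spatial index such as a k-d tree would normally replace the quadratic loop.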

  20. HCVpro: Hepatitis C virus protein interaction database

    KAUST Repository

    Kwofie, Samuel K.

    2011-12-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers. In furtherance of these goals, we have developed the hepatitis C virus protein interaction database (HCVpro) by integrating manually verified hepatitis C virus-virus and virus-human protein interactions curated from literature and databases. HCVpro is a comprehensive and integrated HCV-specific knowledgebase housing consolidated information on PPIs, functional genomics and molecular data obtained from a variety of virus databases (VirHostNet, VirusMint, HCVdb and euHCVdb), and from BIND and other relevant biology repositories. HCVpro is further populated with information on hepatocellular carcinoma (HCC) related genes that are mapped onto their encoded cellular proteins. Incorporated proteins have been mapped onto Gene Ontologies, canonical pathways, Online Mendelian Inheritance in Man (OMIM) and extensively cross-referenced to other essential annotations. The database is enriched with exhaustive reviews on structure and functions of HCV proteins, current state of drug and vaccine development and links to recommended journal articles. Users can query the database using specific protein identifiers (IDs), chromosomal locations of a gene, interaction detection methods, indexed PubMed sources as well as HCVpro, BIND and VirusMint IDs. The use of HCVpro is free and the resource can be accessed via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. © 2011 Elsevier B.V.

  1. PPI-IRO: A two-stage method for protein-protein interaction extraction based on interaction relation ontology

    KAUST Repository

    Li, Chuanxi; Chen, Peng; Wang, Rujing; Wang, Xiujie; Su, Yaru; Li, Jinyan

    2014-01-01

    Mining Protein-Protein Interactions (PPIs) from the fast-growing biomedical literature resources has been proven as an effective approach for the identification of biological regulatory networks. This paper presents a novel method based on the idea

  2. (S)Pinning down protein interactions by NMR

    DEFF Research Database (Denmark)

    Teilum, Kaare; Kunze, Micha Ben Achim; Erlendsson, Simon

    2017-01-01

    Protein molecules are highly diverse communication platforms and their interaction repertoire stretches from atoms over small molecules such as sugars and lipids to macromolecules. An important route to understanding molecular communication is to quantitatively describe their interactions...... all types of protein reactions, which can span orders of magnitudes in affinities, reaction rates and lifetimes of states. As the more versatile technique, solution NMR spectroscopy offers a remarkable catalogue of methods that can be successfully applied to the quantitative as well as qualitative...... descriptions of protein interactions. In this review we provide an easy-access approach to NMR for the non-NMR specialist and describe how and when solution state NMR spectroscopy is the method of choice for addressing protein ligand interaction. We describe very briefly the theoretical background...

  3. Yeast Interacting Proteins Database: YER081W, YDR105C [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available YDR105C TMS1 Vacuolar membrane protein of unknown function that is conserved in mammals; predicted to contai...tion that is conserved in mammals; predicted to contain eleven transmembrane heli

  4. Super-resolution imaging and tracking of protein-protein interactions in sub-diffraction cellular space

    Science.gov (United States)

    Liu, Zhen; Xing, Dong; Su, Qian Peter; Zhu, Yun; Zhang, Jiamei; Kong, Xinyu; Xue, Boxin; Wang, Sheng; Sun, Hao; Tao, Yile; Sun, Yujie

    2014-07-01

    Imaging the location and dynamics of individual interacting protein pairs is essential but often difficult because of the fluorescent background from other paired and non-paired molecules, particularly in the sub-diffraction cellular space. Here we develop a new method combining bimolecular fluorescence complementation and photoactivated localization microscopy for super-resolution imaging and single-molecule tracking of specific protein-protein interactions. The method is used to study the interaction of two abundant proteins, MreB and EF-Tu, in Escherichia coli cells. The super-resolution imaging shows interesting distribution and domain sizes of interacting MreB-EF-Tu pairs as a subpopulation of total EF-Tu. The single-molecule tracking of MreB, EF-Tu and MreB-EF-Tu pairs reveals intriguing localization-dependent heterogeneous dynamics and provides valuable insights into the roles of MreB-EF-Tu interactions.

  5. Using the clustered circular layout as an informative method for visualizing protein-protein interaction networks.

    Science.gov (United States)

    Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee

    2010-07-01

    The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, therefore, demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.

  6. Essential multimeric enzymes in kinetoplastid parasites: A host of potentially druggable protein-protein interactions.

    Science.gov (United States)

    Wachsmuth, Leah M; Johnson, Meredith G; Gavenonis, Jason

    2017-06-01

    Parasitic diseases caused by kinetoplastid parasites of the genera Trypanosoma and Leishmania are an urgent public health crisis in the developing world. These closely related species possess a number of multimeric enzymes in highly conserved pathways involved in vital functions, such as redox homeostasis and nucleotide synthesis. Computational alanine scanning of these protein-protein interfaces has revealed a host of potentially ligandable sites on several established and emerging anti-parasitic drug targets. Analysis of interfaces with multiple clustered hotspots has suggested several potentially inhibitable protein-protein interactions that may have been overlooked by previous large-scale analyses focusing solely on secondary structure. These protein-protein interactions provide a promising lead for the development of new peptide and macrocycle inhibitors of these enzymes.

  7. microProtein Prediction Program (miP3) : A Software for Predicting microProteins and Their Target Transcription Factors

    NARCIS (Netherlands)

    de Klein, Niek; Magnani, Enrico; Banf, Michael; Rhee, Seung Yon

    2015-01-01

    An emerging concept in transcriptional regulation is that a class of truncated transcription factors (TFs), called microProteins (miPs), engages in protein-protein interactions with TF complexes and provides feedback controls. A handful of miP examples have been described in the literature but the

  8. Protein Sorting Prediction

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2017-01-01

    Many computational methods are available for predicting protein sorting in bacteria. When comparing them, it is important to know that they can be grouped into three fundamentally different approaches: signal-based, global-property-based and homology-based prediction. In this chapter, the strengths and drawbacks of each of these approaches are described through many examples of methods that predict secretion, integration into membranes, or subcellular locations in general. The aim of this chapter is to provide a user-level introduction to the field with a minimum of computational theory.

  9. The function of communities in protein interaction networks at multiple scales

    Directory of Open Access Journals (Sweden)

    Jones Nick S

    2010-07-01

    Full Text Available Abstract Background If biology is modular then clusters, or communities, of proteins derived using only protein interaction network structure should define protein modules with similar biological roles. We investigate the link between biological modules and network communities in yeast and its relationship to the scale at which we probe the network. Results Our results demonstrate that the functional homogeneity of communities depends on the scale selected, and that almost all proteins lie in a functionally homogeneous community at some scale. We judge functional homogeneity using a novel test and three independent characterizations of protein function, and find a high degree of overlap between these measures. We show that a high mean clustering coefficient of a community can be used to identify those that are functionally homogeneous. By tracing the community membership of a protein through multiple scales we demonstrate how our approach could be useful to biologists focusing on a particular protein. Conclusions We show that there is no one scale of interest in the community structure of the yeast protein interaction network, but we can identify the range of resolution parameters that yield the most functionally coherent communities, and predict which communities are most likely to be functionally homogeneous.
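
    The mean clustering coefficient mentioned in this record can be computed as in the sketch below. The abstract does not say whether the coefficient is evaluated on the whole network or on the subgraph induced by the community; the induced-subgraph choice here is an assumption, and the function name and toy network are hypothetical.

```python
from itertools import combinations


def mean_clustering(adjacency, community):
    """Mean local clustering coefficient over a community.

    adjacency: dict mapping each node to the set of its neighbours (undirected).
    community: iterable of nodes; only edges inside the community are counted
    (an assumption, see the lead-in).
    """
    members = set(community)
    coefficients = []
    for node in members:
        neighbours = adjacency[node] & members
        k = len(neighbours)
        if k < 2:
            # Fewer than two neighbours: no triangle is possible.
            coefficients.append(0.0)
            continue
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


# Toy network: a triangle A-B-C with a pendant node D attached to C.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(mean_clustering(adjacency, ["A", "B", "C"]))
print(mean_clustering(adjacency, ["A", "B", "C", "D"]))
```

    The tightly knit triangle scores higher than the community that includes the pendant node, which is the kind of contrast the abstract uses to flag functionally homogeneous communities.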

  10. Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition.

    Science.gov (United States)

    Hayat, Maqsood; Khan, Asifullah

    2011-02-21

    Membrane proteins are a vital type of protein that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides some valuable information for predicting novel examples of the membrane protein types. However, classification of membrane protein types can be both time consuming and susceptible to errors due to the inherent similarity of membrane protein types. In this paper, a neural-network-based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence, which includes seven feature sets: amino acid composition, sequence length, 2-gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM for the jackknife test. In case of independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers exhibit improved performance using other performance measures such as sensitivity, specificity, Matthews correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported so far. This performance improvement may largely be credited to the learning capabilities of neural networks and the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor. Copyright © 2010 Elsevier Ltd. All rights reserved.
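
    As an illustration of the first of the seven feature sets, the sketch below computes a 20-dimensional amino acid composition vector. It is a generic version of this feature, not the authors' CPSR code; the function name and the decision to ignore non-standard residues are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard characters (e.g. X, B, Z) are ignored, which is an
    assumption made for this sketch.
    """
    seq = sequence.upper()
    counts = [seq.count(residue) for residue in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [count / total for count in counts]


features = amino_acid_composition("MKKLA")
# Pair each residue with its frequency and show only the ones present.
print({aa: f for aa, f in zip(AMINO_ACIDS, features) if f > 0})
```

    In a pipeline like the one described, vectors of this kind would be concatenated with the other feature sets before dimensionality reduction and classification.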

  11. Brain transcriptome-wide screen for HIV-1 Nef protein interaction partners reveals various membrane-associated proteins.

    Directory of Open Access Journals (Sweden)

    Ellen C Kammula

    Full Text Available HIV-1 Nef protein contributes essentially to the pathology of AIDS by a variety of protein-protein interactions within the host cell. The versatile functionality of Nef is partially attributed to different conformational states and posttranslational modifications, such as myristoylation. Up to now, many interaction partners of Nef have been identified using classical yeast two-hybrid screens. Such screens rely on transcriptional activation of reporter genes in the nucleus to detect interactions. Thus, the identification of Nef interaction partners that are integral membrane proteins, membrane-associated proteins or other proteins that do not translocate into the nucleus is hampered. In the present study, a split-ubiquitin based yeast two-hybrid screen was used to identify novel membrane-localized interaction partners of Nef. More than 80% of the interaction partners of Nef identified in this way are transmembrane proteins. The identified hits are GPM6B, GPM6A, BAP31, TSPAN7, CYB5B, CD320/TCblR, VSIG4, PMEPA1, OCIAD1, ITGB1, CHN1, PH4, CLDN10, HSPA9, APR-3, PEBP1 and B3GNT, which are involved in diverse cellular processes like signaling, apoptosis, neurogenesis, cell adhesion and protein trafficking or quality control. For a subfraction of the identified proteins we present data supporting their direct interaction with HIV-1 Nef. We discuss the results with respect to many phenotypes observed in HIV infected cells and patients. The identified Nef interaction partners may help to further elucidate the molecular basis of HIV-related diseases.

  12. Specificity and evolvability in eukaryotic protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Pedro Beltrao

    2007-02-01

    Full Text Available Progress in uncovering the protein interaction networks of several species has led to questions of what underlying principles might govern their organization. Few studies have tried to determine the impact of protein interaction network evolution on the observed physiological differences between species. Using comparative genomics and structural information, we show here that eukaryotic species have rewired their interactomes at a fast rate of approximately 10^-5 interactions changed per protein pair, per million years of divergence. For Homo sapiens this corresponds to 10^3 interactions changed per million years. Additionally we find that the specificity of binding strongly determines the interaction turnover and that different biological processes show significantly different link dynamics. In particular, human proteins involved in immune response, transport, and establishment of localization show signs of positive selection for change of interactions. Our analysis suggests that a small degree of molecular divergence can give rise to important changes at the network level. We propose that the power law distribution observed in protein interaction networks could be partly explained by the cell's requirement for different degrees of protein binding specificity.
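
    The two rates quoted in this record can be related by simple arithmetic. The sketch below is a back-of-the-envelope consistency check, not a calculation from the paper: the number of protein pairs and the proteome size it derives are implied values, and the abstract states neither.

```python
import math

rate_per_pair = 1e-5   # interactions changed per protein pair per million years
human_changes = 1e3    # interactions changed per million years in Homo sapiens

# Number of protein pairs implied by the two figures above.
implied_pairs = human_changes / rate_per_pair

# A proteome of n proteins has n * (n - 1) / 2 unordered pairs; solve for n.
implied_proteins = (1 + math.sqrt(1 + 8 * implied_pairs)) / 2

print(f"implied protein pairs: {implied_pairs:.3g}")
print(f"implied proteome size: {implied_proteins:.0f}")
```

    The implied proteome size is on the order of ten thousand proteins, so the per-pair and per-species rates are at least mutually plausible as order-of-magnitude estimates.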

  13. Roles for text mining in protein function prediction.

    Science.gov (United States)

    Verspoor, Karin M

    2014-01-01

    The Human Genome Project has provided science with a hugely valuable resource: the blueprints for life; the specification of all of the genes that make up a human. While the genes have all been identified and deciphered, it is proteins that are the workhorses of the human body: they are essential to virtually all cell functions and are the primary mechanism through which biological function is carried out. Hence in order to fully understand what happens at a molecular level in biological organisms, and eventually to enable development of treatments for diseases where some aspect of a biological system goes awry, we must understand the functions of proteins. However, experimental characterization of protein function cannot scale to the vast amount of DNA sequence data now available. Computational protein function prediction has therefore emerged as a problem at the forefront of modern biology (Radivojac et al., Nat Methods 10(13):221-227, 2013). Within the varied approaches to computational protein function prediction that have been explored, there are several that make use of biomedical literature mining. These methods take advantage of information in the published literature to associate specific proteins with specific protein functions. In this chapter, we introduce two main strategies for doing this: association of function terms, represented as Gene Ontology terms (Ashburner et al., Nat Genet 25(1):25-29, 2000), to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.

  14. Interactions between whey proteins and salivary proteins as related to astringency of whey protein beverages at low pH.

    Science.gov (United States)

    Ye, A; Streicher, C; Singh, H

    2011-12-01

    Whey protein beverages have been shown to be astringent at low pH. In the present study, the interactions between model whey proteins (β-lactoglobulin and lactoferrin) and human saliva in the pH range from 7 to 2 were investigated using particle size, turbidity, and ζ-potential measurements and sodium dodecyl sulfate-PAGE. The correlation between the sensory results of astringency and the physicochemical data was discussed. Strong interactions between β-lactoglobulin and salivary proteins led to an increase in the particle size and turbidity of mixtures of both unheated and heated β-lactoglobulin and human saliva at pH ∼3.4. However, the large particle size and high turbidity that occurred at pH 2.0 were the result of aggregation of human salivary proteins. The intense astringency in whey protein beverages may result from these increases in particle size and turbidity at these pH values and from the aggregation and precipitation of human salivary proteins alone at pH salivary proteins in the interaction is a key factor in the perception of astringency in whey protein beverages. At any pH, the increases in particle size and turbidity were much smaller in mixtures of lactoferrin and saliva, which suggests that aggregation and precipitation may not be the only mechanism linked to the perception of astringency in whey protein. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  15. Deep learning methods for protein torsion angle prediction.

    Science.gov (United States)

    Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

    2017-09-18

    Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN), a deep restricted Boltzmann machine (DRBM), a deep recurrent neural network (DRNN), and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of phi angle is comparable to the existing methods, but the MAE of psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
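
    Because torsion angles are periodic, a mean absolute error over them is usually computed on the shorter arc between predicted and observed values. The sketch below shows that convention; the abstract does not spell out how the authors computed MAE, so the wrap-around handling and the function name are assumptions.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two lists of angles in degrees.

    Differences are wrapped so that, for example, -179 and 179 count as
    2 degrees apart rather than 358.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)


# Toy phi angles: the first and last pairs straddle the +/-180 boundary.
predicted_phi = [-170.0, 60.0, 179.0]
observed_phi = [175.0, 50.0, -179.0]
print(angular_mae(predicted_phi, observed_phi))
```

    Without the wrap-around step the same toy data would give a far larger error, which is why the periodic form matters when comparing MAE values across methods.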

  16. Protein subcellular localization prediction using artificial intelligence technology.

    Science.gov (United States)

    Nair, Rajesh; Rost, Burkhard

    2008-01-01

    Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. The accuracy of these methods is now on par with

  17. The drug-minded protein interaction database (DrumPID) for efficient target analysis and drug development.

    Science.gov (United States)

    Kunz, Meik; Liang, Chunguang; Nilla, Santosh; Cecil, Alexander; Dandekar, Thomas

    2016-01-01

    The drug-minded protein interaction database (DrumPID) has been designed to provide fast, tailored information on drugs and their protein networks including indications, protein targets and side-targets. Starting queries include compound, target and protein interactions and organism-specific protein families. Furthermore, drug name, chemical structures and their SMILES notation, affected proteins (potential drug targets), organisms as well as diseases can be queried including various combinations and refinement of searches. Drugs and protein interactions are analyzed in detail with reference to protein structures and catalytic domains, related compound structures as well as potential targets in other organisms. DrumPID considers drug functionality, compound similarity, target structure, interactome analysis and organismic range for a compound, useful for drug development, predicting drug side-effects and structure-activity relationships. Database URL: http://drumpid.bioapps.biozentrum.uni-wuerzburg.de. © The Author(s) 2016. Published by Oxford University Press.

  18. Yeast Interacting Proteins Database: YDL239C, YDR273W [Yeast Interacting Proteins Database

    Lifescience Database Archive (English)

    Full Text Available of a Don1p-containing structure at the leading edge of the prospore membrane via interaction with spindle p...it as prey (1) YDR273W DON1 Meiosis-specific component of the spindle pole body, part of the leading... edge protein (LEP) coat, forms a ring-like structure at the leading edge of the prospore...ption Protein required for spore wall formation, thought to mediate assembly of a Don1p-containing structure at the leading...description Meiosis-specific component of the spindle pole body, part of the leading edge protein (LEP) coat

  19. Quantitative analysis and prediction of curvature in leucine-rich repeat proteins.

    Science.gov (United States)

    Hindle, K Lauren; Bella, Jordi; Lovell, Simon C

    2009-11-01

    Leucine-rich repeat (LRR) proteins form a large and diverse family. They have a wide range of functions most of which involve the formation of protein-protein interactions. All known LRR structures form curved solenoids, although there is large variation in their curvature. It is this curvature that determines the shape and dimensions of the inner space available for ligand binding. Unfortunately, large-scale parameters such as the overall curvature of a protein domain are extremely difficult to predict. Here, we present a quantitative analysis of determinants of curvature of this family. Individual repeats typically range in length between 20 and 30 residues and have a variety of secondary structures on their convex side. The observed curvature of the LRR domains correlates poorly with the lengths of their individual repeats. We have, therefore, developed a scoring function based on the secondary structure of the convex side of the protein that allows prediction of the overall curvature with a high degree of accuracy. We also demonstrate the effectiveness of this method in selecting a suitable template for comparative modeling. We have developed an automated, quantitative protocol that can be used to predict accurately the curvature of leucine-rich repeat proteins of unknown structure from sequence alone. This protocol is available as an online resource at http://www.bioinf.manchester.ac.uk/curlrr/.

  20. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    Science.gov (United States)

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio-chemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.
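
    The relation between eight-state and three-state secondary structure labels mentioned above can be sketched as follows. This is a minimal illustration that assumes one commonly used reduction of DSSP-style labels (helices G/H/I to H, strands E/B to E, everything else to C); the abstract does not say which reduction MUFOLD-SS uses, and the label strings below are hypothetical.

```python
# Assumed reduction of eight-state (DSSP-style) labels to three states.
# Other conventions exist; this one is chosen only for illustration.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helix types -> H
    "E": "E", "B": "E",             # strand and bridge -> E
    "T": "C", "S": "C", "-": "C",   # turn, bend, coil -> C
}


def q8_to_q3(q8: str) -> str:
    """Map an eight-state label string to a three-state label string."""
    return "".join(Q8_TO_Q3[state] for state in q8)


def accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical labels for a 15-residue fragment.
observed_q8 = "-HHHHGGT-EEEB-S"
predicted_q8 = "-HHHHHHT-EEEE-T"

q8_accuracy = accuracy(predicted_q8, observed_q8)
q3_accuracy = accuracy(q8_to_q3(predicted_q8), q8_to_q3(observed_q8))
```

    Errors that confuse states within one class (for example G versus H) count against the eight-state accuracy but vanish after the reduction, which is why three-state accuracies are usually higher than eight-state ones.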

  1. Signatures of pleiotropy, economy and convergent evolution in a domain-resolved map of human-virus protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Sara Garamszegi

    Full Text Available A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral

  2. Topological and organizational properties of the products of house-keeping and tissue-specific genes in protein-protein interaction networks.

    Science.gov (United States)

    Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing

    2009-03-11

    Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene

  3. Catching the PEG-induced attractive interaction between proteins.

    Science.gov (United States)

    Vivarès, D; Belloni, L; Tardieu, A; Bonneté, F

    2002-09-01

    We present the experimental and theoretical background of a method to characterize the protein-protein attractive potential induced by one of the most widely used crystallizing agents in the protein field, poly(ethylene glycol) (PEG). This attractive interaction is commonly called, in colloid physics, the depletion interaction. Small-Angle X-ray Scattering experiments and numerical treatments based on liquid-state theories were performed on urate oxidase-PEG mixtures with two different PEGs (3350 Da and 8000 Da). A "two-component" approach was used in which the polymer-polymer, the protein-polymer and the protein-protein pair potentials were determined. The resulting effective protein-protein potential was characterized. This potential is the sum of the free-polymer protein-protein potential and of the PEG-induced depletion potential. The depletion potential was found to depend only weakly on the protein concentration but strongly on the polymer size and concentration. Our results were also compared with two models, which give an analytic expression for the depletion potential.

  4. Interactions between whey proteins and kaolinite surfaces

    Energy Technology Data Exchange (ETDEWEB)

    Barral, S. [Department of Chemical Engineering and Environmental Technology, University of Oviedo, Julian Claveria 8, 33006 Oviedo (Spain); Villa-Garcia, M.A. [Department of Organic and Inorganic Chemistry, University of Oviedo, Julian Claveria 8, 33006 Oviedo (Spain)], E-mail: mavg@uniovi.es; Rendueles, M. [Project Management Area, University of Oviedo, Independencia 13, 33004 Oviedo (Spain); Diaz, M. [Department of Chemical Engineering and Environmental Technology, University of Oviedo, Julian Claveria 8, 33006 Oviedo (Spain)

    2008-07-15

    The nature of the interactions between whey proteins and kaolinite surfaces was investigated by adsorption-desorption experiments at room temperature, performed at the isoelectric point (IEP) of the proteins and at pH 7. It was found that kaolinite is a strong adsorbent for proteins, reaching the maximum adsorption capacity at the IEP of each protein. At pH 7.0, the retention capacity decreased considerably. The adsorption isotherms showed typical Langmuir characteristics. X-ray diffraction data for the protein-kaolinite complexes showed that protein molecules were not intercalated in the mineral structure, but immobilized at the external surfaces and the edges of the kaolinite. Fourier transform IR results indicate the absence of hydrogen bonding between kaolinite surfaces and the polypeptide chain. The adsorption patterns appear to be related to electrostatic interactions, although steric effects should be also considered.
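
    The "typical Langmuir characteristics" reported above refer to an isotherm of the form q = q_max K C / (1 + K C). The sketch below shows the model and a least-squares fit of its linearized form, C/q = C/q_max + 1/(K q_max). The parameter values and concentrations are hypothetical and serve only to check the fit; they are not data from the study.

```python
def langmuir(conc: float, q_max: float, k: float) -> float:
    """Adsorbed amount at equilibrium concentration `conc` (Langmuir model)."""
    return q_max * k * conc / (1.0 + k * conc)


def fit_langmuir(conc, q):
    """Estimate (q_max, k) by ordinary least squares on the linearized form
    conc/q = conc/q_max + 1/(k*q_max)."""
    x = list(conc)
    y = [c / qi for c, qi in zip(conc, q)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals 1/q_max
    intercept = mean_y - slope * mean_x    # equals 1/(k*q_max)
    return 1.0 / slope, slope / intercept


# Hypothetical noise-free isotherm used only to exercise the fit.
TRUE_Q_MAX, TRUE_K = 120.0, 2.5
concentrations = [0.1, 0.5, 1.0, 2.0, 5.0]
adsorbed = [langmuir(c, TRUE_Q_MAX, TRUE_K) for c in concentrations]

q_max_fit, k_fit = fit_langmuir(concentrations, adsorbed)
```

    With real measurements the linearization distorts the error structure, so a direct nonlinear fit is often preferred; the linear form is used here only because it needs no external library.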

  5. Interactions between whey proteins and kaolinite surfaces

    International Nuclear Information System (INIS)

    Barral, S.; Villa-Garcia, M.A.; Rendueles, M.; Diaz, M.

    2008-01-01

    The nature of the interactions between whey proteins and kaolinite surfaces was investigated by adsorption-desorption experiments at room temperature, performed at the isoelectric point (IEP) of the proteins and at pH 7. It was found that kaolinite is a strong adsorbent for proteins, reaching the maximum adsorption capacity at the IEP of each protein. At pH 7.0, the retention capacity decreased considerably. The adsorption isotherms showed typical Langmuir characteristics. X-ray diffraction data for the protein-kaolinite complexes showed that protein molecules were not intercalated in the mineral structure, but immobilized at the external surfaces and the edges of the kaolinite. Fourier transform IR results indicate the absence of hydrogen bonding between kaolinite surfaces and the polypeptide chain. The adsorption patterns appear to be related to electrostatic interactions, although steric effects should be also considered

  6. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-01-01

    Full Text Available We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.

  7. Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Seizi Someya

    2010-01-01

    Full Text Available Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs). We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. In a leave-one-out test on these datasets, our method achieved an area under the receiver operating characteristic (ROC) curve of 0.92. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.
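
    The area under the ROC curve quoted above can be computed directly from classifier scores: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following sketch assumes that a leave-one-out run has already produced one decision score per held-out protein; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score; tied pairs count one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical leave-one-out results: 1 = carbohydrate-binding, 0 = not.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

auc = roc_auc(labels, scores)
```

    The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch; a rank-based formulation gives the same value in O(n log n).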

  8. Prioritizing disease candidate proteins in cardiomyopathy-specific protein-protein interaction networks based on "guilt by association" analysis.

    Directory of Open Access Journals (Sweden)

    Wan Li

    Full Text Available The cardiomyopathies are a group of heart muscle diseases which can be inherited (familial). Identifying potential disease-related proteins is important to understand mechanisms of cardiomyopathies. Experimental identification of cardiomyopathy-related proteins is costly and labour-intensive. In contrast, a bioinformatics approach has a competitive advantage over experimental methods. Based on "guilt by association" analysis, we prioritized candidate proteins involved in human cardiomyopathies. We first built weighted human cardiomyopathy-specific protein-protein interaction networks for three subtypes of cardiomyopathies using the known disease proteins from Online Mendelian Inheritance in Man as seeds. We then developed a prioritization method to rank candidate proteins in the network based on "guilt by association" analysis. It was found that most candidate proteins with high scores shared disease-related pathways with disease seed proteins. These top-ranked candidate proteins were related to the corresponding disease subtypes, and were potential disease-related proteins. Cross-validation and comparison with other methods indicated that our approach could be used for the identification of potentially novel disease proteins, which may provide insights into cardiomyopathy-related mechanisms in a more comprehensive and integrated way.
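
    The general idea of "guilt by association" ranking in a weighted network can be sketched as below. The scoring rule (the share of a candidate's total edge weight that connects it to seed proteins) is an assumption made for illustration and is not the scoring function of the study, which the abstract does not specify; the protein names and weights are hypothetical.

```python
from collections import defaultdict


def gba_scores(edges, seeds):
    """Rank non-seed proteins by the share of their total edge weight that
    links them to seed (known disease) proteins. Illustrative scoring only."""
    weight_to_seeds = defaultdict(float)
    total_weight = defaultdict(float)
    for u, v, w in edges:
        # Treat every edge as undirected: credit both endpoints.
        for node, partner in ((u, v), (v, u)):
            total_weight[node] += w
            if partner in seeds:
                weight_to_seeds[node] += w
    scores = {
        node: weight_to_seeds[node] / total_weight[node]
        for node in total_weight
        if node not in seeds
    }
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical weighted interaction network.
edges = [
    ("seed_A", "cand_1", 0.9),
    ("seed_B", "cand_1", 0.8),
    ("cand_1", "cand_2", 0.3),
    ("seed_A", "cand_2", 0.2),
    ("cand_2", "cand_3", 0.5),
]
seeds = {"seed_A", "seed_B"}

ranking = gba_scores(edges, seeds)
```

    Here cand_1, which interacts strongly with both seeds, ranks first, while cand_3, which has no direct seed neighbour, scores zero. Methods that also propagate evidence through indirect neighbours would rank cand_3 above zero.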

  9. Prediction of protein loop geometries in solution

    NARCIS (Netherlands)

    Rapp, Chaya S.; Strauss, Temima; Nederveen, Aart; Fuentes, Gloria

    2007-01-01

    The ability to determine the structure of a protein in solution is a critical tool for structural biology, as proteins in their native state are found in aqueous environments. Using a physical chemistry based prediction protocol, we demonstrate the ability to reproduce protein loop geometries in

  10. Toward a rigorous network of protein-protein interactions of the model sulfate reducer Desulfovibrio vulgaris Hildenborough

    Energy Technology Data Exchange (ETDEWEB)

    Chhabra, S.R.; Joachimiak, M.P.; Petzold, C.J.; Zane, G.M.; Price, M.N.; Gaucher, S.; Reveco, S.A.; Fok, V.; Johanson, A.R.; Batth, T.S.; Singer, M.; Chandonia, J.M.; Joyner, D.; Hazen, T.C.; Arkin, A.P.; Wall, J.D.; Singh, A.K.; Keasling, J.D.

    2011-05-01

    Protein–protein interactions offer an insight into cellular processes beyond what may be obtained by the quantitative functional genomics tools of proteomics and transcriptomics. The aforementioned tools have been extensively applied to study E. coli and other aerobes and more recently to study the stress response behavior of Desulfovibrio vulgaris Hildenborough, a model anaerobe and sulfate reducer. In this paper we present the first attempt to identify protein-protein interactions in an obligate anaerobic bacterium. We used suicide vector-assisted chromosomal modification of 12 open reading frames encoded by this sulfate reducer to append an eight amino acid affinity tag to the carboxy-terminus of the chosen proteins. Three biological replicates of the ‘pulled-down’ proteins were separated and analyzed using liquid chromatography-mass spectrometry. Replicate agreement ranged between 35% and 69%. An interaction network among 12 bait and 90 prey proteins was reconstructed based on 134 bait-prey interactions computationally identified to be of high confidence. We discuss the biological significance of several unique metabolic features of D. vulgaris revealed by this protein-protein interaction data and protein modifications that were observed. These include the distinct role of the putative carbon monoxide-induced hydrogenase, unique electron transfer routes associated with different oxidoreductases, and the possible role of methylation in regulating sulfate reduction.

  11. Protein-protein docking using region-based 3D Zernike descriptors.

    Science.gov (United States)

    Venkatraman, Vishwesh; Yang, Yifeng D; Sael, Lee; Kihara, Daisuke

    2009-12-09

    Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface C-alphaRMSD 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for protein docking prediction. Rigorous benchmark studies show that our docking approach has a superior performance compared to existing methods.

  12. Graphical analysis of pH-dependent properties of proteins predicted using PROPKA.

    Science.gov (United States)

    Rostkowski, Michał; Olsson, Mats H M; Søndergaard, Chresten R; Jensen, Jan H

    2011-01-26

    Charge states of ionizable residues in proteins determine their pH-dependent properties through their pKa values. Thus, various theoretical methods to determine ionization constants of residues in biological systems have been developed. One of the more widely used approaches for predicting pKa values in proteins is the PROPKA program, which provides convenient structural rationalization of the predicted pKa values without any additional calculations. The PROPKA Graphical User Interface (GUI) is a new tool for studying the pH-dependent properties of proteins such as charge and stabilization energy. It facilitates a quantitative analysis of pKa values of ionizable residues together with their structural determinants by providing a direct link between the pKa data, predicted by the PROPKA calculations, and the structure via the Visual Molecular Dynamics (VMD) program. The GUI also calculates contributions to the pH-dependent unfolding free energy at a given pH for each ionizable group in the protein. Moreover, the PROPKA-computed pKa values or energy contributions of the ionizable residues in question can be displayed interactively. The PROPKA GUI can also be used for comparing pH-dependent properties of more than one structure at the same time. The GUI considerably extends the analysis and validation possibilities of the PROPKA approach. The PROPKA GUI can conveniently be used to investigate ionizable groups, and their interactions, of residues with significantly perturbed pKa values or residues that contribute to the stabilization energy the most. Charge-dependent properties can be studied either for a single protein or simultaneously with other homologous structures, which makes it a helpful tool, for instance, in protein design studies or structure-based function predictions. The GUI is implemented as a Tcl/Tk plug-in for VMD, and can be obtained online at http://propka.ki.ku.dk/~luca/wiki/index.php/GUI_Web.
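
    How pKa values translate into a pH-dependent charge can be illustrated with the Henderson-Hasselbalch relation, treating each ionizable group independently. This is a simplified sketch, not PROPKA's method: PROPKA predicts the pKa values themselves from the structure, whereas here they are taken as given, and the values below are hypothetical.

```python
def group_charge(pka: float, ph: float, kind: str) -> float:
    """Average charge of one ionizable group at a given pH.

    kind = "acid": neutral when protonated, -1 when deprotonated (Asp, Glu).
    kind = "base": +1 when protonated, neutral when deprotonated (His, Lys).
    """
    if kind == "acid":
        return -1.0 / (1.0 + 10.0 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


def net_charge(groups, ph: float) -> float:
    """Sum of average charges over independent ionizable groups."""
    return sum(group_charge(pka, ph, kind) for pka, kind in groups)


# Hypothetical pKa values for four ionizable groups of a toy protein.
groups = [(3.8, "acid"), (4.5, "acid"), (6.5, "base"), (10.5, "base")]

# Net charge sampled along the pH axis, as a titration-style curve.
titration = [(ph, net_charge(groups, ph)) for ph in range(0, 15)]
```

    A group whose pKa is shifted by its environment changes the pH at which the curve crosses a given charge, which is the kind of effect a pKa prediction tool is meant to expose.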

  13. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  14. Macrocyclic peptide inhibitors for the protein-protein interaction of Zaire Ebola virus protein 24 and karyopherin alpha 5.

    Science.gov (United States)

    Song, Xiao; Lu, Lu-Yi; Passioura, Toby; Suga, Hiroaki

    2017-06-21

    Ebola virus infection leads to severe hemorrhagic fever in humans and non-human primates with an average case fatality rate of 50%. To date, numerous potential therapies are in development, but FDA-approved drugs or vaccines are not yet available. Ebola viral protein 24 (VP24) is a multifunctional protein that plays critical roles in the pathogenesis of Ebola virus infection, e.g. innate immune suppression by blocking the interaction between KPNA and PY-STAT1. Here we report macrocyclic peptide inhibitors of the VP24-KPNA5 protein-protein interaction (PPI) by means of the RaPID (Random non-standard Peptides Integrated Discovery) system. These macrocyclic peptides showed remarkably high affinity to recombinant Zaire Ebola virus VP24 (eVP24), with a dissociation constant in the single digit nanomolar range, and could also successfully disrupt the eVP24-KPNA interaction. This work provides for the first time a chemical probe capable of modulating this PPI and is the starting point for the development of unique anti-viral drugs against the Ebola virus.

  15. MFPred: Rapid and accurate prediction of protein-peptide recognition multispecificity using self-consistent mean field theory.

    Directory of Open Access Journals (Sweden)

    Aliza B Rubenstein

    2017-06-01

    Full Text Available Multispecificity, the ability of a single receptor protein molecule to interact with multiple substrates, is a hallmark of molecular recognition at protein-protein and protein-peptide interfaces, including enzyme-substrate complexes. The ability to perform structure-based prediction of multispecificity would aid in the identification of novel enzyme substrates and protein interaction partners, and enable the design of novel enzymes targeted towards alternative substrates. The relatively slow speed of current biophysical, structure-based methods limits their use for prediction and, especially, design of multispecificity. Here, we develop a rapid, flexible-backbone self-consistent mean field theory-based technique, MFPred, for multispecificity modeling at protein-peptide interfaces. We benchmark our method by predicting experimentally determined peptide specificity profiles for a range of receptors: protease and kinase enzymes, and protein recognition modules including SH2, SH3, MHC Class I and PDZ domains. We observe robust recapitulation of known specificities for all receptor-peptide complexes, and comparison with other methods shows that MFPred results in equivalent or better prediction accuracy with a ~10-1000-fold decrease in computational expense. We find that modeling bound peptide backbone flexibility is key to the observed accuracy of the method. We used MFPred for predicting with high accuracy the impact of receptor-side mutations on experimentally determined multispecificity of a protease enzyme. Our approach should enable the design of a wide range of altered receptor proteins with programmed multispecificities.

  16. Stapled Voltage-Gated Calcium Channel (CaV) α-Interaction Domain (AID) Peptides Act As Selective Protein-Protein Interaction Inhibitors of CaV Function.

    Science.gov (United States)

    Findeisen, Felix; Campiglio, Marta; Jo, Hyunil; Abderemane-Ali, Fayal; Rumpf, Christine H; Pope, Lianne; Rossen, Nathan D; Flucher, Bernhard E; DeGrado, William F; Minor, Daniel L

    2017-06-21

    For many voltage-gated ion channels (VGICs), creation of a properly functioning ion channel requires the formation of specific protein-protein interactions between the transmembrane pore-forming subunits and cytoplasmic accessory subunits. Despite the importance of such protein-protein interactions in VGIC function and assembly, their potential as sites for VGIC modulator development has been largely overlooked. Here, we develop meta-xylyl (m-xylyl) stapled peptides that target a prototypic VGIC high affinity protein-protein interaction, the interaction between the voltage-gated calcium channel (CaV) pore-forming subunit α-interaction domain (AID) and cytoplasmic β-subunit (CaVβ). We show using circular dichroism spectroscopy, X-ray crystallography, and isothermal titration calorimetry that the m-xylyl staples enhance AID helix formation, are structurally compatible with native-like AID:CaVβ interactions, and reduce the entropic penalty associated with AID binding to CaVβ. Importantly, electrophysiological studies reveal that stapled AID peptides act as effective inhibitors of the CaVα1:CaVβ interaction that modulate CaV function in a CaVβ isoform-selective manner. Together, our studies provide a proof-of-concept demonstration of the use of protein-protein interaction inhibitors to control VGIC function and point to strategies for improved AID-based CaV modulator design.

  17. Positive Selection and Centrality in the Yeast and Fly Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Sandip Chakraborty

    2016-01-01

    Full Text Available Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious which genes within a network should be more likely to evolve under positive selection. On one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be “seen” by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes’ adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.
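
    What "connected" and "central" versus "peripheral" mean for a protein in an interaction network can be made concrete with two standard measures, degree centrality and closeness centrality. The sketch below is illustrative only: the abstract does not state which centrality measures the study used, and the five-node network and its node names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree_centrality(adj):
    """Number of interaction partners, normalized by the maximum possible."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


def closeness_centrality(adj):
    """(n - 1) divided by the sum of shortest-path distances to all other nodes.
    Assumes a connected network."""
    n = len(adj)
    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        if len(dist) != n:
            raise ValueError("network must be connected")
        result[node] = (n - 1) / sum(dist.values())
    return result


# Hypothetical undirected network given as an adjacency mapping.
network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}

degree = degree_centrality(network)
closeness = closeness_centrality(network)
```

    In this toy network "hub" has the highest value on both measures and "d" the lowest closeness, so "hub" would be called central and "d" peripheral.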

  18. Seeing the trees through the forest : sequence-based homo- and heteromeric protein-protein interaction sites prediction using random forest

    NARCIS (Netherlands)

    Hou, Qingzhen; De Geest, Paul F.G.; Vranken, Wim F.; Heringa, Jaap; Feenstra, K. Anton

    2017-01-01

    Motivation: Genome sequencing is producing an ever-increasing amount of associated protein sequences. Few of these sequences have experimentally validated annotations, however, and computational predictions are becoming increasingly successful in producing such annotations. One key challenge remains

  19. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  20. PIWI Proteins and PIWI-Interacting RNA

    DEFF Research Database (Denmark)

    Han, Yi Neng; Li, Yuan; Xia, Sheng Qiang

    2017-01-01

    P-Element induced wimpy testis (PIWI)-interacting RNAs (piRNAs) are a type of noncoding RNAs (ncRNAs) and interact with PIWI proteins. piRNAs were primarily described in the germline, but emerging evidence revealed that piRNAs are expressed in a tissue-specific manner among multiple human somatic tissue types as well and play important roles in transposon silencing, epigenetic regulation, gene and protein regulation, genome rearrangement, spermatogenesis and germ stem-cell maintenance. PIWI proteins were first discovered in Drosophila and they play roles in spermatogenesis, germline stem-cell maintenance, self-renewal, retrotransposons silencing and the male germline mobility control. A growing number of studies have demonstrated that several piRNA and PIWI proteins are aberrantly expressed in various kinds of cancers and may probably serve as a novel biomarker and therapeutic target for cancer...

  1. Identification of novel direct protein-protein interactions by irradiating living cells with femtosecond UV laser pulses.

    Science.gov (United States)

    Itri, Francesco; Monti, Daria Maria; Chino, Marco; Vinciguerra, Roberto; Altucci, Carlo; Lombardi, Angela; Piccoli, Renata; Birolo, Leila; Arciello, Angela

    2017-10-07

    The identification of protein-protein interaction networks in living cells is becoming increasingly fundamental to elucidate main biological processes and to understand the molecular bases of disease on a system-wide level. We recently described a method (LUCK, Laser UV Cross-linKing) to cross-link interacting protein surfaces in living cells by UV laser irradiation. By using this innovative methodology, which does not require any protein modification or cell engineering, here we demonstrate that, upon UV laser irradiation of HeLa cells, a direct interaction between GAPDH and alpha-enolase was "frozen" by a cross-linking event. We validated the occurrence of this direct interaction by co-immunoprecipitation and Immuno-FRET analyses. This represents a proof of principle of the LUCK capability to reveal direct protein interactions in their physiological environment. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Protein-protein docking with dynamic residue protonation states.

    Directory of Open Access Journals (Sweden)

    Krishna Praneeth Kilambi

    2014-12-01

    Full Text Available Protein-protein interactions depend on a host of environmental factors. Local pH conditions influence the interactions through the protonation states of the ionizable residues that can change upon binding. In this work, we present a pH-sensitive docking approach, pHDock, that can sample side-chain protonation states of five ionizable residues (Asp, Glu, His, Tyr, Lys) on-the-fly during the docking simulation. pHDock produces successful local docking funnels in approximately half (79/161) of the protein complexes, including 19 cases where standard RosettaDock fails. pHDock also performs better than the two control cases comprising docking at pH 7.0 or using fixed, predetermined protonation states. On average, the top-ranked pHDock structures have lower interface RMSDs and recover more native interface residue-residue contacts and hydrogen bonds compared to RosettaDock. Addition of backbone flexibility using a computationally-generated conformational ensemble further improves native contact and hydrogen bond recovery in the top-ranked structures. Although pHDock is designed to improve docking, it also successfully predicts a large pH-dependent binding affinity change in the Fc-FcRn complex, suggesting that it can be exploited to improve affinity predictions. The approaches in the study contribute to the goal of structural simulations of whole-cell protein-protein interactions including all the environmental factors, and they can be further expanded for pH-sensitive protein design.
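
    To illustrate why local pH changes the protonation states that a method like pHDock samples, the sketch below applies the Henderson-Hasselbalch relation to the five residue types named in the abstract. It is not the pHDock algorithm: the pKa values are approximate textbook values for free side chains (an assumption, not taken from the paper), and the function name is hypothetical.

```python
# Approximate model pKa values for free side chains.
# ASSUMPTION: rough textbook numbers, not values used by pHDock;
# pKa values inside a folded protein can shift substantially.
MODEL_PKA = {"Asp": 3.9, "Glu": 4.3, "His": 6.0, "Tyr": 10.1, "Lys": 10.5}


def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of sites that carry the proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Compare a mildly acidic environment with a near-neutral one.
for ph in (6.0, 7.4):
    row = ", ".join(
        f"{res}={fraction_protonated(pka, ph):.3f}" for res, pka in MODEL_PKA.items()
    )
    print(f"pH {ph}: {row}")
```

    With these assumed pKa values, histidine is the residue whose protonated fraction changes most between the two pH values, which is consistent with histidine-rich interfaces often showing pH-dependent binding.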

  3. Protein-Protein Interaction Network and Gene Ontology

    Science.gov (United States)

    Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah

    The evolution of computer technologies makes it possible to access large amounts and various kinds of biological data via the internet, such as DNA sequences, proteomics data and information discovered about them. It is expected that combining these data could help researchers find further knowledge about them. The role of a visualization system is to invoke human abilities to integrate information and to recognize patterns in the data. Thus, when the various kinds of data are examined and analyzed manually, an effective visualization system is essential. One instance of such an integrated visualization is the combination of protein-protein interaction (PPI) data and Gene Ontology (GO), which could help enhance the analysis of PPI networks. We introduce a simple but comprehensive visualization system that integrates GO and PPI data, in which GO and PPI graphs are visualized side-by-side, and that supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and the GO directed acyclic graph, such as context-based browsing and common-ancestor finding.

  4. Interactions among tobacco sieve element occlusion (SEO) proteins.

    Science.gov (United States)

    Jekat, Stephan B; Ernst, Antonia M; Zielonka, Sascia; Noll, Gundula A; Prüfer, Dirk

    2012-12-01

    Angiosperms transport their photoassimilates through sieve tubes, which comprise longitudinally-connected sieve elements. In dicots and also some monocots, the sieve elements contain parietal structural proteins known as phloem proteins or P-proteins. Following injury, P proteins disperse and accumulate as viscous plugs at the sieve plates to prevent the loss of valuable transport sugars. Tobacco (Nicotiana tabacum) P-proteins are multimeric complexes comprising subunits encoded by members of the SEO (sieve element occlusion) gene family. The existence of multiple subunits suggests that P-protein assembly involves interactions between SEO proteins, but this process is largely uncharacterized and it is unclear whether the different subunits perform unique roles or are redundant. We therefore extended our analysis of the tobacco P-proteins NtSEO1 and NtSEO2 to investigate potential interactions between them, and found that both proteins can form homomeric and heteromeric complexes in planta.

  5. PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

    Science.gov (United States)

    Hawkins, Troy; Chitale, Meghana; Luban, Stanislav; Kihara, Daisuke

    2009-02-15

    Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments. PFP is available as a web service at http

  6. DARC 2.0: Improved Docking and Virtual Screening at Protein Interaction Sites.

    Directory of Open Access Journals (Sweden)

    Ragul Gowthaman

    Full Text Available Over the past decade, protein-protein interactions have emerged as attractive but challenging targets for therapeutic intervention using small molecules. Due to the relatively flat surfaces that typify protein interaction sites, modern virtual screening tools developed for optimal performance against "traditional" protein targets perform less well when applied instead at protein interaction sites. Previously, we described a docking method specifically catered to the shallow binding modes characteristic of small-molecule inhibitors of protein interaction sites. This method, called DARC (Docking Approach using Ray Casting), operates by comparing the topography of the protein surface when "viewed" from a vantage point inside the protein against the topography of a bound ligand when "viewed" from the same vantage point. Here, we present five key enhancements to DARC. First, we use multiple vantage points to more accurately determine protein-ligand surface complementarity. Second, we describe a new scheme for rapidly determining optimal weights in the DARC scoring function. Third, we incorporate sampling of ligand conformers "on-the-fly" during docking. Fourth, we move beyond simple shape complementarity and introduce a term in the scoring function to capture electrostatic complementarity. Finally, we adjust the control flow in our GPU implementation of DARC to achieve greater speedup of these calculations. At each step of this study, we evaluate the performance of DARC in a "pose recapitulation" experiment: predicting the binding mode of 25 inhibitors each solved in complex with its distinct target protein (a protein interaction site). Whereas the previous version of DARC docked only one of these inhibitors to within 2 Å RMSD of its position in the crystal structure, the newer version achieves this level of accuracy for 12 of the 25 complexes, corresponding to a statistically significant performance improvement (p < 0.001). Collectively then, we find

  7. Analysis of the protein-protein interactions between the human acidic ribosomal P-proteins: evaluation by the two hybrid system

    DEFF Research Database (Denmark)

    Tchórzewski, M; Boldyreff, B; Issinger, O

    2000-01-01

    The surface acidic ribosomal proteins (P-proteins), together with ribosomal core protein P0 form a multimeric lateral protuberance on the 60 S ribosomal subunit. This structure, also called stalk, is important for efficient translational activity of the ribosome. In order to shed more light...... forms the 60 S ribosomal stalk: P0-(P1/P2)(2). Additionally, mutual interactions among human and yeast P-proteins were analyzed. Heterodimer formation could be observed between human P2 and yeast P1 proteins....

  8. Stoichiometric balance of protein copy numbers is measurable and functionally significant in a protein-protein interaction network for yeast endocytosis.

    Science.gov (United States)

    Holland, David O; Johnson, Margaret E

    2018-03-01

    Stoichiometric balance, or dosage balance, implies that proteins that are subunits of obligate complexes (e.g. the ribosome) should have copy numbers expressed to match their stoichiometry in that complex. Establishing balance (or imbalance) is an important tool for inferring subunit function and assembly bottlenecks. We show here that these correlations in protein copy numbers can extend beyond complex subunits to larger protein-protein interaction networks (PPINs) involving a range of reversible binding interactions. We develop a simple method for quantifying balance in any interface-resolved PPIN based on network structure and experimentally observed protein copy numbers. By analyzing such a network for the clathrin-mediated endocytosis (CME) system in yeast, we found that the real protein copy numbers were significantly more balanced in relation to their binding partners compared to randomly sampled sets of yeast copy numbers. The observed balance is not perfect, highlighting both under- and overexpressed proteins. We evaluate the potential costs and benefits of imbalance using two criteria. First, a potential cost of imbalance is that 'leftover' proteins without remaining functional partners are free to misinteract. We systematically quantify how this misinteraction cost is most dangerous for strong-binding protein interactions and for network topologies observed in biological PPINs. Second, a more direct consequence of imbalance is that the formation of specific functional complexes depends on relative copy numbers. We therefore construct simple kinetic models of two sub-networks in the CME network to assess multi-protein assembly of the ARP2/3 complex and a minimal, nine-protein clathrin-coated vesicle forming module. We find that the observed, imperfectly balanced copy numbers are less effective than balanced copy numbers in producing fast and complete multi-protein assemblies. However, we speculate that strategic imbalance in the vesicle forming module

  9. Unravelling Protein-Protein Interaction Networks Linked to Aliphatic and Indole Glucosinolate Biosynthetic Pathways in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Sebastian J. Nintemann

    2017-11-01

    Full Text Available Within the cell, biosynthetic pathways are embedded in protein-protein interaction networks. In Arabidopsis, the biosynthetic pathways of aliphatic and indole glucosinolate defense compounds are well-characterized. However, little is known about the spatial orchestration of these enzymes and their interplay with the cellular environment. To address these aspects, we applied two complementary, untargeted approaches—split-ubiquitin yeast 2-hybrid and co-immunoprecipitation screens—to identify proteins interacting with CYP83A1 and CYP83B1, two homologous enzymes specific for aliphatic and indole glucosinolate biosynthesis, respectively. Our analyses reveal distinct functional networks with substantial interconnection among the identified interactors for both pathway-specific markers, and add to our knowledge about how biochemical pathways are connected to cellular processes. Specifically, a group of protein interactors involved in cell death and the hypersensitive response provides a potential link between the glucosinolate defense compounds and defense against biotrophic pathogens, mediated by protein-protein interactions.

  10. PDZ domain-mediated interactions of G protein-coupled receptors with postsynaptic density protein 95

    DEFF Research Database (Denmark)

    Møller, Thor C; Wirth, Volker F; Roberts, Nina Ingerslev

    2013-01-01

    G protein-coupled receptors (GPCRs) constitute the largest family of membrane proteins in the human genome. Their signaling is regulated by scaffold proteins containing PDZ domains, but although these interactions are important for GPCR function, they are still poorly understood. We here present...

  11. Quantitative analysis of protein-ligand interactions by NMR.

    Science.gov (United States)

    Furukawa, Ayako; Konuma, Tsuyoshi; Yanaka, Saeko; Sugase, Kenji

    2016-08-01

    Protein-ligand interactions have been commonly studied through static structures of the protein-ligand complex. Recently, however, there has been increasing interest in investigating the dynamics of protein-ligand interactions both for fundamental understanding of the underlying mechanisms and for drug development. NMR is a versatile and powerful tool, especially because it provides site-specific quantitative information. NMR has widely been used to determine the dissociation constant (KD), in particular, for relatively weak interactions. The simplest NMR method is a chemical-shift titration experiment, in which the chemical-shift changes of a protein in response to ligand titration are measured. There are other quantitative NMR methods, but they mostly apply only to interactions in the fast-exchange regime. These methods derive the dissociation constant from population-averaged NMR quantities of the free and bound states of a protein or ligand. In contrast, the recent advent of new relaxation-based experiments, including R2 relaxation dispersion and ZZ-exchange, has enabled us to obtain kinetic information on protein-ligand interactions in the intermediate- and slow-exchange regimes. Based on R2 dispersion or ZZ-exchange, methods that can determine the association rate, kon, dissociation rate, koff, and KD have been developed. In these approaches, R2 dispersion or ZZ-exchange curves are measured for multiple samples with different protein and/or ligand concentration ratios, and the relaxation data are fitted to theoretical kinetic models. It is critical to choose an appropriate kinetic model, such as the two- or three-state exchange model, to derive the correct kinetic information. The R2 dispersion and ZZ-exchange methods are suitable for the analysis of protein-ligand interactions with a micromolar or sub-micromolar dissociation constant but not for very weak interactions, which are typical in very fast exchange. This contrasts with the NMR methods that are used
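
    As a worked illustration of the chemical-shift titration described in this abstract, the sketch below fits KD for a 1:1 binding model in the fast-exchange regime, where the observed shift change is the bound fraction times the maximal shift change. The data are synthetic, the function names are hypothetical, and the log-grid scan is a deliberately simple stand-in for the nonlinear least-squares fitting normally used; none of it is taken from the paper.

```python
import math


def bound_fraction(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of protein bound for 1:1 binding (exact quadratic solution).

    f = [(P + L + KD) - sqrt((P + L + KD)^2 - 4 P L)] / (2 P)
    All concentrations must share one unit (micromolar here).
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)


def fit_kd(p_total, ligand_concs, shifts, kd_grid):
    """Scan candidate KD values; for each, the best delta_max is a linear
    least-squares problem with a closed-form solution."""
    best = None
    for kd in kd_grid:
        f = [bound_fraction(p_total, l, kd) for l in ligand_concs]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        delta_max = sum(x * y for x, y in zip(f, shifts)) / denom
        sse = sum((delta_max * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]


# Synthetic, noise-free titration (ASSUMED values, for illustration only):
# 100 uM protein, true KD = 150 uM, maximal shift change 0.30 ppm.
P_TOTAL = 100.0
LIGAND = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
SHIFTS = [0.30 * bound_fraction(P_TOTAL, l, 150.0) for l in LIGAND]

KD_GRID = [10.0 ** (i / 50.0) for i in range(0, 201)]  # 1 uM to 10 mM
kd_fit, delta_max_fit = fit_kd(P_TOTAL, LIGAND, SHIFTS, KD_GRID)
print(f"fitted KD ~ {kd_fit:.1f} uM, delta_max ~ {delta_max_fit:.3f} ppm")
```

    The population-averaged model above only applies in fast exchange; the R2 dispersion and ZZ-exchange analyses discussed in the abstract require kinetic models that this sketch does not cover.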

  12. Regulation of PCNA-protein interactions for genome stability

    DEFF Research Database (Denmark)

    Mailand, Niels; Gibbs-Seymour, Ian; Bekker-Jensen, Simon

    2013-01-01

    Proliferating cell nuclear antigen (PCNA) has a central role in promoting faithful DNA replication, providing a molecular platform that facilitates the myriad protein-protein and protein-DNA interactions that occur at the replication fork. Numerous PCNA-associated proteins compete for binding...

  13. Effective comparative analysis of protein-protein interaction networks by measuring the steady-state network flow using a Markov model.

    Science.gov (United States)

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2016-10-06

    Comparative analysis of protein-protein interaction (PPI) networks provides an effective means of detecting conserved functional network modules across different species. Such modules typically consist of orthologous proteins with conserved interactions, which can be exploited to computationally predict the modules through network comparison. In this work, we propose a novel probabilistic framework for comparing PPI networks and effectively predicting the correspondence between proteins, represented as network nodes, that belong to conserved functional modules across the given PPI networks. The basic idea is to estimate the steady-state network flow between nodes that belong to different PPI networks based on a Markov random walk model. The random walker is designed to make random moves to adjacent nodes within a PPI network as well as cross-network moves between potential orthologous nodes with high sequence similarity. Based on this Markov random walk model, we estimate the steady-state network flow - or the long-term relative frequency of the transitions that the random walker makes - between nodes in different PPI networks, which can be used as a probabilistic score measuring their potential correspondence. Subsequently, the estimated scores can be used for detecting orthologous proteins in conserved functional modules through network alignment. Through evaluations based on multiple real PPI networks, we demonstrate that the proposed scheme leads to improved alignment results that are biologically more meaningful at reduced computational cost, outperforming the current state-of-the-art algorithms. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/CUFID .
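
    The idea of scoring cross-network node pairs by steady-state flow can be sketched on a toy example. The code below is a simplified model and not the authors' algorithm: the mixing parameter `alpha`, the toy networks, the similarity weights and all function names are assumptions made for illustration. At each step the walker takes an intra-network move with probability `alpha` and a similarity-weighted cross-network move otherwise; the stationary distribution is found by power iteration on a lazy version of the chain, which has the same stationary distribution and avoids periodicity.

```python
def transition_matrix(nodes, intra_edges, cross_sim, alpha=0.5):
    """Row-stochastic matrix mixing intra-network and cross-network moves."""
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    intra = [[0.0] * n for _ in range(n)]
    cross = [[0.0] * n for _ in range(n)]
    for u, v in intra_edges:
        intra[idx[u]][idx[v]] = intra[idx[v]][idx[u]] = 1.0
    for (u, v), sim in cross_sim.items():
        cross[idx[u]][idx[v]] = cross[idx[v]][idx[u]] = sim
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s_in, s_cr = sum(intra[i]), sum(cross[i])
        # Nodes lacking one move type spend all probability on the other
        # (isolated nodes are assumed not to occur).
        a = alpha if (s_in > 0 and s_cr > 0) else (1.0 if s_in > 0 else 0.0)
        for j in range(n):
            if s_in > 0:
                P[i][j] += a * intra[i][j] / s_in
            if s_cr > 0:
                P[i][j] += (1.0 - a) * cross[i][j] / s_cr
    return P


def stationary(P, iters=5000):
    """Power iteration on the lazy chain (I + P) / 2."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        pi = [0.5 * (a + b) for a, b in zip(pi, step)]
    return pi


def cross_flow(nodes, P, pi, cross_sim):
    """Long-run frequency of transitions between each cross-network pair."""
    idx = {name: i for i, name in enumerate(nodes)}
    return {
        (u, v): pi[idx[u]] * P[idx[u]][idx[v]] + pi[idx[v]] * P[idx[v]][idx[u]]
        for (u, v) in cross_sim
    }


# Toy data (hypothetical): two 3-node path networks and sequence similarities.
NODES = ["a1", "a2", "a3", "b1", "b2", "b3"]
INTRA = [("a1", "a2"), ("a2", "a3"), ("b1", "b2"), ("b2", "b3")]
CROSS = {("a1", "b1"): 0.9, ("a2", "b2"): 0.8, ("a3", "b3"): 0.7, ("a1", "b3"): 0.2}

P = transition_matrix(NODES, INTRA, CROSS)
pi = stationary(P)
flows = cross_flow(NODES, P, pi, CROSS)
for pair, score in sorted(flows.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 4))
```

    In this sketch the flow score depends on both the similarity weight and the topology around each node, so a pair with high sequence similarity but poorly matched neighborhoods can rank below a better-embedded pair.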

  14. The dynamic multisite interactions between two intrinsically disordered proteins

    KAUST Repository

    Wu, Shaowen

    2017-05-11

    Protein interactions involving intrinsically disordered proteins (IDPs) comprise a variety of binding modes, from the well-characterized folding upon binding to dynamic fuzzy complexes. To date, most studies concern the binding of an IDP to a structured protein, while the interaction between two IDPs is poorly understood. In this study, we combined NMR, smFRET, and molecular dynamics (MD) simulation to characterize the interaction between two IDPs, the C-terminal domain (CTD) of protein 4.1G and the nuclear mitotic apparatus (NuMA) protein. It is revealed that CTD and NuMA form a fuzzy complex with remaining structural disorder. Multiple binding sites on both proteins were identified by MD and mutagenesis studies. Our study provides an atomic scenario in which two IDPs bearing multiple binding sites interact with each other in dynamic equilibrium. The combined approach employed here could be widely applicable for investigating IDPs and their dynamic interactions.

  15. Predicting binding within disordered protein regions to structurally characterised peptide-binding domains.

    Directory of Open Access Journals (Sweden)

    Waqasuddin Khan

    Full Text Available Disordered regions of proteins often bind to structured domains, mediating interactions within and between proteins. However, it is difficult to identify a priori the short disordered regions involved in binding. We set out to determine if docking such peptide regions to peptide binding domains would assist in these predictions. We assembled a redundancy-reduced dataset of SLiM (Short Linear Motif) containing proteins from the ELM database. We selected 84 sequences which had an associated PDB structure showing the SLiM bound to a protein receptor, where the SLiM was found within a 50-residue region of the protein sequence which was predicted to be disordered. First, we investigated the Vina docking scores of overlapping tripeptides from the 50-residue SLiM-containing disordered regions of the protein sequence to the corresponding PDB domain. We found only weak discrimination of docking scores between peptides involved in binding and adjacent non-binding peptides in this context (AUC 0.58). Next, we trained a bidirectional recurrent neural network (BRNN) using as input the protein sequence, predicted secondary structure, Vina docking score and predicted disorder score. The results were very promising (AUC 0.72), showing that multiple sources of information can be combined to produce results which are clearly superior to any single source. We conclude that the Vina docking score alone has only modest power to define the location of a peptide within a larger protein region known to contain it. However, combining this information with other knowledge (using machine learning methods) clearly improves the identification of peptide binding regions within a protein sequence. This approach combining docking with machine learning is primarily a predictor of binding to peptide-binding sites, and is not intended as a predictor of specificity of binding to particular receptors.
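
    The AUC figures quoted above (0.58 and 0.72) measure how well a score separates binding from non-binding peptides. The sketch below computes AUC through its rank interpretation: the probability that a randomly chosen positive outscores a randomly chosen negative, with ties counted as one half. The scores and labels are made up for illustration, and the function name is hypothetical. Because Vina reports more negative scores for stronger predicted binding, the scores are negated before ranking.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical docking scores in kcal/mol for overlapping tripeptides;
# label 1 marks tripeptides inside the known binding motif.
vina_scores = [-6.1, -5.8, -5.9, -5.2, -4.9, -5.5, -4.7, -5.0]
in_motif = [1, 1, 0, 0, 0, 1, 0, 0]

# Negate so that a higher value means stronger predicted binding.
auc = roc_auc([-s for s in vina_scores], in_motif)
print(f"AUC = {auc:.3f}")
```

    An AUC of 0.5 corresponds to random ranking, which is why the abstract describes 0.58 for docking scores alone as weak discrimination.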

  16. Combining random gene fission and rational gene fusion to discover near-infrared fluorescent protein fragments that report on protein-protein interactions.

    Science.gov (United States)

    Pandey, Naresh; Nobles, Christopher L; Zechiedrich, Lynn; Maresso, Anthony W; Silberg, Jonathan J

    2015-05-15

    Gene fission can convert monomeric proteins into two-piece catalysts, reporters, and transcription factors for systems and synthetic biology. However, some proteins can be challenging to fragment without disrupting function, such as near-infrared fluorescent protein (IFP). We describe a directed evolution strategy that can overcome this challenge by randomly fragmenting proteins and concomitantly fusing the protein fragments to pairs of proteins or peptides that associate. We used this method to create libraries that express fragmented IFP as fusions to a pair of associating peptides (IAAL-E3 and IAAL-K3) and proteins (CheA and CheY) and screened for fragmented IFP with detectable near-infrared fluorescence. Thirteen novel fragmented IFPs were identified, all of which arose from backbone fission proximal to the interdomain linker. Either the IAAL-E3 and IAAL-K3 peptides or CheA and CheY proteins could assist with IFP fragment complementation, although the IAAL-E3 and IAAL-K3 peptides consistently yielded higher fluorescence. These results demonstrate how random gene fission can be coupled to rational gene fusion to create libraries enriched in fragmented proteins with AND gate logic that is dependent upon a protein-protein interaction, and they suggest that these near-infrared fluorescent protein fragments will be suitable as reporters for pairs of promoters and protein-protein interactions within whole animals.

  17. Identification of proteins that may directly interact with human RPA.

    Science.gov (United States)

    Nakaya, Ryou; Takaya, Junichiro; Onuki, Takeshi; Moritani, Mariko; Nozaki, Naohito; Ishimi, Yukio

    2010-11-01

    RPA, which consists of three subunits (RPA1, 2 and 3), plays essential roles in DNA transactions. At DNA replication forks, RPA binds to single-stranded DNA regions to stabilize the structure and to assemble other replication proteins. Interactions between RPA and several replication proteins have been reported, but the analysis is not comprehensive. We systematically performed a qualitative analysis to identify RPA interaction partners in order to understand the protein-protein interactions at the replication forks. We expressed in insect cells the three subunits of human RPA, together with one replication protein that is present at the forks under normal conditions and/or under replication stress conditions, to examine the interaction. Among 30 proteins examined in total, at least 14 proteins were found to interact with RPA. RPA interacted with MCM3-7, MCM-BP and CDC45 among the proteins that play roles in the initiation and elongation of DNA replication. RPA bound to TIPIN, CLASPIN and RAD17, which are involved in the DNA replication checkpoint functions. RPA also bound to cyclin-dependent kinases and an amino-terminal fragment of Rb protein that negatively regulates DNA replication. These results suggest that RPA interacts with specific proteins among those that play roles in the regulation of replication fork progression.

  18. Multiplex single-molecule interaction profiling of DNA-barcoded proteins.

    Science.gov (United States)

    Gu, Liangcai; Li, Chao; Aach, John; Hill, David E; Vidal, Marc; Church, George M

    2014-11-27

    In contrast with advances in massively parallel DNA sequencing, high-throughput protein analyses are often limited by ensemble measurements, individual analyte purification and hence compromised quality and cost-effectiveness. Single-molecule protein detection using optical methods is limited by the number of spectrally non-overlapping chromophores. Here we introduce a single-molecular-interaction sequencing (SMI-seq) technology for parallel protein interaction profiling leveraging single-molecule advantages. DNA barcodes are attached to proteins collectively via ribosome display or individually via enzymatic conjugation. Barcoded proteins are assayed en masse in aqueous solution and subsequently immobilized in a polyacrylamide thin film to construct a random single-molecule array, where barcoding DNAs are amplified into in situ polymerase colonies (polonies) and analysed by DNA sequencing. This method allows precise quantification of various proteins with a theoretical maximum array density of over one million polonies per square millimetre. Furthermore, protein interactions can be measured on the basis of the statistics of colocalized polonies arising from barcoding DNAs of interacting proteins. Two demanding applications, G-protein coupled receptor and antibody-binding profiling, are demonstrated. SMI-seq enables 'library versus library' screening in a one-pot assay, simultaneously interrogating molecular binding affinity and specificity.

  19. Surface dynamics in allosteric regulation of protein-protein interactions: modulation of calmodulin functions by Ca2+.

    Directory of Open Access Journals (Sweden)

    Yosef Y Kuttner

    2013-04-01

    Full Text Available Knowledge of the structural basis of protein-protein interactions (PPI) is of fundamental importance for understanding the organization and functioning of biological networks and advancing the design of therapeutics which target PPI. Allosteric modulators play an important role in regulating such interactions by binding at site(s) orthogonal to the complex interface and altering the protein's propensity for complex formation. In this work, we apply an approach recently developed by us for analyzing protein surfaces based on steered molecular dynamics simulation (SMD) to the study of the dynamic properties of functionally distinct conformations of a model protein, calmodulin (CaM), whose ability to interact with target proteins is regulated by the presence of the allosteric modulator Ca2+. Calmodulin is a regulatory protein that acts as an intracellular Ca2+ sensor to control a wide variety of cellular processes. We demonstrate that SMD analysis is capable of pinpointing CaM surfaces implicated in the recognition of both the allosteric modulator Ca2+ and target proteins. Our analysis of changes in the dynamic properties of the CaM backbone elicited by Ca2+ binding yielded new insights into the molecular mechanism of allosteric regulation of CaM-target interactions.

  20. Using Förster-Resonance Energy Transfer to Measure Protein Interactions Between Bcl-2 Family Proteins on Mitochondrial Membranes.

    Science.gov (United States)

    Pogmore, Justin P; Pemberton, James M; Chi, Xiaoke; Andrews, David W

    2016-01-01

    The Bcl-2 family of proteins regulates the process of mitochondrial outer membrane permeabilization, causing the release of cytochrome c and committing a cell to apoptosis. The majority of the functional interactions between these proteins occur at, on, or within the mitochondrial outer membrane, complicating structural studies of the proteins and complexes. As a result most in vitro studies of these protein-protein interactions use truncated proteins and/or detergents which can cause artificial interactions. Herein, we describe a detergent-free, fluorescence-based, in vitro technique to study binding between full-length recombinant Bcl-2 family proteins, particularly cleaved BID (cBID) and BCL-XL, on the membranes of purified mitochondria.