WorldWideScience

Sample records for predicting protein subcellular

  1. Protein subcellular localization prediction using artificial intelligence technology.

    Science.gov (United States)

    Nair, Rajesh; Rost, Burkhard

    2008-01-01

    Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. The accuracy of these methods is now on par with

  2. Predicting the subcellular localization of viral proteins within a mammalian host cell

    Directory of Open Access Journals (Sweden)

    Thomas DY

    2006-04-01

    Full Text Available. Background: The bioinformatic prediction of protein subcellular localization has been extensively studied for prokaryotic and eukaryotic organisms. However, this is not the case for viruses, whose proteins are often involved in extensive interactions at various subcellular localizations with host proteins. Results: Here, we investigate the extent of utilization of human cellular localization mechanisms by viral proteins and we demonstrate that appropriate eukaryotic subcellular localization predictors can be used to predict viral protein localization within the host cell. Conclusion: Such predictions provide a method to rapidly annotate viral proteomes with subcellular localization information. They are likely to have widespread applications both in the study of the functions of viral proteins in the host cell and in the design of antiviral drugs.

  3. Multi-Label Learning via Random Label Selection for Protein Subcellular Multi-Locations Prediction.

    Science.gov (United States)

    Wang, Xiao; Li, Guo-Zheng

    2013-03-12

    Prediction of protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular localization methods are designed only for single-location proteins. In the past few years, only a few methods have been proposed to tackle proteins with multiple locations. However, they adopt only a simple strategy, that is, transforming each multi-location protein into multiple proteins with a single location each, which does not take correlations among different subcellular locations into account. In this paper, a novel method named RALS (multi-label learning via RAndom Label Selection) is proposed to learn from multi-location proteins in an effective and efficient way. Through a five-fold cross-validation test on a benchmark dataset, we demonstrate that our proposed method, which considers label correlations, clearly outperforms the baseline BR method, which does not, indicating that correlations among different subcellular locations really exist and contribute to improved prediction performance. Experimental results on two benchmark datasets also show that our proposed methods achieve significantly higher performance than some other state-of-the-art methods in predicting subcellular multi-locations of proteins. The prediction web server is available for public use at http://levis.tongji.edu.cn:8080/bioinfo/MLPred-Euk/.
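
    The "simple strategy" described in this abstract can be illustrated with a minimal sketch. It contrasts copying a multi-location protein once per location with keeping a single multi-label row per protein. The protein IDs, labels, and function names are hypothetical, and this is not the RALS algorithm itself.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: protein ID -> set of location labels.
annotations: Dict[str, Set[str]] = {
    "P1": {"nucleus"},
    "P2": {"cytoplasm", "nucleus"},
    "P3": {"mitochondrion"},
}


def to_single_label(data: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Copy each protein once per location (the simple transformation)."""
    return [(pid, loc) for pid in sorted(data) for loc in sorted(data[pid])]


def to_indicator_matrix(
    data: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Keep one row per protein with a 0/1 entry per location (multi-label view)."""
    labels = sorted(set().union(*data.values()))
    ids = sorted(data)
    rows = [[1 if lab in data[pid] else 0 for lab in labels] for pid in ids]
    return ids, labels, rows


pairs = to_single_label(annotations)
ids, labels, rows = to_indicator_matrix(annotations)
print(pairs)
print(labels, rows)
```

    In the first form, the two copies of P2 are unrelated training examples, so a learner cannot see that cytoplasm and nucleus co-occur. The indicator matrix keeps that co-occurrence in one row, which is the information a correlation-aware method can exploit.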

  4. DeepLoc: prediction of protein subcellular localization using deep learning

    DEFF Research Database (Denmark)

    Almagro Armenteros, Jose Juan; Sønderby, Casper Kaae; Sønderby, Søren Kaae

    2017-01-01

    The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from...... knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. Here, we present a prediction algorithm using deep neural networks to predict...... current state-of-the-art algorithms, including those relying on homology information. The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc . Example code is available at https://github.com/JJAlmagro/subcellular_localization . The dataset is available at http...

  5. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2006-06-01

    Full Text Available. Background: The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is a need for further research to improve the accuracy of prediction. Results: A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, among them a Gram-negative bacteria dataset, data for discriminating outer membrane proteins, and an apoptosis protein dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with an imbalanced distribution of classes. Conclusion: This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and, as empirical results indicate, competitive with previously reported approaches in terms of prediction accuracy. The code for the software is available upon request.

  6. Detrended cross-correlation coefficient: Application to predict apoptosis protein subcellular localization.

    Science.gov (United States)

    Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

    2016-12-01

    Apoptosis, or programmed cell death, plays a central role in the development and homeostasis of an organism. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. The prediction of subcellular localization of an apoptosis protein is still a challenging task, and existing methods are mainly based on protein primary sequences. In this paper, we introduce a new position-specific scoring matrix (PSSM)-based method that uses the detrended cross-correlation (DCCA) coefficient of non-overlapping windows. A 190-dimensional (190D) feature vector is then constructed on two widely used datasets, CL317 and ZD98, and a support vector machine is adopted as the classifier. To evaluate the proposed method, objective and rigorous jackknife cross-validation tests are performed on the two datasets. The results show that our approach offers a novel and reliable PSSM-based tool for prediction of apoptosis protein subcellular localization. Copyright © 2016 Elsevier Inc. All rights reserved.
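
    The jackknife (leave-one-out) test mentioned in this and several other records can be sketched as follows. This is a minimal illustration under stated assumptions: plain amino acid composition stands in for the PSSM-based DCCA features, a nearest-neighbor rule stands in for the support vector machine, and the toy sequences and labels are hypothetical rather than drawn from CL317 or ZD98.

```python
from typing import Callable, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Sample = Tuple[List[float], str]


def composition(seq: str) -> List[float]:
    """20-dimensional amino acid composition (fraction of each residue)."""
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def nearest_neighbor(train: Sequence[Sample], query: List[float]) -> str:
    """Stand-in classifier: label of the closest training vector."""
    def dist(u: List[float], v: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda item: dist(item[0], query))[1]


def jackknife_accuracy(
    samples: List[Sample],
    classify: Callable[[Sequence[Sample], List[float]], str] = nearest_neighbor,
) -> float:
    """Leave each sample out once, predict it from the rest, return the hit rate."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += int(classify(rest, x) == y)
    return hits / len(samples)


# Hypothetical toy sequences and labels for illustration only.
toy = [("AAAAKK", "cyto"), ("AAAKKK", "cyto"), ("LLLLWW", "memb"), ("LLLWWW", "memb")]
samples = [(composition(s), lab) for s, lab in toy]
accuracy = jackknife_accuracy(samples)
print(accuracy)
```

    Because every sample is predicted by a model that never saw it, the jackknife gives a single deterministic result for a given dataset, which is why these abstracts describe it as objective.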

  7. A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

    Science.gov (United States)

    Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen

    2009-07-21

    Apoptosis, or programmed cell death, plays an important role in the development of an organism. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. In this paper, based on the concept that the position distribution information of amino acids is closely related to the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. To calculate the local features, each protein sequence is separated into p parts of equal length. We then use the novel representation of protein sequences and adopt a support vector machine to predict subcellular location. The overall prediction accuracy, as measured by the jackknife test, is significantly improved.

  8. Prediction of protein subcellular localization using support vector machine with the choice of proper kernel

    Directory of Open Access Journals (Sweden)

    Al Mehedi Hasan

    2017-07-01

    Full Text Available. The prediction of subcellular locations of proteins can provide useful hints for revealing their functions as well as for understanding the mechanisms of some diseases and, finally, for developing novel drugs. As the number of newly discovered proteins has been growing exponentially, laboratory-based experiments to determine the location of an uncharacterized protein in a living cell have become both expensive and time-consuming. Consequently, to tackle these challenges, computational methods are being developed as an alternative to help biologists in selecting target proteins and designing related experiments. However, protein subcellular localization prediction is still a complicated and challenging problem, particularly when query proteins have multi-label characteristics, i.e. when they exist simultaneously in more than one subcellular location or move between two or more different subcellular locations. To address this problem, several types of subcellular localization prediction methods with different levels of accuracy have been proposed. The support vector machine (SVM) has been employed to provide potential solutions for problems connected with the prediction of protein subcellular localization. However, the practicability of SVM is affected by difficulties in selecting an appropriate kernel as well as in selecting the parameters of the selected kernel. The literature survey has shown that most researchers apply the radial basis function (RBF) kernel to build an SVM-based subcellular localization prediction system. Surprisingly, there are still many other kernel functions which have not yet been applied in the prediction of protein subcellular localization. However, the nature of this classification problem requires the application of different kernels for SVM to ensure an optimal result. From this viewpoint, this paper presents the work to apply different kernels for SVM in protein

  9. MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction

    Directory of Open Access Journals (Sweden)

    Kohlbacher Oliver

    2009-09-01

    Full Text Available. Background: Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. Results: We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. Conclusion: MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a user-friendly and free web service at http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.

  10. Multi-label learning with fuzzy hypergraph regularization for protein subcellular location prediction.

    Science.gov (United States)

    Chen, Jing; Tang, Yuan Yan; Chen, C L Philip; Fang, Bin; Lin, Yuewei; Shang, Zhaowei

    2014-12-01

    Protein subcellular location prediction aims to predict the location where a protein resides within a cell using computational methods. Considering the main limitations of the existing methods, we propose a hierarchical multi-label learning model FHML for both single-location proteins and multi-location proteins. The latent concepts are extracted through feature space decomposition and label space decomposition under the nonnegative data factorization framework. The extracted latent concepts are used as the codebook to indirectly connect the protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded in not only feature space, but also label space. Finally, the subcellular location annotation information is propagated from the labeled proteins to the unlabeled proteins by performing dual fuzzy hypergraph Laplacian regularization. The experimental results on the six protein benchmark datasets demonstrate the superiority of our proposed method by comparing it with the state-of-the-art methods, and illustrate the benefit of exploiting both feature correlations and label correlations.

  11. ngLOC: software and web server for predicting protein subcellular localization in prokaryotes and eukaryotes

    Directory of Open Access Journals (Sweden)

    King Brian R

    2012-07-01

    Full Text Available. Background: Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of organelles in the cell. Additionally, the majority of methods predict only a single location for a sequence, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. Findings: We present a software package and a web server for predicting the subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins both in prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on gram-positive and gram-negative bacterial datasets, respectively. Conclusions: ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.
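
    The idea of an n-gram-based Bayesian classifier can be sketched as a naive Bayes model over overlapping sequence n-grams. This is a generic illustration, not the ngLOC implementation: the class name `NGramBayes`, the add-alpha smoothing, the choice n = 2, and the toy sequences are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def ngrams(seq: str, n: int) -> List[str]:
    """All overlapping substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramBayes:
    """Naive Bayes over sequence n-grams with add-alpha smoothing (a sketch)."""

    def __init__(self, n: int = 2, alpha: float = 1.0) -> None:
        self.n = n
        self.alpha = alpha
        self.counts: Dict[str, Counter] = defaultdict(Counter)
        self.class_totals: Counter = Counter()
        self.vocab: set = set()

    def fit(self, sequences: Iterable[str], labels: Iterable[str]) -> "NGramBayes":
        for seq, lab in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[lab].update(grams)
            self.class_totals[lab] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq: str) -> Dict[str, float]:
        """Log prior plus summed log likelihood of each n-gram, per class."""
        total = sum(self.class_totals.values())
        scores = {}
        for lab, cnt in self.counts.items():
            denom = sum(cnt.values()) + self.alpha * len(self.vocab)
            score = math.log(self.class_totals[lab] / total)
            for g in ngrams(seq, self.n):
                score += math.log((cnt[g] + self.alpha) / denom)
            scores[lab] = score
        return scores

    def predict(self, seq: str) -> str:
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Hypothetical toy training data for illustration only.
model = NGramBayes(n=2).fit(
    ["KKKKRK", "KRKKKK", "LLLVLL", "LVLLLL"],
    ["nucleus", "nucleus", "membrane", "membrane"],
)
print(model.predict("KKRK"), model.predict("LLVL"))
```

    Since `log_scores` returns a score for every class, the same model could report the top two locations for a sequence instead of one, which is one simple way to approach proteins that shuttle between compartments.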

  12. Prediction of essential proteins based on subcellular localization and gene expression correlation.

    Science.gov (United States)

    Fan, Yetian; Tang, Xiwei; Hu, Xiaohua; Wu, Wei; Ping, Qing

    2017-12-01

    Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test out essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction. The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction. In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartments information, with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.

  13. Accurate prediction of subcellular location of apoptosis proteins combining Chou’s PseAAC and PsePSSM based on wavelet denoising

    Science.gov (United States)

    Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Wang, Ming-Hui; Zhang, Yan

    2017-01-01

    Information on the subcellular localization of apoptosis proteins is very important for understanding the mechanism of programmed cell death and for the development of drugs. The prediction of subcellular localization of an apoptosis protein remains a challenging task, and it can help to understand the function of these proteins and their role in metabolic processes. In this paper, we propose a novel method for protein subcellular localization prediction. First, the features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC) and pseudo-position specific scoring matrix (PsePSSM); the extracted feature information is then denoised by two-dimensional (2-D) wavelet denoising. Finally, the optimal feature vectors are input to the SVM classifier to predict the subcellular location of apoptosis proteins. Quite promising predictions are obtained using the jackknife test on three widely used datasets and compared with other state-of-the-art methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of apoptosis protein subcellular localization, which will be a supplementary tool for future proteomics research. PMID:29296195

  14. HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source.

    Science.gov (United States)

    Wan, Shixiang; Duan, Yucong; Zou, Quan

    2017-09-01

    Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ a series of machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, multiple locations of particular proteins imply that there are vital and unique biological significances that deserve special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary, but never employed. For solving these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied for multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins

    Directory of Open Access Journals (Sweden)

    Raghava Gajendra PS

    2008-11-01

    Full Text Available. Background: The expansion of raw protein sequence databases in the post-genomic era and the availability of freshly annotated sequences for major localizations motivated us to introduce a new, improved version of our previous eukaryotic subcellular localization prediction method, "ESLpred". Since the subcellular localization of a protein offers essential clues about its function, the availability of a localization predictor would aid and expedite protein characterization studies. However, the robustness of a predictor is highly dependent on the quality of the dataset and of the extracted protein attributes; hence, it becomes imperative to improve the performance of the presently available method using the latest dataset and crucial input features. Results: Here, we describe the improvement in prediction performance obtained for our most popular ESLpred method using new crucial features as input to a Support Vector Machine (SVM). In addition, a recently available, highly non-redundant dataset encompassing three kingdom-specific protein sequence sets (1198 fungal, 2597 animal and 491 plant sequences) was also included in the present study. First, using evolutionary information in the form of profile composition along with whole and N-terminal sequence composition as an input feature vector of 440 dimensions, overall accuracies of 72.7, 75.8 and 74.5% were achieved, respectively, after five-fold cross-validation. Further enhancement in performance was observed when similarity-search-based results were coupled with whole and N-terminal sequence composition along with profile composition, yielding overall accuracies of 75.9, 80.8 and 76.6%, respectively; these are the best accuracies reported to date on the same datasets. Conclusion: These results provide confidence about the reliability and accurate prediction of SVM modules generated in the present study using sequence and profile compositions along with similarity search

  16. Plant-mPLoc: a top-down strategy to augment the power for predicting plant protein subcellular localization.

    Directory of Open Access Journals (Sweden)

    Kuo-Chen Chou

    Full Text Available. One of the fundamental goals in proteomics and cell biology is to identify the functions of proteins in various cellular organelles and pathways. Information on the subcellular locations of proteins can provide useful insights for revealing their functions and understanding how they interact with each other in cellular network systems. Most of the existing methods for predicting plant protein subcellular localization can only cover three or four location sites, and none of them can be used to deal with multiplex plant proteins that can simultaneously exist at, or move between, two or more different location sites. Actually, such multiplex proteins might have special biological functions worthy of particular notice. The present study was devoted to improving the existing plant protein subcellular location predictors in the aforementioned two aspects. A new predictor called "Plant-mPLoc" is developed by integrating the gene ontology information, functional domain information, and sequential evolutionary information through three different modes of pseudo amino acid composition. It can be used to identify plant proteins among the following 12 location sites: (1) cell membrane, (2) cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6) extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome, (11) plastid, and (12) vacuole. Compared with the existing methods for predicting plant protein subcellular localization, the new predictor is much more powerful and flexible. In particular, it also has the capacity to deal with multiple-location proteins, which is beyond the reach of any existing predictors specialized for identifying plant protein subcellular localization. As a user-friendly web-server, Plant-mPLoc is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to

  17. Prediction of protein subcellular locations by GO-FunD-PseAA predictor.

    Science.gov (United States)

    Chou, Kuo-Chen; Cai, Yu-Dong

    2004-08-06

    The localization of a protein in a cell is closely correlated with its biological function. With the explosion of protein sequences entering into DataBanks, it is highly desired to develop an automated method that can fast identify their subcellular location. This will expedite the annotation process, providing timely useful information for both basic research and industrial application. In view of this, a powerful predictor has been developed by hybridizing the gene ontology approach [Nat. Genet. 25 (2000) 25], functional domain composition approach [J. Biol. Chem. 277 (2002) 45765], and the pseudo-amino acid composition approach [Proteins Struct. Funct. Genet. 43 (2001) 246; Erratum: ibid. 44 (2001) 60]. As a showcase, the recently constructed dataset [Bioinformatics 19 (2003) 1656] was used for demonstration. The dataset contains 7589 proteins classified into 12 subcellular locations: chloroplast, cytoplasmic, cytoskeleton, endoplasmic reticulum, extracellular, Golgi apparatus, lysosomal, mitochondrial, nuclear, peroxisomal, plasma membrane, and vacuolar. The overall success rate of prediction obtained by the jackknife cross-validation was 92%. This is so far the highest success rate performed on this dataset by following an objective and rigorous cross-validation procedure.

  18. Evaluation and comparison of mammalian subcellular localization prediction methods

    Directory of Open Access Journals (Sweden)

    Fink J Lynn

    2006-12-01

    Full Text Available. Background: Determination of the subcellular location of a protein is essential to understanding its biochemical function. This information can provide insight into the function of hypothetical or novel proteins. These data are difficult to obtain experimentally but have become especially important since many whole genome sequencing projects have been finished and many resulting protein sequences are still lacking detailed functional information. In order to address this paucity of data, many computational prediction methods have been developed. However, these methods have varying levels of accuracy and perform differently based on the sequences that are presented to the underlying algorithm. It is therefore useful to compare these methods and monitor their performance. Results: In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations (nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome, and lysosome). The selected methods were CELLO, MultiLoc, Proteome Analyst, pTarget and WoLF PSORT. These methods were evaluated using 3763 mouse proteins from SwissProt that represent the source of the training sets used in development of the individual methods. In addition, an independent evaluation set of 2145 mouse proteins from LOCATE with a bias towards the subcellular localization underrepresented in SwissProt was used. The sensitivity and specificity were calculated for each method and compared to a theoretical value based on what might be observed by random chance. Conclusion: No individual method had a sufficient level of sensitivity across both evaluation sets that would enable reliable application to hypothetical proteins. All methods showed lower performance on the LOCATE
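
    Per-location sensitivity and specificity of a predictor can be computed as in the sketch below. It assumes the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the abstract does not state the exact formulas used in the study, and the labels shown are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    location: str,
) -> Tuple[float, float]:
    """One-vs-rest sensitivity and specificity for a single location."""
    tp = fn = fp = tn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == location and p == location:
            tp += 1
        elif t == location:
            fn += 1
        elif p == location:
            fp += 1
        else:
            tn += 1
    # Return NaN when a rate is undefined (no positives or no negatives).
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical annotations and predictions for five proteins.
truth = ["nucleus", "nucleus", "cytosol", "cytosol", "mitochondrion"]
preds = ["nucleus", "cytosol", "cytosol", "nucleus", "mitochondrion"]
sens, spec = sensitivity_specificity(truth, preds, "nucleus")
print(sens, spec)
```

    Computing the pair separately for each location exposes predictors that score well overall only because they favor the most common compartments.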

  19. ClubSub-P: Cluster-based subcellular localization prediction for Gram-negative bacteria and Archaea.

    Directory of Open Access Journals (Sweden)

    Nagarajan Paramasivam

    2011-11-01

    Full Text Available The subcellular localization of proteins provides important clues to their function in a cell. In our efforts to predict useful vaccine targets against Gram-negative bacteria, we noticed that misannotated start codons frequently lead to wrongly assigned subcellular localizations. This and other problems in subcellular localization prediction, such as the relatively high false positive and false negative rates of some tools, can be avoided by applying multiple prediction tools to groups of homologous proteins. Here we present ClubSub-P, an online database that combines existing subcellular localization prediction tools into a consensus pipeline from more than 600 proteomes of fully sequenced microorganisms. On top of the consensus prediction at the level of single sequences, the tool uses clusters of homologous proteins from Gram-negative bacteria and from Archaea to eliminate false positive and false negative predictions. ClubSub-P can assign the subcellular localization of proteins from Gram-negative bacteria and Archaea with high precision. The database is searchable, and can easily be expanded using either new bacterial genomes or new prediction tools as they become available. This will further improve the performance of the subcellular localization prediction, as well as the detection of misannotated start codons and other annotation errors. ClubSub-P is available online at http://toolkit.tuebingen.mpg.de/clubsubp/

  20. HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2014-01-01

    Protein subcellular localization prediction, as an essential step to elucidate the functions in vivo of proteins and identify drugs targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms are retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/.
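
    The construction of frequency, semantic-similarity and fusion vectors can be illustrated with a small sketch. It rests on several assumptions not stated in the abstract: the similarity between two terms is taken as the Jaccard overlap of their ancestor sets, each similarity entry is the best match against the protein's terms, and fusion is plain concatenation. The term names T1 to T3 are placeholders rather than real GO identifiers.

```python
def frequency_vector(go_terms, vocabulary):
    """Count how often each vocabulary term occurs among the retrieved terms."""
    return [go_terms.count(term) for term in vocabulary]

def jaccard(a, b):
    """Jaccard overlap of two sets (a stand-in for a GO semantic similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_vector(go_terms, vocabulary, ancestors):
    """For each vocabulary term, take its best similarity to any retrieved term."""
    present = set(go_terms)
    return [
        max((jaccard(ancestors[term], ancestors[g]) for g in present), default=0.0)
        for term in vocabulary
    ]

def fusion_vector(go_terms, vocabulary, ancestors):
    """Concatenate the frequency and similarity vectors (assumed fusion rule)."""
    return (frequency_vector(go_terms, vocabulary)
            + similarity_vector(go_terms, vocabulary, ancestors))

# Hypothetical toy ontology: each term maps to itself plus its ancestors.
ancestors = {
    "T1": {"T1", "root"},
    "T2": {"T2", "T1", "root"},
    "T3": {"T3", "root"},
}
vocabulary = ["T1", "T2", "T3"]
retrieved = ["T2", "T2"]  # terms collected from two homologs of the query
print(fusion_vector(retrieved, vocabulary, ancestors))
```

    The frequency half of the vector is zero for T1, while the similarity half still gives T1 a high value because it is a parent of T2; this is the kind of inter-term relationship that occurrence counts alone would miss.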

  1. Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2015-01-01

    was used. Various statistical and machine learning algorithms are used with all three approaches, and various measures and standards are employed when reporting the performances of the developed methods. This chapter presents a number of available methods for prediction of sorting signals and subcellular...

  2. Imbalanced multi-modal multi-label learning for subcellular localization prediction of human proteins with both single and multiple sites.

    Directory of Open Access Journals (Sweden)

    Jianjun He

    Full Text Available It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them focus on proteins with only one location. In recent years, researchers have begun to pay attention to subcellular localization prediction for proteins with multiple sites. However, almost all existing approaches fail to take into account the correlations among locations that arise from proteins with multiple sites, which may be important information for improving prediction accuracy for such proteins. In this paper, a new algorithm that can effectively exploit the correlations among the locations is proposed, using a Gaussian process model. In addition, the algorithm can realize an optimal linear combination of various feature extraction technologies and can be robust to imbalanced data sets. Experimental results on a human protein data set show that the proposed algorithm is valid and can achieve better performance than the existing approaches.

  3. mPLR-Loc: an adaptive decision multi-label classifier based on penalized logistic regression for protein subcellular localization prediction.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2015-03-15

    Proteins located in appropriate cellular compartments are of paramount importance to exert their biological functions. Prediction of protein subcellular localization by computational methods is required in the post-genomic era. Recent studies have been focusing on predicting not only single-location proteins but also multi-location proteins. However, most of the existing predictors are far from effective for tackling the challenges of multi-label proteins. This article proposes an efficient multi-label predictor, namely mPLR-Loc, based on penalized logistic regression and adaptive decisions for predicting both single- and multi-location proteins. Specifically, for each query protein, mPLR-Loc exploits the information from the Gene Ontology (GO) database by using its accession number (AC) or the ACs of its homologs obtained via BLAST. The frequencies of GO occurrences are used to construct feature vectors, which are then classified by an adaptive decision-based multi-label penalized logistic regression classifier. Experimental results based on two recent stringent benchmark datasets (virus and plant) show that mPLR-Loc remarkably outperforms existing state-of-the-art multi-label predictors. In addition to being able to rapidly and accurately predict subcellular localization of single- and multi-label proteins, mPLR-Loc can also provide probabilistic confidence scores for the prediction decisions. For readers' convenience, the mPLR-Loc server is available online (http://bioinfo.eie.polyu.edu.hk/mPLRLocServer). Copyright © 2014 Elsevier Inc. All rights reserved.
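
    How per-location probabilistic scores might be turned into a multi-label decision can be sketched as follows. The rule used here (keep every location whose score is at least a fraction theta of the top score) is an assumption made for illustration; the abstract does not spell out the adaptive decision scheme of mPLR-Loc, and the function name, the parameter theta and the scores are hypothetical.

```python
def adaptive_decision(scores, theta=0.5):
    """Select locations whose score is within a fraction theta of the best score.

    scores maps location -> probabilistic score in [0, 1]. The threshold adapts
    to the query protein because it is relative to that protein's top score, so
    at least one location is always returned.
    """
    s_max = max(scores.values())
    threshold = theta * s_max
    return sorted(loc for loc, s in scores.items() if s >= threshold)

# Hypothetical logistic-regression outputs for one query protein.
scores = {"nucleus": 0.9, "cytoplasm": 0.5, "chloroplast": 0.1}
print(adaptive_decision(scores))
```

    With theta close to 1 the rule behaves like a single-label classifier; lowering theta admits more locations per protein.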

  4. Protein subcellular localization assays using split fluorescent proteins

    Science.gov (United States)

    Waldo, Geoffrey S [Santa Fe, NM]; Cabantous, Stephanie [Los Alamos, NM]

    2009-09-08

    The invention provides protein subcellular localization assays using split fluorescent protein systems. The assays are conducted in living cells, do not require fixation and washing steps inherent in existing immunostaining and related techniques, and permit rapid, non-invasive, direct visualization of protein localization in living cells. The split fluorescent protein systems used in the practice of the invention generally comprise two or more self-complementing fragments of a fluorescent protein, such as GFP, wherein one or more of the fragments correspond to one or more beta-strand microdomains and are used to "tag" proteins of interest, and a complementary "assay" fragment of the fluorescent protein. Either or both of the fragments may be functionalized with a subcellular targeting sequence enabling it to be expressed in or directed to a particular subcellular compartment (i.e., the nucleus).

  5. Subcellular localization for Gram positive and Gram negative bacterial proteins using linear interpolation smoothing model.

    Science.gov (United States)

    Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

    2015-12-07

    Protein subcellular localization is an important topic in proteomics since it is related to a protein's overall function, helps in the understanding of metabolic pathways, and in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with the high dimensionality that hinders other traditional classifiers, such as Support Vector Machines or k-Nearest Neighbours, without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
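
    Linear interpolation smoothing can be illustrated with a bigram model over amino acids. The sketch below assumes one model per location, mixes the maximum-likelihood bigram and unigram estimates with a fixed weight lam, and assigns a sequence to the location with the highest log-likelihood. The weight, the small floor eps that avoids log(0), and the toy training sequences are assumptions; the dependency models used in the paper may differ.

```python
import math
from collections import Counter

def train(sequences):
    """Collect unigram, bigram and context counts from training sequences."""
    uni, bi, ctx = Counter(), Counter(), Counter()
    for seq in sequences:
        uni.update(seq)
        for a, b in zip(seq, seq[1:]):
            bi[(a, b)] += 1
            ctx[a] += 1
    return uni, bi, ctx, sum(uni.values())

def log_likelihood(seq, model, lam=0.7, eps=1e-6):
    """Sum log P(b | a), where P = lam * P_bigram + (1 - lam) * P_unigram."""
    uni, bi, ctx, total = model
    ll = 0.0
    for a, b in zip(seq, seq[1:]):
        p_uni = uni[b] / total
        p_bi = bi[(a, b)] / ctx[a] if ctx[a] else 0.0
        ll += math.log(lam * p_bi + (1 - lam) * p_uni + eps)
    return ll

def classify(seq, models):
    """Pick the location whose model gives the sequence the highest likelihood."""
    return max(models, key=lambda loc: log_likelihood(seq, models[loc]))

# Toy training data: two locations with very different residue usage.
models = {
    "cytoplasm": train(["AAAA", "AAAG"]),
    "membrane": train(["LLLL", "LLLV"]),
}
print(classify("AAA", models), classify("LLV", models))
```

    The unigram term keeps the probability non-zero when a bigram was never seen in training, which is the purpose of the interpolation.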

  6. Gene ontology based transfer learning for protein subcellular localization

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-02-01

    Full Text Available Abstract Background: Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, both of which cannot estimate the individual discriminative abilities of the three aspects of gene ontology. Results: In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for

  7. iLoc-Animal: a multi-label learning classifier for predicting subcellular localization of animal proteins.

    Science.gov (United States)

    Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

    2013-04-05

    Predicting protein subcellular localization is a challenging problem, particularly when query proteins have multi-label features meaning that they may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing methods can only be used to deal with the single-label proteins. Actually, multi-label proteins should not be ignored because they usually bear some special function worthy of in-depth studies. By introducing the "multi-label learning" approach, a new predictor, called iLoc-Animal, has been developed that can be used to deal with the systems containing both single- and multi-label animal (metazoan except human) proteins. Meanwhile, to measure the prediction quality of a multi-label system in a rigorous way, five indices were introduced; they are "Absolute-True", "Absolute-False" (or "Hamming-Loss"), "Accuracy", "Precision", and "Recall". As a demonstration, the jackknife cross-validation was performed with iLoc-Animal on a benchmark dataset of animal proteins classified into the following 20 location sites: (1) acrosome, (2) cell membrane, (3) centriole, (4) centrosome, (5) cell cortex, (6) cytoplasm, (7) cytoskeleton, (8) endoplasmic reticulum, (9) endosome, (10) extracellular, (11) Golgi apparatus, (12) lysosome, (13) mitochondrion, (14) melanosome, (15) microsome, (16) nucleus, (17) peroxisome, (18) plasma membrane, (19) spindle, and (20) synapse, where many proteins belong to two or more locations. For such a complicated system, the outcomes achieved by iLoc-Animal for all the aforementioned five indices were quite encouraging, indicating that the predictor may become a useful tool in this area. It has not escaped our notice that the multi-label approach and the rigorous measurement metrics can also be used to investigate many other multi-label problems in molecular biology. As a user-friendly web-server, iLoc-Animal is freely accessible to the public at the web-site .
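
    The five indices named above can be computed from sets of true and predicted locations. The sketch below uses the set-based definitions commonly given for these multi-label measures (per-protein ratios of intersection and union sizes, averaged over proteins); the abstract itself does not state the formulas, so the exact definitions and the handling of empty sets are assumptions.

```python
def multilabel_metrics(true_sets, pred_sets, num_labels):
    """Average five multi-label indices over all proteins.

    true_sets, pred_sets: lists of sets of location labels, one per protein.
    num_labels: total number of possible locations (M).
    """
    n = len(true_sets)
    totals = {"absolute_true": 0.0, "absolute_false": 0.0,
              "accuracy": 0.0, "precision": 0.0, "recall": 0.0}
    for true, pred in zip(true_sets, pred_sets):
        inter = len(true & pred)
        union = len(true | pred)
        totals["absolute_true"] += 1.0 if true == pred else 0.0
        # Hamming loss: labels on which truth and prediction disagree, over M.
        totals["absolute_false"] += (union - inter) / num_labels
        totals["accuracy"] += inter / union if union else 1.0
        totals["precision"] += inter / len(pred) if pred else 0.0
        totals["recall"] += inter / len(true) if true else 0.0
    return {name: value / n for name, value in totals.items()}

# Two toy proteins; the second is a multi-label protein predicted only in part.
true_sets = [{"nucleus"}, {"cytoplasm", "nucleus"}]
pred_sets = [{"nucleus"}, {"cytoplasm"}]
metrics = multilabel_metrics(true_sets, pred_sets, num_labels=4)
print(metrics)
```

    Absolute-True is the strictest of the five: a protein counts only if its predicted set matches the true set exactly, which is why a partially correct multi-label prediction scores zero on it.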

  8. Finding the Subcellular Location of Barley, Wheat, Rice and Maize Proteins: The Compendium of Crop Proteins with Annotated Locations (cropPAL).

    Science.gov (United States)

    Hooper, Cornelia M; Castleden, Ian R; Aryamanesh, Nader; Jacoby, Richard P; Millar, A Harvey

    2016-01-01

    Barley, wheat, rice and maize provide the bulk of human nutrition and have extensive industrial use as agricultural products. The genomes of these crops each contains >40,000 genes encoding proteins; however, the major genome databases for these species lack annotation information of protein subcellular location for >80% of these gene products. We address this gap, by constructing the compendium of crop protein subcellular locations called crop Proteins with Annotated Locations (cropPAL). Subcellular location is most commonly determined by fluorescent protein tagging of live cells or mass spectrometry detection in subcellular purifications, but can also be predicted from amino acid sequence or protein expression patterns. The cropPAL database collates 556 published studies, from >300 research institutes in >30 countries that have been previously published, as well as compiling eight pre-computed subcellular predictions for all Hordeum vulgare, Triticum aestivum, Oryza sativa and Zea mays protein sequences. The data collection including metadata for proteins and published studies can be accessed through a search portal http://crop-PAL.org. The subcellular localization information housed in cropPAL helps to depict plant cells as compartmentalized protein networks that can be investigated for improving crop yield and quality, and developing new biotechnological solutions to agricultural challenges. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  9. pLoc-mPlant: predict subcellular localization of multi-location plant proteins by incorporating the optimal GO information into general PseAAC.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2017-08-22

    One of the fundamental goals in cellular biochemistry is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. To realize this, it is indispensable to develop an automated method for fast and accurate identification of the subcellular locations of uncharacterized proteins. The current study is focused on plant protein subcellular location prediction based on the sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most of the existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mPlant" by extracting the optimal GO (Gene Ontology) information into the Chou's general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mPlant predictor is remarkably superior to iLoc-Plant, the state-of-the-art method for predicting plant protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at , by which users can easily get their desired results without the need to go through the complicated mathematics involved.

  10. Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble.

    Science.gov (United States)

    Wang, Xiao; Zhang, Jun; Li, Guo-Zheng

    2015-01-01

    It has become an important and challenging task to predict bacterial protein subcellular locations using computational methods. Although many prediction methods exist for bacterial proteins, the majority of them can only deal with single-location proteins. Unfortunately, many proteins in bacterial cells are located at multiple sites. Moreover, multi-location proteins have special biological functions that can help the development of new drugs. It is therefore necessary to develop new computational methods for accurately predicting subcellular locations of multi-location bacterial proteins. In this article, two efficient multi-label predictors, Gpos-ECC-mPLoc and Gneg-ECC-mPLoc, are developed to predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins, respectively. The two multi-label predictors construct the GO vectors by using the GO terms of homologous proteins of query proteins and then adopt a powerful multi-label ensemble classifier to make the final multi-label prediction. The two multi-label predictors have the following advantages: (1) they improve the prediction performance of multi-label proteins by taking the correlations among different labels into account; (2) they ensemble multiple CC classifiers and further generate better prediction results by ensemble learning; and (3) they construct the GO vectors by using the frequency of occurrences of GO terms in the typical homologous set instead of using 0/1 values. Experimental results show that Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins, respectively, and improve the prediction accuracy for multi-location proteins.
The online web servers for Gpos-ECC-mPLoc and Gneg-ECC-mPLoc predictors are freely accessible

  11. pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2018-05-01

    For an in-depth understanding of the functions of proteins in a cell, knowledge of their subcellular localization is indispensable. The current study is focused on human protein subcellular location prediction based on the sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions that are particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called 'pLoc-mHum' by extracting the crucial GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validations on the same stringent benchmark dataset have indicated that the proposed pLoc-mHum predictor is remarkably superior to iLoc-Hum, the state-of-the-art method in predicting human protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mHum/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. Contact: xcheng@gordonlifescience.org. Supplementary data are available at Bioinformatics online.

  12. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes.

    Science.gov (United States)

    Yu, Nancy Y; Wagner, James R; Laird, Matthew R; Melli, Gabor; Rey, Sébastien; Lo, Raymond; Dao, Phuong; Sahinalp, S Cenk; Ester, Martin; Foster, Leonard J; Brinkman, Fiona S L

    2010-07-01

    PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. However, the recall needs to be improved and no accurate SCL predictors yet make predictions for archaea, nor differentiate important localization subcategories, such as proteins targeted to a host cell or bacterial hyperstructures/organelles. Such improvements should preferably be encompassed in a freely available web-based predictor that can also be used as a standalone program. We developed PSORTb version 3.0 with improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories. It is the first SCL predictor specifically geared for all prokaryotes, including archaea and bacteria with atypical membrane/cell wall topologies. It features an improved standalone program, with a new batch results delivery system complementing its web interface. We evaluated the most accurate SCL predictors using 5-fold cross validation plus we performed an independent proteomics analysis, showing that PSORTb 3.0 is the most accurate but can benefit from being complemented by Proteome Analyst predictions. http://www.psort.org/psortb (download open source software or use the web interface). psort-mail@sfu.ca Supplementary data are available at Bioinformatics online.

  13. Predict subcellular locations of singleplex and multiplex proteins by semi-supervised learning and dimension-reducing general mode of Chou's PseAAC.

    Science.gov (United States)

    Pacharawongsakda, Eakasit; Theeramunkong, Thanaruk

    2013-12-01

    Predicting protein subcellular location is one of major challenges in Bioinformatics area since such knowledge helps us understand protein functions and enables us to select the targeted proteins during drug discovery process. While many computational techniques have been proposed to improve predictive performance for protein subcellular location, they have several shortcomings. In this work, we propose a method to solve three main issues in such techniques; i) manipulation of multiplex proteins which may exist or move between multiple cellular compartments, ii) handling of high dimensionality in input and output spaces and iii) requirement of sufficient labeled data for model training. Towards these issues, this work presents a new computational method for predicting proteins which have either single or multiple locations. The proposed technique, namely iFLAST-CORE, incorporates the dimensionality reduction in the feature and label spaces with co-training paradigm for semi-supervised multi-label classification. For this purpose, the Singular Value Decomposition (SVD) is applied to transform the high-dimensional feature space and label space into the lower-dimensional spaces. After that, due to limitation of labeled data, the co-training regression makes use of unlabeled data by predicting the target values in the lower-dimensional spaces of unlabeled data. In the last step, the component of SVD is used to project labels in the lower-dimensional space back to those in the original space and an adaptive threshold is used to map a numeric value to a binary value for label determination. A set of experiments on viral proteins and gram-negative bacterial proteins evidence that our proposed method improve the classification performance in terms of various evaluation metrics such as Aiming (or Precision), Coverage (or Recall) and macro F-measure, compared to the traditional method that uses only labeled data.

  14. Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features.

    Science.gov (United States)

    Zhou, Hang; Yang, Yang; Shen, Hong-Bin

    2017-03-15

    Protein subcellular localization prediction has been an important research topic in computational biology over the last decade. Various automatic methods have been proposed to predict locations for large scale protein datasets, where statistical machine learning algorithms are widely used for model construction. A key step in these predictors is encoding the amino acid sequences into feature vectors. Many studies have shown that features extracted from biological domains, such as gene ontology and functional domains, can be very useful for improving the prediction accuracy. However, domain knowledge usually results in redundant features and high-dimensional feature spaces, which may degrade the performance of machine learning models. In this paper, we propose a new amino acid sequence-based human protein subcellular location prediction approach, Hum-mPLoc 3.0, which covers 12 human subcellular localizations. The sequences are represented by multi-view complementary features, i.e. context vocabulary annotation-based gene ontology (GO) terms, peptide-based functional domains, and residue-based statistical features. To systematically reflect the structural hierarchy of the domain knowledge bases, we propose a novel feature representation protocol denoted as HCM (Hidden Correlation Modeling), which will create more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms. Experimental results on four benchmark datasets show that HCM improves prediction accuracy by 5-11% and F1 by 8-19% compared with conventional GO-based methods. A large-scale application of Hum-mPLoc 3.0 on the whole human proteome reveals protein co-localization preferences in the cell. Availability: www.csbio.sjtu.edu.cn/bioinf/Hum-mPLoc3/. Contact: hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  15. The SubCons webserver: A user friendly web interface for state-of-the-art subcellular localization prediction.

    Science.gov (United States)

    Salvatore, M; Shu, N; Elofsson, A

    2018-01-01

    SubCons is a recently developed method that predicts the subcellular localization of a protein. It combines predictions from four predictors using a Random Forest classifier. Here, we present the user-friendly web-interface implementation of SubCons. Starting from a protein sequence, the server rapidly predicts the subcellular localizations of an individual protein. In addition, the server accepts the submission of sets of proteins either by uploading the files or programmatically by using command line WSDL API scripts. This makes SubCons ideal for proteome wide analyses allowing the user to scan a whole proteome in few days. From the web page, it is also possible to download precalculated predictions for several eukaryotic organisms. To evaluate the performance of SubCons we present a benchmark of LocTree3 and SubCons using two recent mass-spectrometry based datasets of mouse and drosophila proteins. The server is available at http://subcons.bioinfo.se/. © 2017 The Protein Society.

  16. pLoc-mVirus: Predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2017-09-10

    Knowledge of the subcellular locations of proteins is crucially important for an in-depth understanding of their functions in a cell. With the explosive growth of protein sequences generated in the postgenomic age, there is a strong demand for computational tools that annotate their subcellular locations in a timely manner based on the sequence information alone. The current study is focused on virus proteins. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mVirus" by extracting the optimal GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mVirus predictor is remarkably superior to iLoc-Virus, the state-of-the-art method in predicting virus protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mVirus/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Identifying essential proteins based on sub-network partition and prioritization by integrating subcellular localization information.

    Science.gov (United States)

    Li, Min; Li, Wenkai; Wu, Fang-Xiang; Pan, Yi; Wang, Jianxin

    2018-06-14

    Essential proteins are important participants in various life activities and play a vital role in the survival and reproduction of living organisms. Identification of essential proteins from protein-protein interaction (PPI) networks has great significance to facilitate the study of human complex diseases, the design of drugs and the development of bioinformatics and computational science. Studies have shown that highly connected proteins in a PPI network tend to be essential. A series of computational methods have been proposed to identify essential proteins by analyzing topological structures of PPI networks. However, the high noise in the PPI data can degrade the accuracy of essential protein prediction. Moreover, proteins must be located in the appropriate subcellular localization to perform their functions, and only when the proteins are located in the same subcellular localization, it is possible that they can interact with each other. In this paper, we propose a new network-based essential protein discovery method based on sub-network partition and prioritization by integrating subcellular localization information, named SPP. The proposed method SPP was tested on two different yeast PPI networks obtained from DIP database and BioGRID database. The experimental results show that SPP can effectively reduce the effect of false positives in PPI networks and predict essential proteins more accurately compared with other existing computational methods DC, BC, CC, SC, EC, IC, NC. Copyright © 2018 Elsevier Ltd. All rights reserved.
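
    The idea of using localization to discount noisy interactions can be sketched as a co-localized degree score. This is not the SPP algorithm itself, whose partition and prioritization steps are not detailed in the abstract. The sketch simply assumes that an interaction counts only when both proteins share at least one compartment, and then ranks proteins by the number of such interactions. The function names, protein ids and annotations are hypothetical.

```python
def colocalized_degree(edges, localization):
    """Count, for each protein, the interactions with a co-localized partner.

    edges: iterable of (protein, protein) pairs from a PPI network.
    localization: dict protein -> set of subcellular compartments.
    """
    score = {protein: 0 for edge in edges for protein in edge}
    for u, v in edges:
        shared = localization.get(u, set()) & localization.get(v, set())
        if shared:  # keep the edge only if the two proteins can meet
            score[u] += 1
            score[v] += 1
    return score

def rank_proteins(edges, localization):
    """Rank proteins by co-localized degree, ties broken alphabetically."""
    score = colocalized_degree(edges, localization)
    return sorted(score, key=lambda protein: (-score[protein], protein))

# Toy network: D has plain degree 2, but neither partner shares its compartment.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D")]
localization = {
    "A": {"nucleus"},
    "B": {"nucleus"},
    "C": {"nucleus"},
    "D": {"mitochondrion"},
}
print(colocalized_degree(edges, localization))
print(rank_proteins(edges, localization))
```

    Compared with plain degree centrality (DC), protein D drops to the bottom of the ranking because both of its interactions are treated as likely false positives.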

  18. pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites.

    Science.gov (United States)

    Cheng, Xiang; Zhao, Shu-Guang; Lin, Wei-Zhong; Xiao, Xuan; Chou, Kuo-Chen

    2017-11-15

    Cells are deemed the basic unit of life. However, many important functions of cells, as well as their growth and reproduction, are performed via protein molecules located at their different organelles or locations. Facing the explosive growth of protein sequences, we are challenged to develop fast and effective methods to annotate their subcellular localization. However, this is by no means an easy task. In particular, mounting evidence indicates that proteins have a multi-label feature, meaning that they may simultaneously exist at, or move between, two or more different subcellular location sites. Unfortunately, most of the existing computational methods can only be used to deal with single-label proteins. Although the recently developed 'iLoc-Animal' predictor is quite powerful and can also deal with animal proteins with multiple locations, its prediction quality needs to be improved, particularly in enhancing the absolute true rate and reducing the absolute false rate. Here we propose a new predictor called 'pLoc-mAnimal', which is superior to iLoc-Animal. When tested by the most rigorous cross-validation on the same high-quality benchmark dataset, the absolute true success rate achieved by the new predictor is 37% higher and the absolute false rate is four times lower in comparison with the state-of-the-art predictor. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mAnimal/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. Contact: xxiao@gordonlifescience.org or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  19. Mapping the Subcellular Proteome of Shewanella oneidensis MR-1 using Sarkosyl-based fractionation and LC-MS/MS protein identification

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Roslyn N.; Romine, Margaret F.; Schepmoes, Athena A.; Smith, Richard D.; Lipton, Mary S.

    2010-07-19

    A simple and effective subcellular proteomic method for fractionation and analysis of gram-negative bacterial cytoplasm, periplasm, inner, and outer membranes was applied to Shewanella oneidensis to gain insight into its subcellular architecture. A combination of differential centrifugation, Sarkosyl solubilization, and osmotic lysis was used to prepare subcellular fractions. Global differences in protein fractions were observed by SDS-PAGE and heme staining, and tryptic peptides were analyzed using high-resolution LC-MS/MS. Compared to crude cell lysates, the fractionation method achieved a significant enrichment (average ~2-fold) in proteins predicted to be localized to each subcellular fraction. Compared to other detergent, organic solvent, and density-based methods previously reported, Sarkosyl most effectively facilitated separation of the inner and outer membranes and was amenable to mass spectrometry, making this procedure ideal for probing the subcellular proteome of gram-negative bacteria via LC-MS/MS. With 40% of the observable proteome represented, this study has provided extensive information on both subcellular architecture and relative abundance of proteins in S. oneidensis and provides a foundation for future work on subcellular organization and protein-membrane interactions in other gram-negative bacteria.

  20. Diversity and subcellular distribution of archaeal secreted proteins

    Directory of Open Access Journals (Sweden)

    Mechthild Pohlschroder

    2012-07-01

    Secreted proteins make up a significant percentage of a prokaryotic proteome and play critical roles in important cellular processes such as polymer degradation, nutrient uptake, signal transduction, cell wall biosynthesis and motility. The majority of archaeal proteins are believed to be secreted either in an unfolded conformation via the universally conserved Sec pathway or in a folded conformation via the Twin arginine transport (Tat) pathway. Extensive in vivo and in silico analyses of N-terminal signal peptides that target proteins to these pathways have led to the development of computational tools that not only predict Sec and Tat substrates with high accuracy but also provide information about signal peptide processing and targeting. Predictions therefore include indications as to whether a substrate is a soluble secreted protein, a membrane or cell-wall anchored protein, or a surface structure subunit, and whether it is targeted for post-translational modification such as glycosylation or the addition of a lipid. The use of these in silico tools, in combination with biochemical and genetic analyses of transport pathways and their substrates, has resulted in improved predictions of the subcellular localization of archaeal secreted proteins, allowing for a more accurate annotation of archaeal proteomes, and has led to the identification of potential adaptations to extreme environments, as well as archaeal kingdom-specific pathways. A more comprehensive understanding of the transport pathways and post-translational modifications of secreted archaeal proteins will also generate invaluable insights that will facilitate the identification of commercially valuable archaeal enzymes and the development of heterologous systems in which to efficiently express them.

  1. Diversity and subcellular distribution of archaeal secreted proteins.

    Science.gov (United States)

    Szabo, Zalan; Pohlschroder, Mechthild

    2012-01-01

    Secreted proteins make up a significant percentage of a prokaryotic proteome and play critical roles in important cellular processes such as polymer degradation, nutrient uptake, signal transduction, cell wall biosynthesis, and motility. The majority of archaeal proteins are believed to be secreted either in an unfolded conformation via the universally conserved Sec pathway or in a folded conformation via the Twin arginine transport (Tat) pathway. Extensive in vivo and in silico analyses of N-terminal signal peptides that target proteins to these pathways have led to the development of computational tools that not only predict Sec and Tat substrates with high accuracy but also provide information about signal peptide processing and targeting. Predictions therefore include indications as to whether a substrate is a soluble secreted protein, a membrane or cell wall anchored protein, or a surface structure subunit, and whether it is targeted for post-translational modification such as glycosylation or the addition of a lipid. The use of these in silico tools, in combination with biochemical and genetic analyses of transport pathways and their substrates, has resulted in improved predictions of the subcellular localization of archaeal secreted proteins, allowing for a more accurate annotation of archaeal proteomes, and has led to the identification of potential adaptations to extreme environments, as well as phyla-specific pathways among the archaea. A more comprehensive understanding of the transport pathways used and post-translational modifications of secreted archaeal proteins will also facilitate the identification and heterologous expression of commercially valuable archaeal enzymes.

  2. Subcellular sites for bacterial protein export

    NARCIS (Netherlands)

    Campo, Nathalie; Tjalsma, Harold; Buist, Girbe; Stepniak, Dariusz; Meijer, Michel; Veenhuis, Marten; Westermann, Martin; Müller, Jörg P.; Bron, Sierd; Kok, Jan; Kuipers, Oscar P.; Jongbloed, Jan D.H.

    2004-01-01

    Most bacterial proteins destined to leave the cytoplasm are exported to extracellular compartments or imported into the cytoplasmic membrane via the highly conserved SecA-YEG pathway. In the present studies, the subcellular distributions of core components of this pathway, SecA and SecY, and of the

  3. Subcellular sites for bacterial protein export.

    NARCIS (Netherlands)

    Campo, N.; Tjalsma, H.; Buist, G.; Stepniak, D.; Meijer, M.; Veenhuis, M.; Westermann, M.; Muller, J.P.; Bron, S.; Kok, J.; Kuipers, O.P.; Jongbloed, J.D.

    2004-01-01

    Most bacterial proteins destined to leave the cytoplasm are exported to extracellular compartments or imported into the cytoplasmic membrane via the highly conserved SecA-YEG pathway. In the present studies, the subcellular distributions of core components of this pathway, SecA and SecY, and of the

  4. Decoding the Divergent Subcellular Location of Two Highly Similar Paralogous LEA Proteins

    Directory of Open Access Journals (Sweden)

    Marie-Hélène Avelange-Macherel

    2018-05-01

    Many mitochondrial proteins are synthesized as precursors in the cytosol with an N-terminal mitochondrial targeting sequence (MTS), which is cleaved off upon import. Although much is known about import mechanisms and MTS structural features, the variability of MTS still hampers robust sub-cellular software predictions. Here, we took advantage of two paralogous late embryogenesis abundant proteins (LEA) from Arabidopsis with different subcellular locations to investigate structural determinants of mitochondrial import and gain insight into the evolution of the LEA genes. LEA38 and LEA2 are short proteins of the LEA_3 family, which are very similar along their whole sequence, but LEA38 is targeted to mitochondria while LEA2 is cytosolic. Differences in the N-terminal protein sequences were used to generate a series of mutated LEA2 which were expressed as GFP-fusion proteins in leaf protoplasts. By combining three types of mutation (substitution, charge inversion, and segment replacement), we were able to redirect the mutated LEA2 to mitochondria. Analysis of the effect of the mutations and determination of the LEA38 MTS cleavage site highlighted important structural features within and beyond the MTS. Overall, these results provide an explanation for the likely loss of mitochondrial location after duplication of the ancestral gene.

  5. Protein Sorting Prediction

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2017-01-01

    Many computational methods are available for predicting protein sorting in bacteria. When comparing them, it is important to know that they can be grouped into three fundamentally different approaches: signal-based, global-property-based and homology-based prediction. In this chapter, the strengths and drawbacks of each of these approaches are described through many examples of methods that predict secretion, integration into membranes, or subcellular locations in general. The aim of this chapter is to provide a user-level introduction to the field with a minimum of computational theory.

  6. MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized Proteins in Plants

    DEFF Research Database (Denmark)

    Zhang, Ning; Rao, R Shyama Prasad; Salvato, Fernanda

    2018-01-01

    -sequence or a multitude of internal signals. Compared with experimental approaches, computational predictions provide an efficient way to infer subcellular localization of a protein. However, it is still challenging to predict plant mitochondrially localized proteins accurately due to various limitations. Consequently, the performance of current tools can be improved with new data and new machine-learning methods. We present MU-LOC, a novel computational approach for large-scale prediction of plant mitochondrial proteins. We collected a comprehensive dataset of plant subcellular localization, extracted features including amino...

  7. CellMap visualizes protein-protein interactions and subcellular localization

    Science.gov (United States)

    Dallago, Christian; Goldberg, Tatyana; Andrade-Navarro, Miguel Angel; Alanis-Lobato, Gregorio; Rost, Burkhard

    2018-01-01

    Many tools visualize protein-protein interaction (PPI) networks. The tool introduced here, CellMap, adds one crucial novelty by visualizing PPI networks in the context of subcellular localization, i.e. the location in the cell or cellular component in which a PPI happens. Users can upload images of cells and define areas of interest against which PPIs for selected proteins are displayed (by default on a cartoon of a cell). Annotations of localization are provided by the user or through our in-house database. The visualizer and server are written in JavaScript, making CellMap easy to customize and to extend by researchers and developers. PMID:29497493

  8. Comparative study of human mitochondrial proteome reveals extensive protein subcellular relocalization after gene duplications

    Directory of Open Access Journals (Sweden)

    Huang Yong

    2009-11-01

    Background: Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families. Results: The 456 human mitochondrial proteins selected for this study were clustered into 305 gene families including 92 multigene families. Among the multigene families, 59 (64%) consisted of both mitochondrial and cytosolic (non-mitochondrial) proteins (mt-cy families), while the remaining 33 (36%) were composed of mitochondrial proteins (mt-mt families). Phylogenetic analyses of mt-cy families revealed three different scenarios of their neolocalization following gene duplication: (1) relocalization from mitochondria to cytosol, (2) from cytosol to mitochondria, and (3) multiple subcellular relocalizations. The neolocalizations were most commonly enabled by the gain or loss of N-terminal mitochondrial targeting signals. The majority of detected subcellular relocalization events occurred early in animal evolution, preceding the evolution of tetrapods. Mt-mt protein families showed a somewhat different pattern, where gene duplication occurred more evenly in time. However, for both types of protein families, most duplication events appear to roughly coincide with two rounds of genome duplications early in vertebrate evolution. Finally, we evaluated the effects of inaccurate and incomplete annotation of mitochondrial proteins and found that our conclusion of the importance of subcellular relocalization after gene duplication on

  9. MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized Proteins in Plants

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    2018-05-01

    Targeting and translocation of proteins to the appropriate subcellular compartments are crucial for cell organization and function. Newly synthesized proteins are transported to mitochondria with the assistance of complex targeting sequences containing either an N-terminal pre-sequence or a multitude of internal signals. Compared with experimental approaches, computational predictions provide an efficient way to infer subcellular localization of a protein. However, it is still challenging to predict plant mitochondrially localized proteins accurately due to various limitations. Consequently, the performance of current tools can be improved with new data and new machine-learning methods. We present MU-LOC, a novel computational approach for large-scale prediction of plant mitochondrial proteins. We collected a comprehensive dataset of plant subcellular localization, extracted features including amino acid composition, protein position weight matrix, and gene co-expression information, and trained predictors using deep neural networks and support vector machines. Benchmarked on two independent datasets, MU-LOC achieved substantial improvements over six state-of-the-art tools for plant mitochondrial targeting prediction. In addition, MU-LOC has the advantage of predicting plant mitochondrial proteins either possessing or lacking N-terminal pre-sequences. We applied MU-LOC to predict candidate mitochondrial proteins for the whole proteome of Arabidopsis and potato. MU-LOC is publicly available at http://mu-loc.org.
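
    Amino acid composition, the first feature type named in this abstract, is simple enough to illustrate. The sketch below is a generic version of that feature and makes no claim about how MU-LOC computes it; the function name, the handling of non-standard residues, and the example fragment are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X) are ignored in both numerator and denominator.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical N-terminal fragment used only for illustration.
features = amino_acid_composition("MLRAALSTARRGPRLSRLL")
print(len(features), round(sum(features), 6))
```

    The result is a fixed-length vector of 20 fractions regardless of protein length, which is what lets classifiers such as support vector machines consume sequences of different sizes.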

  10. Protein Subcellular Localization with Gaussian Kernel Discriminant Analysis and Its Kernel Parameter Selection.

    Science.gov (United States)

    Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu

    2017-12-15

    Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on the nonlinear kernel trick, which can be applied in a novel way to high-dimensional and complex biological data before classification tasks such as protein subcellular localization. Kernel parameters have a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, selecting the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection, based on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.
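
    To show why the scale parameter matters, the sketch below evaluates the standard Gaussian kernel, exp(-||x - y||^2 / (2 sigma^2)), for several values of sigma. It does not implement the paper's selection criterion based on reconstruction errors; the function names and the toy vectors are assumptions.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_matrix(samples, sigma):
    """Symmetric matrix of pairwise kernel values for a list of feature vectors."""
    return [[gaussian_kernel(x, y, sigma) for y in samples] for x in samples]


# Toy feature vectors (assumed data). A small sigma makes distinct samples look
# unrelated (off-diagonal values near 0); a large sigma makes them look alike
# (values near 1), which is why the scale parameter needs careful selection.
samples = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
for sigma in (0.1, 1.0, 10.0):
    print(sigma, [round(v, 4) for v in kernel_matrix(samples, sigma)[0]])
```

    At either extreme the kernel matrix carries little information for discrimination, so a useful scale lies somewhere in between.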

  11. Determining the sub-cellular localization of proteins within Caenorhabditis elegans body wall muscle.

    Science.gov (United States)

    Meissner, Barbara; Rogalski, Teresa; Viveiros, Ryan; Warner, Adam; Plastino, Lorena; Lorch, Adam; Granger, Laure; Segalat, Laurent; Moerman, Donald G

    2011-01-01

    Determining the sub-cellular localization of a protein within a cell is often an essential step towards understanding its function. In Caenorhabditis elegans, the relatively large size of the body wall muscle cells and the exquisite organization of their sarcomeres offer an opportunity to identify the precise position of proteins within cell substructures. Our goal in this study is to generate a comprehensive "localizome" for C. elegans body wall muscle by GFP-tagging proteins expressed in muscle and determining their location within the cell. For this project, we focused on proteins that we know are expressed in muscle and are orthologs or at least homologs of human proteins. To date we have analyzed the expression of about 227 GFP-tagged proteins that show localized expression in the body wall muscle of this nematode (e.g. dense bodies, M-lines, myofilaments, mitochondria, cell membrane, nucleus or nucleolus). For most proteins analyzed in this study no prior data on sub-cellular localization was available. In addition to discrete sub-cellular localization we observe overlapping patterns of localization including the presence of a protein in the dense body and the nucleus, or the dense body and the M-lines. In total we discern more than 14 sub-cellular localization patterns within nematode body wall muscle. The localization of this large set of proteins within a muscle cell will serve as an invaluable resource in our investigation of muscle sarcomere assembly and function.

  12. ClubSub-P: Cluster-Based Subcellular Localization Prediction for Gram-Negative Bacteria and Archaea

    Science.gov (United States)

    Paramasivam, Nagarajan; Linke, Dirk

    2011-01-01

    The subcellular localization (SCL) of proteins provides important clues to their function in a cell. In our efforts to predict useful vaccine targets against Gram-negative bacteria, we noticed that misannotated start codons frequently lead to wrongly assigned SCLs. This and other problems in SCL prediction, such as the relatively high false-positive and false-negative rates of some tools, can be avoided by applying multiple prediction tools to groups of homologous proteins. Here we present ClubSub-P, an online database that combines existing SCL prediction tools into a consensus pipeline from more than 600 proteomes of fully sequenced microorganisms. On top of the consensus prediction at the level of single sequences, the tool uses clusters of homologous proteins from Gram-negative bacteria and from Archaea to eliminate false-positive and false-negative predictions. ClubSub-P can assign the SCL of proteins from Gram-negative bacteria and Archaea with high precision. The database is searchable, and can easily be expanded using either new bacterial genomes or new prediction tools as they become available. This will further improve the performance of the SCL prediction, as well as the detection of misannotated start codons and other annotation errors. ClubSub-P is available online at http://toolkit.tuebingen.mpg.de/clubsubp/ PMID:22073040
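
    The two-level consensus described here (across tools for each sequence, then across a cluster of homologs) can be sketched as a pair of majority votes. This is only an illustration under assumed rules: the abstract does not state how ClubSub-P weighs tools or resolves ties, and the function names, the strict-majority threshold, and the example cluster are hypothetical.

```python
from collections import Counter


def majority_label(labels, min_fraction=0.5):
    """Return the label backed by more than min_fraction of the votes, else None."""
    votes = [label for label in labels if label is not None]
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_fraction else None


def cluster_consensus(tool_predictions_by_protein):
    """Two-level vote: per-protein consensus across tools, then across the cluster."""
    per_protein = {
        protein: majority_label(predictions)
        for protein, predictions in tool_predictions_by_protein.items()
    }
    return per_protein, majority_label(per_protein.values())


# Hypothetical cluster of three homologs, each scored by three tools.
cluster = {
    "protA": ["outer membrane", "outer membrane", "periplasm"],
    "protB": ["outer membrane", "outer membrane", "outer membrane"],
    "protC": ["cytoplasm", "cytoplasm", "outer membrane"],  # e.g. wrong start codon
}
per_protein, cluster_label = cluster_consensus(cluster)
print(per_protein["protC"], "->", cluster_label)
```

    In this toy cluster the single-sequence vote for protC disagrees with its homologs, and the cluster-level vote flags it as a candidate false prediction or annotation error.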

  13. Fast subcellular localization by cascaded fusion of signal-based and homology-based methods

    Directory of Open Access Journals (Sweden)

    Wang Wei

    2011-10-01

    Background: The functions of proteins are closely related to their subcellular locations. In the post-genomics era, the amount of gene and protein data grows exponentially, which necessitates the prediction of subcellular localization by computational means. Results: This paper proposes mitigating the computation burden of alignment-based approaches to subcellular localization prediction by a cascaded fusion of cleavage site prediction and profile alignment. Specifically, the informative segments of protein sequences are identified by a cleavage site predictor using the information in their N-terminal sorting signals. Then, the sequences are truncated at the cleavage site positions, and the shortened sequences are passed to PSI-BLAST for computing their profiles. Subcellular localizations are subsequently predicted by a profile-to-profile alignment support-vector-machine (SVM) classifier. To further reduce the training and recognition time of the classifier, the SVM classifier is replaced by a new kernel method based on the perturbational discriminant analysis (PDA). Conclusions: Experimental results on a new dataset based on Swiss-Prot Release 57.5 show that the method can make use of the best property of signal- and homology-based approaches and can attain an accuracy comparable to that achieved by using full-length sequences. Analysis of profile-alignment score matrices suggests that both profile creation time and profile alignment time can be reduced without significant reduction in subcellular localization accuracy. It was found that PDA enjoys a short training time as compared to the conventional SVM. We advocate that the method will be important for biologists to conduct large-scale protein annotation or for bioinformaticians to perform preliminary investigations on new algorithms that involve pairwise alignments.

  14. Organ accumulation and subcellular location of Cicer arietinum ST1 protein.

    Science.gov (United States)

    Albornos, Lucía; Cabrera, Javier; Hernández-Nistal, Josefina; Martín, Ignacio; Labrador, Emilia; Dopico, Berta

    2014-07-01

    The ST (ShooT Specific) proteins are a new family of proteins characterized by a signal peptide, tandem repeats of 25/26 amino acids, and a domain of unknown function (DUF2775), whose presence is limited to a few families of dicotyledonous plants, mainly Fabaceae and Asteraceae. Their function remains unknown, although involvement in plant growth, fruit morphogenesis or in biotic and abiotic interactions has been suggested. This work is focused on ST1, a Cicer arietinum ST protein. We established the protein accumulation in different tissues and organs of chickpea seedlings and plants and its subcellular localization, which could indicate the possible function of ST1. The raising of specific antibodies against ST1 protein revealed that its accumulation in epicotyls and radicles was related to their elongation rate. Its pattern of tissue location in cotyledons during seed formation and early seed germination, as well as its localization in the perivascular fibres of epicotyls and radicles, indicated a possible involvement in seed germination and seedling growth. ST1 protein appears both inside the cell and in the cell wall. This double subcellular localization was found in every organ in which the ST1 protein was detected: seeds, cotyledons and seedling epicotyls and radicles. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  15. LocateP: Genome-scale subcellular-location predictor for bacterial proteins

    Directory of Open Access Journals (Sweden)

    Zhou Miaomiao

    2008-03-01

    Background: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. Results: LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms

  16. Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

    Science.gov (United States)

    Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe

    2018-02-19

    Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is however an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improves on the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
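
    The kernels named in this abstract have standard textbook forms, sketched below. The sketch assumes that "plugging" means using the Min kernel (or its normalized form) as the base kernel inside a pairwise kernel; the abstract does not spell this out, and the function names and feature vectors are hypothetical.

```python
import math


def min_kernel(x, y):
    """Min kernel: sum of element-wise minima of two non-negative vectors."""
    return sum(min(a, b) for a, b in zip(x, y))


def normalized_min_kernel(x, y):
    """Min kernel scaled so that every non-zero vector has self-similarity 1."""
    return min_kernel(x, y) / math.sqrt(min_kernel(x, x) * min_kernel(y, y))


def tppk(a, b, c, d, base):
    """Tensor Product Pairwise Kernel between the pairs (a, b) and (c, d)."""
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


def mlpk(a, b, c, d, base):
    """Metric Learning Pairwise Kernel between the pairs (a, b) and (c, d)."""
    return (base(a, c) - base(a, d) - base(b, c) + base(b, d)) ** 2


# Hypothetical feature vectors for four proteins.
a, b, c, d = [1, 2, 3], [2, 1, 3], [0, 2, 1], [3, 0, 2]
print(min_kernel(a, b))
print(tppk(a, b, c, d, normalized_min_kernel))
print(mlpk(a, b, c, d, normalized_min_kernel))
```

    Both pairwise kernels are symmetric under swapping the two proteins within a pair, which suits heterodimer prediction because a candidate pair has no natural order.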

  17. LocTree3 prediction of localization

    DEFF Research Database (Denmark)

    Goldberg, T.; Hecht, M.; Hamp, T.

    2014-01-01

    The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria a...

  18. Rechecking the Centrality-Lethality Rule in the Scope of Protein Subcellular Localization Interaction Networks.

    Directory of Open Access Journals (Sweden)

    Xiaoqing Peng

    Essential proteins are indispensable for living organisms to maintain life activities and play important roles in the studies of pathology, synthetic biology, and drug design. Therefore, besides experimental methods, many computational methods have been proposed to identify essential proteins. Based on the centrality-lethality rule, various centrality methods are employed to predict essential proteins in a Protein-protein Interaction Network (PIN). However, because they neglect the temporal and spatial features of protein-protein interactions, the centrality scores calculated by centrality methods are not effective enough for measuring the essentiality of proteins in a PIN. Moreover, many methods, which overfit to the features of essential proteins for one species, may perform poorly for other species. In this paper, we demonstrate that the centrality-lethality rule also exists in Protein Subcellular Localization Interaction Networks (PSLINs). To do this, a method based on Localization Specificity for Essential protein Detection (LSED) was proposed, which can be combined with any centrality method for calculating the improved centrality scores by taking into consideration PSLINs in which proteins play their roles. In this study, LSED was combined with eight centrality methods separately to calculate Localization-specific Centrality Scores (LCSs) for proteins based on the PSLINs of four species (Saccharomyces cerevisiae, Homo sapiens, Mus musculus and Drosophila melanogaster). Compared to the proteins with high centrality scores measured from the global PINs, more proteins with high LCSs measured from PSLINs are essential. This indicates that proteins with high LCSs measured from PSLINs are more likely to be essential and that the performance of centrality methods can be improved by LSED. Furthermore, LSED provides a widely applicable prediction model to identify essential proteins for different species.
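
    The idea of scoring a protein within compartment-specific subnetworks rather than the global network can be illustrated with degree centrality. This is not the LSED formula, which the abstract does not give. The sketch assumes that a PSLIN keeps an interaction only when both partners are annotated to the same compartment, and the function name and toy network are hypothetical.

```python
from collections import defaultdict


def localization_specific_degree(edges, locations):
    """Degree of each protein inside each compartment-specific subnetwork.

    edges: iterable of (protein_u, protein_v) interactions.
    locations: dict mapping protein -> set of compartments.
    Returns {compartment: {protein: degree}}.
    """
    degree = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        # Keep an interaction only in compartments that both partners share.
        shared = locations.get(u, set()) & locations.get(v, set())
        for compartment in shared:
            degree[compartment][u] += 1
            degree[compartment][v] += 1
    return {comp: dict(scores) for comp, scores in degree.items()}


# Hypothetical toy network.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
locations = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytoplasm"},
    "P3": {"nucleus", "cytoplasm"},
    "P4": {"cytoplasm"},
}
print(localization_specific_degree(edges, locations))
```

    Here P3 has global degree 3 but degree 2 in each compartment, so a ranking based on the subnetworks can differ from one based on the global network. How the per-compartment scores are combined into a single score is left open in this sketch.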

  19. Subcellular distribution of calcium-binding proteins and a calcium-ATPase in canine pancreas

    International Nuclear Information System (INIS)

    Nigam, S.K.; Towers, T.

    1990-01-01

    Using a 45Ca blot-overlay assay, we monitored the subcellular fractionation pattern of several Ca binding proteins of apparent molecular masses 94, 61, and 59 kD. These proteins also appeared to stain blue with Stains-All. Additionally, using a monoclonal antiserum raised against canine cardiac sarcoplasmic reticulum Ca-ATPase, we examined the subcellular distribution of a canine pancreatic 110-kD protein recognized by this antiserum. This protein had the same electrophoretic mobility as the cardiac protein against which the antiserum was raised. The three Ca binding proteins and the Ca-ATPase cofractionated into the rough microsomal fraction (RM), previously shown to consist of highly purified RER, in a pattern highly similar to that of the RER marker, ribophorin I. To provide further evidence for an RER localization, native RM were subjected to isopycnic flotation in sucrose gradients. The Ca binding proteins and the Ca-ATPase were found in dense fractions, along with ribophorin I. When RM were stripped of ribosomes with puromycin/high salt, the Ca binding proteins and the Ca-ATPase exhibited a shift to less dense fractions, as did ribophorin I. We conclude that, in pancreas, the Ca binding proteins and Ca-ATPase we detect are localized to the RER (conceivably a subcompartment of the RER) or, possibly, a structure intimately associated with the RER.

  20. A Comprehensive Subcellular Proteomic Survey of Salmonella Grown under Phagosome-Mimicking versus Standard Laboratory Conditions

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Roslyn N.; Sanford, James A.; Park, Jea H.; Deatherage, Brooke L.; Champion, Boyd L.; Smith, Richard D.; Heffron, Fred; Adkins, Joshua N.

    2012-06-01

    Towards developing a systems-level pathobiological understanding of Salmonella enterica, we performed a subcellular proteomic analysis of this pathogen grown under standard laboratory and infection-mimicking conditions in vitro. Analysis of proteins from cytoplasmic, inner membrane, periplasmic, and outer membrane fractions yielded coverage of over 30% of the theoretical proteome. Confident subcellular location could be assigned to over 1000 proteins, with good agreement between experimentally observed location and predicted/known protein properties. Comparison of protein location under the different environmental conditions provided insight into dynamic protein localization and possible moonlighting (multiple function) activities. Notable examples of dynamic localization were the response regulators of two-component regulatory systems (e.g., ArcB, PhoQ). The DNA-binding protein Dps that is generally regarded as cytoplasmic was significantly enriched in the outer membrane for all growth conditions examined, suggestive of moonlighting activities. These observations imply the existence of unknown transport mechanisms and novel functions for a subset of Salmonella proteins. Overall, this work provides a catalog of experimentally verified subcellular protein location for Salmonella and a framework for further investigations using computational modeling.

  1. Plant subcellular proteomics: Application for exploring optimal cell function in soybean.

    Science.gov (United States)

    Wang, Xin; Komatsu, Setsuko

    2016-06-30

    Plants have evolved complicated responses to developmental changes and stressful environmental conditions. Subcellular proteomics has the potential to elucidate localized cellular responses and investigate communications among subcellular compartments during plant development and in response to biotic and abiotic stresses. Soybean, which is a valuable legume crop rich in protein and vegetable oil, can grow in several climatic zones; however, the growth and yield of soybean are markedly decreased under stresses. To date, numerous proteomic studies have been performed in soybean to examine the specific protein profiles of cell wall, plasma membrane, nucleus, mitochondrion, chloroplast, and endoplasmic reticulum. In this review, methods for the purification and purity assessment of subcellular organelles from soybean are summarized. In addition, the findings from subcellular proteomic analyses of soybean during development and under stresses, particularly flooding stress, are presented and the proteins regulated among subcellular compartments are discussed. Continued advances in subcellular proteomics are expected to greatly contribute to the understanding of the responses and interactions that occur within and among subcellular compartments during development and under stressful environmental conditions. Subcellular proteomics has the potential to investigate the cellular events and interactions among subcellular compartments in response to development and stresses in plants. Soybean can grow in several climatic zones; however, the growth and yield of soybean are markedly decreased under stresses. Numerous proteomic studies of cell wall, plasma membrane, nucleus, mitochondrion, chloroplast, and endoplasmic reticulum have been carried out to investigate the respective proteins and their functions in soybean during development or under stresses. In this review, methods of subcellular-organelle enrichment and purity assessment are summarized. In addition, previous findings of

  2. Distinct cellular and subcellular distributions of G protein-coupled receptor kinase and arrestin isoforms in the striatum.

    Directory of Open Access Journals (Sweden)

    Evgeny Bychkov

    Full Text Available G protein-coupled receptor kinases (GRKs) and arrestins mediate desensitization of G protein-coupled receptors (GPCRs). Arrestins also mediate G protein-independent signaling via GPCRs. Since GRKs and arrestins demonstrate no strict receptor specificity, their functions in the brain may depend on their cellular complement, expression level, and subcellular targeting. However, the cellular expression and subcellular distribution of GRKs and arrestins in the brain are largely unknown. We show that GRK isoforms GRK2 and GRK5 are similarly expressed in direct and indirect pathway neurons in the rat striatum. Arrestin-2 and arrestin-3 are also expressed in neurons of both pathways. Cholinergic interneurons are enriched in GRK2, arrestin-3, and GRK5. Parvalbumin-positive interneurons express more GRK2 and less arrestin-2 than medium spiny neurons. The GRK5 subcellular distribution in human striatal neurons is altered by its phosphorylation: unphosphorylated enzyme preferentially localizes to synaptic membranes, whereas phosphorylated GRK5 is found in plasma membrane and cytosolic fractions. Both GRK isoforms are abundant in the nucleus of human striatal neurons, whereas the proportion of both arrestins in the nucleus was equally low. However, overall higher expression of arrestin-2 yields high enough concentration in the nucleus to mediate nuclear functions. These data suggest cell type- and subcellular compartment-dependent differences in GRK/arrestin-mediated desensitization and signaling.

  3. Signal peptides and protein localization prediction

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2005-01-01

    In 1999, the Nobel Prize in Physiology or Medicine was awarded to Günter Blobel “for the discovery that proteins have intrinsic signals that govern their transport and localization in the cell”. Since the subcellular localization of a protein is an important clue to its function, the characteriz...

  4. Protein (multi-)location prediction: utilizing interdependencies via a generative model.

    Science.gov (United States)

    Simha, Ramanuja; Briesemeister, Sebastian; Kohlbacher, Oliver; Shatkay, Hagit

    2015-06-15

    Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein's function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins; however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins. We introduce a probabilistic generative model for protein localization, and develop a system based on it, which we call MDLoc, that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier, MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems. MDLoc is available at: http://www.eecis.udel.edu/~compbio/mdloc. © The Author 2015. Published by Oxford University Press.

  5. Protein (multi-)location prediction: utilizing interdependencies via a generative model

    Science.gov (United States)

    Shatkay, Hagit

    2015-01-01

    Motivation: Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein’s function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins; however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins. Results: We introduce a probabilistic generative model for protein localization, and develop a system based on it, which we call MDLoc, that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier, MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems. Availability and implementation: MDLoc is available at: http://www.eecis.udel.edu/~compbio/mdloc. Contact: shatkay@udel.edu. PMID:26072505

  6. Current Gaps in the Understanding of the Subcellular Distribution of Exogenous and Endogenous Protein TorsinA.

    Science.gov (United States)

    Harata, N Charles

    2014-01-01

    An in-frame deletion leading to the loss of a single glutamic acid residue in the protein torsinA (ΔE-torsinA) results in an inherited movement disorder, DYT1 dystonia. This autosomal dominant disease affects the function of the brain without causing neurodegeneration, by a mechanism that remains unknown. We evaluated the literature regarding the subcellular localization of torsinA. Efforts to elucidate the pathophysiological basis of DYT1 dystonia have relied partly on examining the subcellular distribution of the wild-type and mutated proteins. A typical approach is to introduce the human torsinA gene (TOR1A) into host cells and overexpress the protein therein. In both neurons and non-neuronal cells, exogenous wild-type torsinA introduced in this manner has been found to localize mainly to the endoplasmic reticulum, whereas exogenous ΔE-torsinA is predominantly in the nuclear envelope or cytoplasmic inclusions. Although these outcomes are relatively consistent, findings for the localization of endogenous torsinA have been variable, leaving its physiological distribution a matter of debate. As patients' cells do not overexpress torsinA proteins, it is important to understand why the reported distributions of the endogenous proteins are inconsistent. We propose that careful optimization of experimental methods will be critical in addressing the causes of the differences among the distributions of endogenous (non-overexpressed) vs. exogenously introduced (overexpressed) proteins.

  7. [L-arginine metabolism enzyme activities in rat liver subcellular fractions under condition of protein deprivation].

    Science.gov (United States)

    Kopyl'chuk, G P; Buchkovskaia, I M

    2014-01-01

    The features of the arginase and NO-synthase pathways of arginine metabolism have been studied in rat liver subcellular fractions under conditions of protein deprivation. During the experimental period (28 days) albino male rats were kept on the semi-synthetic casein diet AIN-93. The protein deprivation conditions were designed as total absence of protein in the diet and consumption of a diet containing half the casein amount of the regular diet. Daily diet consumption was regulated according to the pair-feeding approach. It has been shown that the changes in the activities of enzymes involved in L-arginine metabolism were characterized by a 1.4-1.7-fold decrease in arginase activity, accompanied by unchanged NO-synthase activity, in the cytosol. In the mitochondrial fraction the unchanged arginase activity was accompanied by a 3-5-fold increase in NO-synthase activity. At the terminal stages of the experiment, unidirectional dynamics in the studied activities were observed in the mitochondrial and cytosolic fractions in both experimental groups. In the studied subcellular fractions arginase activity decreased (2.4-2.7-fold with no protein in the diet and 1.5-fold with partly supplied protein) and was accompanied by an increase in NO-synthase activity of 3.8-fold in the cytosolic fraction and 7.2-fold in the mitochondrial fraction in the group with no protein in the diet, and of 2.2- and 3.5-fold, respectively, in the group partially supplied with protein. The observed tendency is presumably caused by the switch of L-arginine metabolism from the arginase pathway to the oxidative NO-synthase pathway.

  8. Mutations in the C-terminal region affect subcellular localization of crucian carp herpesvirus (CaHV) GPCR.

    Science.gov (United States)

    Wang, Jun; Gui, Lang; Chen, Zong-Yan; Zhang, Qi-Ya

    2016-08-01

    G protein-coupled receptors (GPCRs) are known as seven transmembrane domain receptors and consequently can mediate diverse biological functions via regulation of their subcellular localization. Crucian carp herpesvirus (CaHV) was recently isolated from infected fish with acute gill hemorrhage. A CaHV GPCR of 349 amino acids (aa) was identified based on amino acid identity. A series of variants with truncation/deletion/substitution mutations in the C-terminal region (aa 315-349) were constructed and expressed in fathead minnow (FHM) cells. The roles of three key C-terminal regions in subcellular localization of CaHV GPCR were determined. Lysine-315 (K-315) directed the aggregation of the protein preferentially at the nuclear side. The predicted N-myristoylation site (GGGWTR, aa 335-340) was responsible for punctate distribution in periplasm or throughout the cytoplasm. The predicted phosphorylation site (SSR, aa 327-329) and GGGWTR together determined the punctate distribution in cytoplasm. Detection of organelle localization by specific markers showed that the protein retaining K-315 colocalized with the Golgi apparatus. These experiments provided the first evidence that different mutations of the CaHV GPCR C-terminus have different effects on the subcellular localization of fish herpesvirus-encoded GPCRs. The study provided valuable information and new insights into the precise interactions between herpesvirus and fish cells, and could also provide useful targets for antiviral agents in aquaculture.

  9. Protein docking prediction using predicted protein-protein interface

    Directory of Open Access Journals (Sweden)

    Li Bin

    2012-01-01

    Full Text Available Background: Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results: We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion: We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  10. Protein docking prediction using predicted protein-protein interface.

    Science.gov (United States)

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
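
    The general idea of letting an imperfect interface prediction guide pose selection without acting as a hard filter can be sketched as follows. This is not the PI-LZerD scoring function. The additive bonus, the `weight` parameter, the function names and the toy poses are all assumptions made for illustration.

```python
def interface_overlap(pose_interface, predicted_interface):
    """Fraction of predicted interface residues that are in contact in a pose."""
    predicted = set(predicted_interface)
    if not predicted:
        return 0.0
    return len(predicted & set(pose_interface)) / len(predicted)


def rerank_poses(poses, predicted_interface, weight=1.0):
    """Re-rank docking poses with a soft bonus for matching the prediction.

    poses -- list of (pose_id, docking_score, interface_residues),
             where a higher docking_score is assumed to be better.
    A soft bonus (rather than discarding non-matching poses) keeps a
    partly wrong interface prediction from eliminating good poses.
    """
    scored = [
        (pose_id, score + weight * interface_overlap(residues, predicted_interface))
        for pose_id, score, residues in poses
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy example: p1 has the best raw score but touches none of the
# predicted interface residues.
toy_poses = [
    ("p1", 0.9, {1, 2, 3}),
    ("p2", 0.8, {10, 11, 12}),
    ("p3", 0.5, {10, 11, 40}),
]
toy_ranking = rerank_poses(toy_poses, predicted_interface={10, 11, 12, 13})
```

    With `weight=0.0` the ranking falls back to the raw docking scores, so the influence of the prediction can be tuned to how reliable it is expected to be.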

  11. Targeted nanodiamonds for identification of subcellular protein assemblies in mammalian cells

    Science.gov (United States)

    Lake, Michael P.; Bouchard, Louis-S.

    2017-01-01

    Transmission electron microscopy (TEM) can be used to successfully determine the structures of proteins. However, such studies are typically done ex situ after extraction of the protein from the cellular environment. Here we describe an application for nanodiamonds as targeted intensity contrast labels in biological TEM, using the nuclear pore complex (NPC) as a model macroassembly. We demonstrate that delivery of antibody-conjugated nanodiamonds to live mammalian cells using maltotriose-conjugated polypropylenimine dendrimers results in efficient localization of nanodiamonds to the intended cellular target. We further identify signatures of nanodiamonds under TEM that allow for unambiguous identification of individual nanodiamonds from a resin-embedded, OsO4-stained environment. This is the first demonstration of nanodiamonds as labels for nanoscale TEM-based identification of subcellular protein assemblies. These results, combined with the unique fluorescence properties and biocompatibility of nanodiamonds, represent an important step toward the use of nanodiamonds as markers for correlated optical/electron bioimaging. PMID:28636640

  12. Targeted nanodiamonds for identification of subcellular protein assemblies in mammalian cells.

    Directory of Open Access Journals (Sweden)

    Michael P Lake

    Full Text Available Transmission electron microscopy (TEM) can be used to successfully determine the structures of proteins. However, such studies are typically done ex situ after extraction of the protein from the cellular environment. Here we describe an application for nanodiamonds as targeted intensity contrast labels in biological TEM, using the nuclear pore complex (NPC) as a model macroassembly. We demonstrate that delivery of antibody-conjugated nanodiamonds to live mammalian cells using maltotriose-conjugated polypropylenimine dendrimers results in efficient localization of nanodiamonds to the intended cellular target. We further identify signatures of nanodiamonds under TEM that allow for unambiguous identification of individual nanodiamonds from a resin-embedded, OsO4-stained environment. This is the first demonstration of nanodiamonds as labels for nanoscale TEM-based identification of subcellular protein assemblies. These results, combined with the unique fluorescence properties and biocompatibility of nanodiamonds, represent an important step toward the use of nanodiamonds as markers for correlated optical/electron bioimaging.

  13. Targeted nanodiamonds for identification of subcellular protein assemblies in mammalian cells.

    Science.gov (United States)

    Lake, Michael P; Bouchard, Louis-S

    2017-01-01

    Transmission electron microscopy (TEM) can be used to successfully determine the structures of proteins. However, such studies are typically done ex situ after extraction of the protein from the cellular environment. Here we describe an application for nanodiamonds as targeted intensity contrast labels in biological TEM, using the nuclear pore complex (NPC) as a model macroassembly. We demonstrate that delivery of antibody-conjugated nanodiamonds to live mammalian cells using maltotriose-conjugated polypropylenimine dendrimers results in efficient localization of nanodiamonds to the intended cellular target. We further identify signatures of nanodiamonds under TEM that allow for unambiguous identification of individual nanodiamonds from a resin-embedded, OsO4-stained environment. This is the first demonstration of nanodiamonds as labels for nanoscale TEM-based identification of subcellular protein assemblies. These results, combined with the unique fluorescence properties and biocompatibility of nanodiamonds, represent an important step toward the use of nanodiamonds as markers for correlated optical/electron bioimaging.

  14. Determination of ABA-binding proteins contents in subcellular fractions isolated from cotton seedlings using radioimmunoanalysis

    International Nuclear Information System (INIS)

    Tursunkhodjayeva, F.M.

    2004-01-01

    Full text: Knowledge of plant hormone receptor sites is essential to understanding the principles of phytohormone action in cells and tissues. The hormone abscisic acid (ABA) takes part in many important physiological processes of plants, including water balance and resistance to salt stress. The detection of salt tolerance in the early stages of ontogenesis is desirable for effective cultivation of cotton. Usually such characteristics are determined visually after genetic analysis of hybrids over several generations. This classic method of genetics requires a long time to grow several generations of cotton plants. In this connection we studied ABA-binding protein contents in subcellular fractions isolated from seedlings of several cotton varieties with different tolerance to salt stress. The contents of ABA-binding protein in nuclei and chloroplast fractions isolated from cotton seedlings were determined using radioimmunoanalysis. The subcellular fractions were prepared by ultracentrifugation in a 0.25-2.2 M sucrose gradient. ABA-binding protein was isolated from cotton seedlings by affinity chromatography. The antibodies against the ABA-binding protein of cotton were developed in rabbits according to standard protocols. Then the antibodies were labelled with the radioisotope iodine-125 according to Greenwood et al. It was shown that the nuclei and chloroplast fractions isolated from cotton with high tolerance to salt stress contain up to 1.5-1.8 times more ABA-binding protein than the same fractions from cotton with low tolerance to salt stress. So, the ABA-binding protein contents in cotton seedlings may be considered as a marker for screening cotton varieties that may potentially have high tolerance to salt stress.

  15. Current Gaps in the Understanding of the Subcellular Distribution of Exogenous and Endogenous Protein TorsinA

    Directory of Open Access Journals (Sweden)

    N. Charles Harata

    2014-09-01

    Full Text Available Background: An in-frame deletion leading to the loss of a single glutamic acid residue in the protein torsinA (ΔE-torsinA) results in an inherited movement disorder, DYT1 dystonia. This autosomal dominant disease affects the function of the brain without causing neurodegeneration, by a mechanism that remains unknown. Methods: We evaluated the literature regarding the subcellular localization of torsinA. Results: Efforts to elucidate the pathophysiological basis of DYT1 dystonia have relied partly on examining the subcellular distribution of the wild-type and mutated proteins. A typical approach is to introduce the human torsinA gene (TOR1A) into host cells and overexpress the protein therein. In both neurons and non-neuronal cells, exogenous wild-type torsinA introduced in this manner has been found to localize mainly to the endoplasmic reticulum, whereas exogenous ΔE-torsinA is predominantly in the nuclear envelope or cytoplasmic inclusions. Although these outcomes are relatively consistent, findings for the localization of endogenous torsinA have been variable, leaving its physiological distribution a matter of debate. Discussion: As patients’ cells do not overexpress torsinA proteins, it is important to understand why the reported distributions of the endogenous proteins are inconsistent. We propose that careful optimization of experimental methods will be critical in addressing the causes of the differences among the distributions of endogenous (non-overexpressed) vs. exogenously introduced (overexpressed) proteins.

  16. AAV exploits subcellular stress associated with inflammation, endoplasmic reticulum expansion, and misfolded proteins in models of cystic fibrosis.

    Directory of Open Access Journals (Sweden)

    Jarrod S Johnson

    2011-05-01

    Full Text Available Barriers to infection act at multiple levels to prevent viruses, bacteria, and parasites from commandeering host cells for their own purposes. An intriguing hypothesis is that if a cell experiences stress, such as that elicited by inflammation, endoplasmic reticulum (ER) expansion, or misfolded proteins, then subcellular barriers will be less effective at preventing viral infection. Here we have used models of cystic fibrosis (CF) to test whether subcellular stress increases susceptibility to adeno-associated virus (AAV) infection. In human airway epithelium cultured at an air/liquid interface, physiological conditions of subcellular stress and ER expansion were mimicked using supernatant from mucopurulent material derived from CF lungs. Using this inflammatory stimulus to recapitulate stress found in diseased airways, we demonstrated that AAV infection was significantly enhanced. Since over 90% of CF cases are associated with a misfolded variant of Cystic Fibrosis Transmembrane Conductance Regulator (ΔF508-CFTR), we then explored whether the presence of misfolded proteins could independently increase susceptibility to AAV infection. In these models, AAV was an order of magnitude more efficient at transducing cells expressing ΔF508-CFTR than in cells expressing wild-type CFTR. Rescue of misfolded ΔF508-CFTR under low temperature conditions restored viral transduction efficiency to that demonstrated in controls, suggesting effects related to protein misfolding were responsible for increasing susceptibility to infection. By testing other CFTR mutants, G551D, D572N, and 1410X, we have shown this phenomenon is common to other misfolded proteins and not related to loss of CFTR activity. The presence of misfolded proteins did not affect cell surface attachment of virus or influence expression levels from promoter transgene cassettes in plasmid transfection studies, indicating exploitation occurs at the level of virion trafficking or processing. Thus

  17. Subcellular Iron Localization Mechanisms in Plants

    Directory of Open Access Journals (Sweden)

    Emre Aksoy

    2017-12-01

    Full Text Available The basic micronutrient element iron (Fe) is present as a cofactor in the active sites of many metalloproteins with important roles in the plant. On the other hand, since it is highly reactive, excess accumulation in the cell triggers the production of reactive oxygen species, leading to cell death. Therefore, iron homeostasis in the cell is very important for plant growth. Once taken up into the roots, iron is distributed to the subcellular compartments. Subcellular iron transport and hence cellular iron homeostasis are carried out through synchronous control of different membrane protein families. It has been discovered that expression levels of these membrane proteins increase under iron deficiency. Examination of the tasks and regulation of these carriers is very important for understanding the iron uptake and distribution mechanisms in plants. Therefore, in this review, the transporters responsible for the uptake of iron into the cell and its subcellular distribution between organelles will be discussed with an emphasis on the current developments about these transporters.

  18. Quantitative Analysis of Subcellular Distribution of the SUMO Conjugation System by Confocal Microscopy Imaging.

    Science.gov (United States)

    Mas, Abraham; Amenós, Montse; Lois, L Maria

    2016-01-01

    Different studies point to an enrichment in SUMO conjugation in the cell nucleus, although non-nuclear SUMO targets also exist. In general, the study of subcellular localization of proteins is essential for understanding their function within a cell. Fluorescence microscopy is a powerful tool for studying subcellular protein partitioning in living cells, since fluorescent proteins can be fused to proteins of interest to determine their localization. Subcellular distribution of proteins can be influenced by binding to other biomolecules and by posttranslational modifications. Sometimes these changes affect only a portion of the protein pool or have a partial effect, and a quantitative evaluation of fluorescence images is required to identify protein redistribution among subcellular compartments. In order to obtain accurate data about the relative subcellular distribution of SUMO conjugation machinery members, and to identify the molecular determinants involved in their localization, we have applied quantitative confocal microscopy imaging. In this chapter, we will describe the fluorescent protein fusions used in these experiments, and how to measure, evaluate, and compare average fluorescence intensities in cellular compartments by image-based analysis. We show the distribution of some components of the Arabidopsis SUMOylation machinery in onion epidermal cells and how their distribution changes in the presence of interacting partners or even when their activity is affected.
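
    Comparing average fluorescence intensities between compartments can be illustrated with a minimal sketch. It is not the chapter's protocol. It assumes that the image is a 2D grid of pixel intensities, that the nucleus and cytoplasm have already been segmented into boolean masks of the same shape, and that a single background value is subtracted. The function names and the toy image are hypothetical.

```python
def mean_intensity(image, mask):
    """Average pixel intensity over the pixels selected by a boolean mask."""
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, selected in zip(image_row, mask_row)
        if selected
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def nuclear_cytoplasmic_ratio(image, nucleus_mask, cytoplasm_mask, background=0.0):
    """Ratio of background-corrected mean intensities (nucleus / cytoplasm)."""
    nuclear = mean_intensity(image, nucleus_mask) - background
    cytoplasmic = mean_intensity(image, cytoplasm_mask) - background
    if cytoplasmic <= 0:
        raise ValueError("cytoplasmic signal is not above background")
    return nuclear / cytoplasmic


# Toy 3x3 image: a bright 2x2 "nucleus" in the lower-right corner.
toy_image = [
    [10, 10, 10],
    [10, 50, 50],
    [10, 50, 50],
]
toy_nucleus = [
    [False, False, False],
    [False, True, True],
    [False, True, True],
]
# Here the cytoplasm is simply every pixel outside the nucleus.
toy_cytoplasm = [[not selected for selected in row] for row in toy_nucleus]
toy_ratio = nuclear_cytoplasmic_ratio(toy_image, toy_nucleus, toy_cytoplasm)
```

    A ratio computed per cell in this way could then be compared across conditions, for example with and without an interacting partner, to quantify partial redistribution that is hard to judge by eye.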

  19. Analysis of the influence of subcellular localization of the HIV Rev protein on Rev-dependent gene expression by multi-fluorescence live-cell imaging

    International Nuclear Information System (INIS)

    Wolff, Horst; Hadian, Kamyar; Ziegler, Manja; Weierich, Claudia; Kramer-Hammerle, Susanne; Kleinschmidt, Andrea; Erfle, Volker; Brack-Werner, Ruth

    2006-01-01

    The human immunodeficiency virus Rev protein is a post-transcriptional activator of HIV gene expression. Rev is a nucleocytoplasmic shuttle protein that displays characteristic nuclear/nucleolar subcellular localization in various cell lines. Cytoplasmic localization of Rev occurs under various conditions disrupting Rev function. The goal of this study was to investigate the relationship between localization of Rev and its functional activity in living cells. A triple-fluorescent imaging assay, called AQ-FIND, was established for automatic quantitative evaluation of nucleocytoplasmic distribution of fluorescently tagged proteins. This assay was used to screen 500 rev genes generated by error-prone PCR for Rev mutants with different localization phenotypes. Activities of the Rev mutants were determined with a second quantitative, dual-fluorescent reporter assay. In HeLa cells, the majority of nuclear Rev mutants had activities similar to wild-type Rev. The activities of Rev mutants with abnormal cytoplasmic localization ranged from moderately impaired to nonfunctional. There was no linear correlation between subcellular distribution and levels of Rev activity. In astrocytes, nuclear Rev mutants showed similar impaired activities as the cytoplasmic wild-type Rev. Our data suggest that steady-state subcellular localization is not a primary regulator of Rev activity but may change as a secondary consequence of altered Rev function. The methodologies described here have potential for studying the significance of subcellular localization for functions of other regulatory factors

  20. Interferon-inducible p200-family protein IFI16, an innate immune sensor for cytosolic and nuclear double-stranded DNA: regulation of subcellular localization.

    Science.gov (United States)

    Veeranki, Sudhakar; Choubey, Divaker

    2012-01-01

    The interferon (IFN)-inducible p200-protein family includes structurally related murine (for example, p202a, p202b, p204, and Aim2) and human (for example, AIM2 and IFI16) proteins. All proteins in the family share a partially conserved repeat of 200-amino acid residues (also called HIN-200 domain) in the C-terminus. Additionally, most proteins (except the p202a and p202b proteins) also share a protein-protein interaction pyrin domain (PYD) in the N-terminus. The HIN-200 domain contains two consecutive oligosaccharide/oligonucleotide binding folds (OB-folds) to bind double stranded DNA (dsDNA). The PYD domain in proteins allows interactions with the family members and an adaptor protein ASC. Upon sensing cytosolic dsDNA, Aim2, p204, and AIM2 proteins recruit ASC protein to form an inflammasome, resulting in increased production of proinflammatory cytokines. However, IFI16 protein can sense cytosolic as well as nuclear dsDNA. Interestingly, the IFI16 protein contains a nuclear localization signal (NLS). Accordingly, the initial studies had indicated that the endogenous IFI16 protein is detected in the nucleus and within the nucleus in the nucleolus. However, several recent reports suggest that subcellular localization of IFI16 protein in nuclear versus cytoplasmic (or both) compartment depends on cell type. Given that the IFI16 protein can sense cytosolic as well as nuclear dsDNA and can initiate different innate immune responses (production of IFN-β versus proinflammatory cytokines), here we evaluate the experimental evidence for the regulation of subcellular localization of IFI16 protein in various cell types. We conclude that further studies are needed to understand the molecular mechanisms that regulate the subcellular localization of IFI16 protein. Published by Elsevier Ltd.

  1. Subcellular localization of hepatitis E virus (HEV) replicase

    International Nuclear Information System (INIS)

    Rehman, Shagufta; Kapur, Neeraj; Durgapal, Hemlata; Panda, Subrat Kumar

    2008-01-01

    Hepatitis E virus (HEV) is a hepatotropic virus with a single sense-strand RNA genome of ∼ 7.2 kb in length. Details of the intracellular site of HEV replication can pave further understanding of HEV biology. In-frame fusion construct of functionally active replicase-enhanced green fluorescent protein (EGFP) gene was made in eukaryotic expression vector. The functionality of replicase-EGFP fusion protein was established by its ability to synthesize negative-strand viral RNA in vivo, by strand-specific anchored RT-PCR and molecular beacon binding. Subcellular co-localization was carried out using organelle specific fluorophores and by immuno-electron microscopy. Fluorescence Resonance Energy Transfer (FRET) demonstrated the interaction of this protein with the 3' end of HEV genome. The results show localization of replicase on the endoplasmic reticulum membranes. The protein regions responsible for membrane localization was predicted and identified by use of deletion mutants. Endoplasmic reticulum was identified as the site of replicase localization and possible site of replication

  2. Expression and subcellular localization of antiporter regulating ...

    African Journals Online (AJOL)

    We examined the expression and subcellular localization of antiporter regulating protein OsARP in a submergence tolerant rice (Oryza sativa L.) cultivar FR13A. In the public databases, this protein was designated as putative Os02g0465900 protein. The cDNA containing the full-length sequence of OsARP gene was ...

  3. In silico sequence analysis and homology modeling of predicted beta-amylase 7-like protein in Brachypodium distachyon L.

    Directory of Open Access Journals (Sweden)

    ERTUĞRUL FILIZ

    2014-04-01

    Full Text Available Beta-amylase (β-amylase, EC 3.2.1.2) is an enzyme that catalyses hydrolysis of glucosidic bonds in polysaccharides. In this study, we analyzed the protein sequence of a predicted beta-amylase 7-like protein in Brachypodium distachyon. The isoelectric point (pI) was 5.23, indicating an acidic character, while the instability index (II) was 50.28, classifying the protein as unstable. Subcellular localization prediction using CELLO v.2.5 indicated that the protein may reside in the chloroplast. The 3D structure of the protein was built by comparative homology modeling with SWISS-MODEL. The accuracy of the predicted 3D structure was checked by Ramachandran plot analysis, which showed 95.4% in the favored region. The results of our study contribute to understanding of β-amylase protein structure in grass species and will provide a scientific basis for 3D modeling of beta-amylase proteins in further studies.

  4. Locating proteins in the cell using TargetP, SignalP and related tools

    DEFF Research Database (Denmark)

    Emanuelsson, O.; Brunak, Søren; von Heijne, G.

    2007-01-01

    of methods to predict subcellular localization based on these sorting signals and other sequence properties. We then outline how to use a number of internet-accessible tools to arrive at a reliable subcellular localization prediction for eukaryotic and prokaryotic proteins. In particular, we provide detailed...

  5. Accurate Classification of Protein Subcellular Localization from High-Throughput Microscopy Images Using Deep Learning

    Directory of Open Access Journals (Sweden)

    Tanel Pärnamaa

    2017-05-01

    Full Text Available High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently-tagged protein resides, a task relatively simple for an experienced human, but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving per cell localization classification accuracy of 91%, and per protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and demonstrate the usefulness of deep learning for high-throughput microscopy.
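
    The gap between per-cell and per-protein accuracy reported above can be illustrated with a small sketch. It assumes, purely for illustration, that per-cell predictions are pooled per protein by majority vote; the abstract does not state the aggregation rule, and the protein identifiers and labels below are hypothetical.

```python
from collections import Counter, defaultdict


def per_protein_calls(cell_predictions):
    """Pool (protein_id, predicted_compartment) pairs by majority vote."""
    votes = defaultdict(Counter)
    for protein, label in cell_predictions:
        votes[protein][label] += 1
    # most_common(1) returns the label with the highest vote count.
    return {p: c.most_common(1)[0][0] for p, c in votes.items()}


def accuracy(predicted, truth):
    """Fraction of keys in `predicted` whose label matches `truth`."""
    hits = sum(1 for key, label in predicted.items() if truth[key] == label)
    return hits / len(predicted)


# Hypothetical ground truth and per-cell classifier outputs.
truth = {"protA": "nucleus", "protB": "mitochondria"}
cells = [
    ("protA", "nucleus"),
    ("protA", "nucleus"),
    ("protA", "cytoplasm"),
    ("protB", "mitochondria"),
    ("protB", "mitochondria"),
    ("protB", "ER"),
    ("protB", "mitochondria"),
]

cell_hits = sum(1 for protein, label in cells if truth[protein] == label)
per_cell_accuracy = cell_hits / len(cells)
per_protein_accuracy = accuracy(per_protein_calls(cells), truth)
print(per_cell_accuracy, per_protein_accuracy)
```

    Because individual misclassified cells are outvoted by the correctly classified majority, the pooled per-protein accuracy in this toy example is higher than the per-cell accuracy.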

  6. Accurate Classification of Protein Subcellular Localization from High-Throughput Microscopy Images Using Deep Learning.

    Science.gov (United States)

    Pärnamaa, Tanel; Parts, Leopold

    2017-05-05

    High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently-tagged protein resides, a task relatively simple for an experienced human, but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving per cell localization classification accuracy of 91%, and per protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and demonstrate the usefulness of deep learning for high-throughput microscopy. Copyright © 2017 Parnamaa and Parts.

  7. Correlation of N-myc downstream-regulated gene 1 subcellular localization and lymph node metastases of colorectal neoplasms

    Energy Technology Data Exchange (ETDEWEB)

    Song, Yan [Medical Research Center, Shandong Provincial Qianfoshan Hospital, Shandong University, Jinan 250014 (China); Lv, Liyang [Department of Health, Jinan Military Area Command, Jinan 250022 (China); Du, Juan; Yue, Longtao [Medical Research Center, Shandong Provincial Qianfoshan Hospital, Shandong University, Jinan 250014 (China); Cao, Lili, E-mail: cllly22@163.com [Medical Research Center, Shandong Provincial Qianfoshan Hospital, Shandong University, Jinan 250014 (China)

    2013-09-20

    Highlights: •We clarified NDRG1 subcellular location in colorectal cancer. •We found the changes of NDRG1 distribution during colorectal cancer progression. •We clarified the correlation between NDRG1 distribution and lymph node metastasis. •It is possible that NDRG1 subcellular localization may determine its function. •Maybe NDRG1 is valuable early diagnostic markers for metastasis. -- Abstract: In colorectal neoplasms, N-myc downstream-regulated gene 1 (NDRG1) is a primarily cytoplasmic protein, but it is also expressed on the cell membrane and in the nucleus. NDRG1 is involved in various stages of tumor development in colorectal cancer, and it is possible that the different subcellular localizations may determine the function of NDRG1 protein. Here, we attempt to clarify the characteristics of NDRG1 protein subcellular localization during the progression of colorectal cancer. We examined NDRG1 expression in 49 colorectal cancer patients in cancerous, non-cancerous, and corresponding lymph node tissues. Cytoplasmic and membrane NDRG1 expression was higher in the lymph nodes with metastases than in those without metastases (P < 0.01). Nuclear NDRG1 expression in colorectal neoplasms was significantly higher than in the normal colorectal mucosa, and yet the normal colorectal mucosa showed no nuclear expression. Furthermore, our results showed higher cytoplasmic NDRG1 expression was better for differentiation, and higher membrane NDRG1 expression resulted in a greater possibility of lymph node metastasis. These data indicate that a certain relationship between the cytoplasmic and membrane expression of NDRG1 in lymph nodes exists with lymph node metastasis. NDRG1 expression may translocate from the membrane of the colorectal cancer cells to the nucleus, where it is involved in lymph node metastasis. Combination analysis of NDRG1 subcellular expression and clinical variables will help predict the incidence of lymph node metastasis.

  8. Correlation of N-myc downstream-regulated gene 1 subcellular localization and lymph node metastases of colorectal neoplasms

    International Nuclear Information System (INIS)

    Song, Yan; Lv, Liyang; Du, Juan; Yue, Longtao; Cao, Lili

    2013-01-01

    Highlights: •We clarified NDRG1 subcellular location in colorectal cancer. •We found the changes of NDRG1 distribution during colorectal cancer progression. •We clarified the correlation between NDRG1 distribution and lymph node metastasis. •It is possible that NDRG1 subcellular localization may determine its function. •Maybe NDRG1 is valuable early diagnostic markers for metastasis. -- Abstract: In colorectal neoplasms, N-myc downstream-regulated gene 1 (NDRG1) is a primarily cytoplasmic protein, but it is also expressed on the cell membrane and in the nucleus. NDRG1 is involved in various stages of tumor development in colorectal cancer, and it is possible that the different subcellular localizations may determine the function of NDRG1 protein. Here, we attempt to clarify the characteristics of NDRG1 protein subcellular localization during the progression of colorectal cancer. We examined NDRG1 expression in 49 colorectal cancer patients in cancerous, non-cancerous, and corresponding lymph node tissues. Cytoplasmic and membrane NDRG1 expression was higher in the lymph nodes with metastases than in those without metastases (P < 0.01). Nuclear NDRG1 expression in colorectal neoplasms was significantly higher than in the normal colorectal mucosa, and yet the normal colorectal mucosa showed no nuclear expression. Furthermore, our results showed higher cytoplasmic NDRG1 expression was better for differentiation, and higher membrane NDRG1 expression resulted in a greater possibility of lymph node metastasis. These data indicate that a certain relationship between the cytoplasmic and membrane expression of NDRG1 in lymph nodes exists with lymph node metastasis. NDRG1 expression may translocate from the membrane of the colorectal cancer cells to the nucleus, where it is involved in lymph node metastasis. Combination analysis of NDRG1 subcellular expression and clinical variables will help predict the incidence of lymph node metastasis

  9. An ensemble method for predicting subnuclear localizations from primary protein structures.

    Directory of Open Access Journals (Sweden)

    Guo Sheng Han

    Full Text Available BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information including Gene Ontology annotations and kernel fusion have respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; another is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of high dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method
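
    As a rough illustration of a feature extraction method "based on amino acid classifications", the sketch below computes the composition of a sequence over a reduced alphabet. The three-group classification, the function name `group_composition`, and the example sequence are assumptions made here for illustration; they are not the 11 methods used in the record above.

```python
# Illustrative three-class grouping of the 20 standard amino acids
# (an assumption for this sketch, not the classification used in the paper).
GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "charged": "DEKR",
}
RESIDUE_TO_GROUP = {
    residue: name for name, residues in GROUPS.items() for residue in residues
}


def group_composition(sequence):
    """Return the fraction of residues in each group, in GROUPS order.

    Characters outside the 20 standard residues (e.g. 'X') are ignored.
    """
    counts = {name: 0 for name in GROUPS}
    total = 0
    for residue in sequence.upper():
        group = RESIDUE_TO_GROUP.get(residue)
        if group is None:
            continue
        counts[group] += 1
        total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[name] / total for name in GROUPS]


# Hypothetical peptide: 4 hydrophobic (M, L, A, V) and 4 charged (K, K, D, E).
features = group_composition("MKKLAVDE")
print(features)
```

    Fixed-length vectors of this kind, concatenated across several groupings and physicochemical scales, are the sort of input a multiclass classifier such as a support vector machine can be trained on.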

  10. Subcellular redistribution of trimeric G-proteins – potential mechanism of desensitization of hormone response: internalisation, solubilization, down-regulation

    Czech Academy of Sciences Publication Activity Database

    Drastichová, Zdeňka; Bouřová, Lenka; Lisý, Václav; Hejnová, L.; Rudajev, Vladimír; Stöhr, Jiří; Durchánková, Dana; Ostašov, Pavel; Teisinger, Jan; Soukup, Tomáš; Novotný, Jiří; Svoboda, Petr

    2008-01-01

    Roč. 57, Suppl.3 (2008), S1-S10 ISSN 0862-8408 R&D Projects: GA MŠk(CZ) LC554; GA ČR(CZ) GA309/06/0121 Institutional research plan: CEZ:AV0Z50110509 Keywords : brain * subcellular fractionation * trimeric G-proteins Subject RIV: CE - Biochemistry Impact factor: 1.653, year: 2008

  11. The PDZ and band 4.1 containing protein Frmpd1 regulates the subcellular location of activator of G-protein signaling 3 and its interaction with G-proteins.

    Science.gov (United States)

    An, Ningfei; Blumer, Joe B; Bernard, Michael L; Lanier, Stephen M

    2008-09-05

    Activator of G-protein signaling 3 (AGS3) is one of nine mammalian proteins containing one or more G-protein regulatory (GPR) motifs that stabilize the GDP-bound conformation of Galphai. Such proteins have revealed unexpected functional diversity for the "G-switch" in the control of events within the cell independent of the role of heterotrimeric G-proteins as transducers for G-protein-coupled receptors at the cell surface. A key question regarding this class of proteins is what controls their subcellular positioning and interaction with G-proteins. We conducted a series of yeast two-hybrid screens to identify proteins interacting with the tetratricopeptide repeat (TPR) of AGS3, which plays an important role in subcellular positioning of the protein. We report the identification of Frmpd1 (FERM and PDZ domain containing 1) as a regulatory binding partner of AGS3. Frmpd1 binds to the TPR domain of AGS3 and coimmunoprecipitates with AGS3 from cell lysates. Cell fractionation indicated that Frmpd1 stabilizes AGS3 in a membrane fraction. Upon cotransfection of COS7 cells with Frmpd1-GFP and AGS3-mRFP, AGS3-mRFP is observed in regions of the cell cortex and also in membrane extensions or processes where it appears to be colocalized with Frmpd1-GFP based upon the merged fluorescent signals. Frmpd1 knockdown (siRNA) in Cath.a-differentiated neuronal cells decreased the level of endogenous AGS3 in membrane fractions by approximately 50% and enhanced the alpha2-adrenergic receptor-mediated inhibition of forskolin-induced increases in cAMP. The coimmunoprecipitation of Frmpd1 with AGS3 is lost as the amount of Galphai3 in the cell is increased and AGS3 apparently switches its binding partner from Frmpd1 to Galphai3 indicating that the interaction of AGS3 with Frmpd1 and Galphai3 is mutually exclusive. Mechanistically, Frmpd1 may position AGS3 in a membrane environment where it then interacts with Galphai in a regulated manner.

  12. Subcellular fractionation and localization studies reveal a direct interaction of the Fragile X Mental Retardation Protein (FMRP) with nucleolin

    NARCIS (Netherlands)

    Taha, M.S.; Nouri, K.; Milroy, L.G.; Moll, J.M.; Herrmann, C.; Brunsveld, L.; Piekorz, R.P.; Ahmadian, M.R.

    2014-01-01

    Fragile X mental Retardation Protein (FMRP) is a well-known regulator of local translation of its mRNA targets in neurons. However, despite its ubiquitous expression, the role of FMRP remains ill-defined in other cell types. In this study we investigated the subcellular distribution of FMRP and its

  13. Subcellular Trafficking of the Papillomavirus Genome during Initial Infection: The Remarkable Abilities of Minor Capsid Protein L2

    Directory of Open Access Journals (Sweden)

    Samuel K. Campos

    2017-12-01

    Full Text Available Since 2012, our understanding of human papillomavirus (HPV) subcellular trafficking has undergone a drastic paradigm shift. Work from multiple laboratories has revealed that HPV has evolved a unique means to deliver its viral genome (vDNA) to the cell nucleus, relying on myriad host cell proteins and processes. The major breakthrough finding from these recent endeavors has been the realization of L2-dependent utilization of cellular sorting factors for the retrograde transport of vDNA away from degradative endo/lysosomal compartments to the Golgi, prior to mitosis-dependent nuclear accumulation of L2/vDNA. An overview of current models of HPV entry, subcellular trafficking, and the role of L2 during initial infection is provided below, highlighting unresolved questions and gaps in knowledge.

  14. Trehalose Alters Subcellular Trafficking and the Metabolism of the Alzheimer-associated Amyloid Precursor Protein.

    Science.gov (United States)

    Tien, Nguyen T; Karaca, Ilker; Tamboli, Irfan Y; Walter, Jochen

    2016-05-13

    The disaccharide trehalose is commonly considered to stimulate autophagy. Cell treatment with trehalose could decrease cytosolic aggregates of potentially pathogenic proteins, including mutant huntingtin, α-synuclein, and phosphorylated tau that are associated with neurodegenerative diseases. Here, we demonstrate that trehalose also alters the metabolism of the Alzheimer disease-related amyloid precursor protein (APP). Cell treatment with trehalose decreased the degradation of full-length APP and its C-terminal fragments. Trehalose also reduced the secretion of the amyloid-β peptide. Biochemical and cell biological experiments revealed that trehalose alters the subcellular distribution and decreases the degradation of APP C-terminal fragments in endolysosomal compartments. Trehalose also led to strong accumulation of the autophagic marker proteins LC3-II and p62, and decreased the proteolytic activation of the lysosomal hydrolase cathepsin D. The combined data indicate that trehalose decreases the lysosomal metabolism of APP by altering its endocytic vesicular transport. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  15. A new method of high-speed cellular protein separation and insight into subcellular compartmentalization of proteins.

    Science.gov (United States)

    Png, Evelyn; Lan, WanWen; Lazaroo, Melisa; Chen, Silin; Zhou, Lei; Tong, Louis

    2011-05-01

    Transglutaminase (TGM)-2 is a ubiquitous protein with important cellular functions such as regulation of cytoskeleton, cell adhesion, apoptosis, energy metabolism, and stress signaling. We identified several proteins that may interact with TGM-2 through a discovery-based proteomics method via pull down of flag-tagged TGM-2 peptide fragments. The distribution of these potential binding partners of TGM-2 was studied in subcellular fractions separated by density using novel high-speed centricollation technology. Centricollation is a compressed air-driven, low-temperature stepwise ultracentrifugation procedure where low extraction volumes can be processed in a relatively short time in non-denaturing separation conditions with high recovery yield. The fractions were characterized by immunoblots against known organelle markers. The changes in the concentrations of the binding partners were studied in cells expressing short hairpin RNA against TGM-2 (shTG). Desmin, mitochondrial intramembrane cleaving protease (PARL), protein tyrosine kinase (NTRK3), and serine protease (PRSS3) were found to be less concentrated in the 8.5%, 10%, 15%, and 20% sucrose fractions (SFs) from the lysate of shTG cells. The Golgi-associated protein (GOLGA2) was predominantly localized in 15% SF fraction, and in shTG, this shifted to predominantly in the 8.5% SF and showed larger aggregations in the cytosol of cells on immunofluorescent staining compared to control. Based on the relative concentrations of these proteins, we propose how trafficking of such proteins between cellular compartments can occur to regulate cell function. Centricollation is useful for elucidating biological function at the molecular level, especially when combined with traditional cell biology techniques.

  16. Effect of pH 5 enzyme from liver on the protein synthesis by mammary gland subcellular fractions in vitro

    International Nuclear Information System (INIS)

    Singh, Jaspal; Singh, Ajit; Ganguli, N.C.

    1976-01-01

    The effect of the pH 5 enzyme fraction of liver on the protein-synthesizing activity of the subcellular fractions of the mammary gland has been investigated. Results indicate that (1) lactating liver pH 5 enzyme stimulates protein synthesis, which is enhanced by the addition of an ATP-generating system, and (2) the enzyme fraction from the non-lactating liver inhibits protein synthesis by mammary fractions, but in some cases, such as the mitochondrial and supernatant fractions of mammary gland, it elevates the synthesis when supplemented with an ATP-generating system. Chlorella protein hydrolysate-14C was used as a tracer and rabbits were used as experimental animals. (M.G.B.)

  17. Top Down Proteomics Reveals Mature Proteoforms Expressed in Subcellular Fractions of the Echinococcus granulosus Preadult Stage.

    Science.gov (United States)

    Lorenzatto, Karina R; Kim, Kyunggon; Ntai, Ioanna; Paludo, Gabriela P; Camargo de Lima, Jeferson; Thomas, Paul M; Kelleher, Neil L; Ferreira, Henrique B

    2015-11-06

    Echinococcus granulosus is the causative agent of cystic hydatid disease, a neglected zoonosis responsible for high morbidity and mortality. Several molecular mechanisms underlying parasite biology remain poorly understood. Here, E. granulosus subcellular fractions were analyzed by top down and bottom up proteomics for protein identification and characterization of co-translational and post-translational modifications (CTMs and PTMs, respectively). Nuclear and cytosolic extracts of E. granulosus protoscoleces were fractionated by 10% GELFrEE and proteins under 30 kDa were analyzed by LC-MS/MS. By top down analysis, 186 proteins and 207 proteoforms were identified, of which 122 and 52 proteoforms were exclusively detected in nuclear and cytosolic fractions, respectively. CTMs were evident as 71% of the proteoforms had methionine excised and 47% were N-terminal acetylated. In addition, in silico internal acetylation prediction coupled with top down MS allowed the characterization of 9 proteins differentially acetylated, including histones. Bottom up analysis increased the overall number of identified proteins in nuclear and cytosolic fractions to 154 and 112, respectively. Overall, our results provided the first description of the low mass proteome of E. granulosus subcellular fractions and highlighted proteoforms with CTMs and PTMS whose characterization may lead to another level of understanding about molecular mechanisms controlling parasitic flatworm biology.

  18. The incorporation of labelled amino acids into the subcellular fractions of the rabbit brain

    International Nuclear Information System (INIS)

    Ogrodnik, W.

    1980-01-01

    Radioactive amino acids were injected into the fourth ventricle of adult rabbits. After 3, 6 and 13 hours the animals were killed and subcellular fractions were prepared from their brain tissue. Nucleic acids were extracted and quantitatively determined from nuclear, myelin, mitochondrial, microsomal and cytoplasmic fractions. The radioactivity was determined in the protein and nucleic acid fractions. It was found that the incorporation of radioactive amino acids increased with time. In the analyzed subcellular fractions a very rapid incorporation of glutamic acid and leucine into cytoplasmic proteins was observed. The chromatographic analysis of the nucleic acids showed that radioactivity in the nucleic acid fractions was due to contamination with radioactive protein. Radioactive aminoacyl-tRNA was not found in the nucleic acid fractions extracted from the different subcellular fractions. (author)

  19. Osmotic stress changes the expression and subcellular localization of the Batten disease protein CLN3.

    Directory of Open Access Journals (Sweden)

    Amanda Getty

    Full Text Available Juvenile CLN3 disease (formerly known as juvenile neuronal ceroid lipofuscinosis) is a fatal childhood neurodegenerative disorder caused by mutations in the CLN3 gene. CLN3 encodes a putative lysosomal transmembrane protein with unknown function. Previous cell culture studies using CLN3-overexpressing vectors and/or anti-CLN3 antibodies with questionable specificity have also localized CLN3 in cellular structures other than lysosomes. Osmoregulation of the mouse Cln3 mRNA level in kidney cells was recently reported. To clarify the subcellular localization of the CLN3 protein and to investigate if human CLN3 expression and localization is affected by osmotic changes we generated a stably transfected BHK (baby hamster kidney) cell line that expresses a moderate level of myc-tagged human CLN3 under the control of the human ubiquitin C promoter. Hyperosmolarity (800 mOsm), achieved by either NaCl/urea or sucrose, dramatically increased the mRNA and protein levels of CLN3 as determined by quantitative real-time PCR and Western blotting. Under isotonic conditions (300 mOsm), human CLN3 was found in a punctate vesicular pattern surrounding the nucleus with prominent Golgi and lysosomal localizations. CLN3-positive early endosomes, late endosomes and cholesterol/sphingolipid-enriched plasma membrane microdomain caveolae were also observed. Increasing the osmolarity of the culture medium to 800 mOsm extended CLN3 distribution away from the perinuclear region and enhanced the lysosomal localization of CLN3. Our results reveal that CLN3 has multiple subcellular localizations within the cell, which, together with its expression, prominently change following osmotic stress. These data suggest that CLN3 is involved in the response and adaptation to cellular stress.

  20. Rice DB: an Oryza Information Portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis.

    Science.gov (United States)

    Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James

    2013-12-01

    Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or 'expressology', thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). © 2013 The Authors The Plant Journal © 2013 John Wiley & Sons Ltd.

  1. The Induction of Recombinant Protein Bodies in Different Subcellular Compartments Reveals a Cryptic Plastid-Targeting Signal in the 27-kDa γ-Zein Sequence

    Energy Technology Data Exchange (ETDEWEB)

    Hofbauer, Anna; Peters, Jenny; Arcalis, Elsa [Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna (Austria); Rademacher, Thomas [Institute of Molecular Biotechnology, RWTH Aachen University, Aachen (Germany); Lampel, Johannes [Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna (Austria); Eudes, François [Agriculture and Agri-Food Canada, Lethbridge, AB (Canada); Vitale, Alessandro [Institute of Agricultural Biology and Biotechnology, National Research Council (CNR), Milan (Italy); Stoger, Eva, E-mail: eva.stoger@boku.ac.at [Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna (Austria)

    2014-12-11

    Naturally occurring storage proteins such as zeins are used as fusion partners for recombinant proteins because they induce the formation of ectopic storage organelles known as protein bodies (PBs) where the proteins are stabilized by intermolecular interactions and the formation of disulfide bonds. Endogenous PBs are derived from the endoplasmic reticulum (ER). Here, we have used different targeting sequences to determine whether ectopic PBs composed of the N-terminal portion of mature 27 kDa γ-zein added to a fluorescent protein could be induced to form elsewhere in the cell. The addition of a transit peptide for targeting to plastids causes PB formation in the stroma, whereas in the absence of any added targeting sequence PBs were typically associated with the plastid envelope, revealing the presence of a cryptic plastid-targeting signal within the γ-zein cysteine-rich domain. The subcellular localization of the PBs influences their morphology and the solubility of the stored recombinant fusion protein. Our results indicate that the biogenesis and budding of PBs does not require ER-specific factors and therefore, confirm that γ-zein is a versatile fusion partner for recombinant proteins offering unique opportunities for the accumulation and bioencapsulation of recombinant proteins in different subcellular compartments.

  2. The Induction of Recombinant Protein Bodies in Different Subcellular Compartments Reveals a Cryptic Plastid-Targeting Signal in the 27-kDa γ-Zein Sequence

    International Nuclear Information System (INIS)

    Hofbauer, Anna; Peters, Jenny; Arcalis, Elsa; Rademacher, Thomas; Lampel, Johannes; Eudes, François; Vitale, Alessandro; Stoger, Eva

    2014-01-01

    Naturally occurring storage proteins such as zeins are used as fusion partners for recombinant proteins because they induce the formation of ectopic storage organelles known as protein bodies (PBs) where the proteins are stabilized by intermolecular interactions and the formation of disulfide bonds. Endogenous PBs are derived from the endoplasmic reticulum (ER). Here, we have used different targeting sequences to determine whether ectopic PBs composed of the N-terminal portion of mature 27 kDa γ-zein added to a fluorescent protein could be induced to form elsewhere in the cell. The addition of a transit peptide for targeting to plastids causes PB formation in the stroma, whereas in the absence of any added targeting sequence PBs were typically associated with the plastid envelope, revealing the presence of a cryptic plastid-targeting signal within the γ-zein cysteine-rich domain. The subcellular localization of the PBs influences their morphology and the solubility of the stored recombinant fusion protein. Our results indicate that the biogenesis and budding of PBs does not require ER-specific factors and therefore, confirm that γ-zein is a versatile fusion partner for recombinant proteins offering unique opportunities for the accumulation and bioencapsulation of recombinant proteins in different subcellular compartments.

  3. Expression, purification, characterization and subcellular localization of the goose parvovirus rep1 protein.

    Science.gov (United States)

    Chen, Zongyan; Li, Chuanfeng; Peng, Gaojing; Liu, Guangqing

    2013-07-01

    The goose parvovirus (GPV) Rep1 protein is both essential for viral replication and a potential target for GPV diagnosis, but its protein characterization and intracellular localization are not clear. We constructed a recombinant plasmid, pET28a/GPV-Rep1, and expressed the Rep1 gene in BL21 (DE3) Escherichia coli. A protein approximately 75 kDa in size was obtained from lysates of E. coli cells expressing the recombinant plasmid. SDS-PAGE analysis showed that after induction with 0.6 mM isopropyl β-D-thiogalactopyranoside (IPTG) at 30°C for 5 h, the Rep1 protein was highly overexpressed. Two purification methods, salinity-gradient elution and Ni-NTA affinity chromatography, were used. The amount of Rep1 protein obtained by Ni-NTA affinity chromatography was 41.23 mg, while 119.9 mg of Rep1 protein was obtained by salinity-gradient elution from a 1 L E. coli BL21 (DE3) culture. An immunogenicity analysis showed that the protein could significantly elicit a specific antibody response in immunized goslings compared to control groups. Antibody titers peaked at 1:5120 (optical density at 450 nm (OD450) = 3.9) on day 28 after immunization, whereas mean titers reached 1:10,240 (OD450 = 4.2) in gosling groups immunized with a commercially available GPV-attenuated vaccine strain. Experiments examining subcellular localization showed that the Rep1 protein appeared to associate predominantly with the nuclear membrane, especially during later times of infection. This work provides a basis for biochemical and structural studies on the GPV Rep1 protein.

  4. Distinct domains within the NITROGEN LIMITATION ADAPTATION protein mediate its subcellular localization and function in the nitrate-dependent phosphate homeostasis pathway

    Science.gov (United States)

    The NITROGEN LIMITATION ADAPTATION (NLA) protein is a RING-type E3 ubiquitin ligase that plays an essential role in the regulation of nitrogen and phosphate homeostasis. NLA is localized to two distinct subcellular sites, the plasma membrane and nucleus, and contains four distinct domains: i) a RING...

  5. Subcellular controls of mercury trophic transfer to a marine fish

    Energy Technology Data Exchange (ETDEWEB)

    Dang Fei [Department of Biology, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Kowloon (Hong Kong); Wang Wenxiong, E-mail: wwang@ust.hk [Department of Biology, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Kowloon (Hong Kong)

    2010-09-15

    Different behaviors of inorganic mercury [Hg(II)] and methylmercury (MeHg) during trophic transfer along the marine food chain have been widely reported, but the mechanisms are not fully understood. The bioavailability of ingested mercury, quantified by assimilation efficiency (AE), was investigated in a marine fish, the grunt Terapon jarbua, based on mercury subcellular partitioning in prey and purified subcellular fractions of prey tissues. The subcellular distribution of Hg(II) differed substantially among prey types, with cellular debris being a major (49-57% in bivalves) or secondary (14-19% in other prey) binding pool. However, MeHg distribution varied little among prey types, with most MeHg (43-79%) in heat-stable protein (HSP) fraction. The greater AEs measured for MeHg (90-94%) than for Hg(II) (23-43%) confirmed the findings of previous studies. Bioavailability of each purified subcellular fraction rather than the proposed trophically available metal (TAM) fraction could better elucidate mercury assimilation difference. Hg(II) associated with insoluble fraction (e.g. cellular debris) was less bioavailable than that in soluble fraction (e.g. HSP). However, subcellular distribution was shown to be less important for MeHg, with each fraction having comparable MeHg bioavailability. Subcellular distribution in prey should be an important consideration in mercury trophic transfer studies.

  6. Subcellular controls of mercury trophic transfer to a marine fish

    International Nuclear Information System (INIS)

    Dang Fei; Wang Wenxiong

    2010-01-01

    Different behaviors of inorganic mercury [Hg(II)] and methylmercury (MeHg) during trophic transfer along the marine food chain have been widely reported, but the mechanisms are not fully understood. The bioavailability of ingested mercury, quantified by assimilation efficiency (AE), was investigated in a marine fish, the grunt Terapon jarbua, based on mercury subcellular partitioning in prey and purified subcellular fractions of prey tissues. The subcellular distribution of Hg(II) differed substantially among prey types, with cellular debris being a major (49-57% in bivalves) or secondary (14-19% in other prey) binding pool. However, MeHg distribution varied little among prey types, with most MeHg (43-79%) in heat-stable protein (HSP) fraction. The greater AEs measured for MeHg (90-94%) than for Hg(II) (23-43%) confirmed the findings of previous studies. Bioavailability of each purified subcellular fraction rather than the proposed trophically available metal (TAM) fraction could better elucidate mercury assimilation difference. Hg(II) associated with insoluble fraction (e.g. cellular debris) was less bioavailable than that in soluble fraction (e.g. HSP). However, subcellular distribution was shown to be less important for MeHg, with each fraction having comparable MeHg bioavailability. Subcellular distribution in prey should be an important consideration in mercury trophic transfer studies.

  7. Using distant supervised learning to identify protein subcellular localizations from full-text scientific articles.

    Science.gov (United States)

    Zheng, Wu; Blake, Catherine

    2015-10-01

    Databases of curated biomedical knowledge, such as the protein-locations reflected in the UniProtKB database, provide an accurate and useful resource to researchers and decision makers. Our goal is to augment the manual efforts currently used to curate knowledge bases with automated approaches that leverage the increased availability of full-text scientific articles. This paper describes experiments that use distant supervised learning to identify protein subcellular localizations, which are important to understand protein function and to identify candidate drug targets. Experiments consider Swiss-Prot, the manually annotated subset of the UniProtKB protein knowledge base, and 43,000 full-text articles from the Journal of Biological Chemistry that contain just under 11.5 million sentences. The system achieves 0.81 precision and 0.49 recall at sentence level and an accuracy of 57% on held-out instances in a test set. Moreover, the approach identifies 8210 instances that are not in the UniProtKB knowledge base. Manual inspection of the 50 most likely relations showed that 41 (82%) were valid. These results have immediate benefit to researchers interested in protein function, and suggest that distant supervision should be explored to complement other manual data curation efforts. Copyright © 2015 Elsevier Inc. All rights reserved.
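The labelling step behind distant supervision, as described in the abstract above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' system: the `KNOWN_PAIRS` set, the plain substring matching, and the function names are all hypothetical stand-ins for a curated knowledge base and a real entity recognizer.

```python
# Hypothetical (protein, location) pairs standing in for curated knowledge-base entries.
KNOWN_PAIRS = {("SETDB1", "nucleus"), ("ABCD4", "endoplasmic reticulum")}


def label_sentences(sentences, known_pairs=KNOWN_PAIRS):
    """Label each co-mention of a protein and a location in a sentence.

    A co-mention is a (noisy) positive example when the pair is in the
    knowledge base, and a (noisy) negative example otherwise.
    """
    proteins = sorted({protein for protein, _ in known_pairs})
    locations = sorted({location for _, location in known_pairs})
    examples = []
    for sentence in sentences:
        lowered = sentence.lower()
        for protein in proteins:
            if protein.lower() not in lowered:
                continue
            for location in locations:
                if location.lower() in lowered:
                    label = (protein, location) in known_pairs
                    examples.append((sentence, protein, location, label))
    return examples


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    demo = [
        "SETDB1 accumulated in the nucleus after treatment.",
        "ABCD4 was not detected in the nucleus.",
        "No protein is mentioned here.",
    ]
    for example in label_sentences(demo):
        print(example)
    # Derived from the reported precision and recall; not a figure given in the abstract.
    print(round(f1_score(0.81, 0.49), 2))
```

The second demo sentence shows why such labels are noisy: the co-mention is treated as a negative only because the pair is absent from the knowledge base, not because the sentence was read. For reference, the harmonic mean of the reported sentence-level precision (0.81) and recall (0.49) is about 0.61, a derived figure rather than one stated in the abstract.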

  8. Plasma effects on subcellular structures

    International Nuclear Information System (INIS)

    Gweon, Bomi; Kim, Dan Bee; Jung, Heesoo; Choe, Wonho; Kim, Daeyeon; Shin, Jennifer H.

    2010-01-01

    Atmospheric pressure helium plasma treated human hepatocytes exhibit distinctive zones of necrotic and live cells separated by a void. We propose that plasma induced necrosis is attributed to plasma species such as oxygen radicals, charged particles, metastables and/or severe disruption of charged cytoskeletal proteins. Interestingly, uncharged cytoskeletal intermediate filaments are only minimally disturbed by plasma, elucidating the possibility of plasma induced electrostatic effects selectively destroying charged proteins. These bona fide plasma effects, which inflict alterations in specific subcellular structures leading to necrosis and cellular detachment, were not observed by application of helium flow or electric field alone.

  9. Subcellular distribution of folate and folate binding protein in renal proximal tubules

    International Nuclear Information System (INIS)

    Sharkey, C.; Hjelle, J.T.; Selhub, J.

    1986-01-01

    High affinity folate binding protein (FBP) found in brush border membranes derived from renal cortices is thought to be involved in the renal conservation of folate. To examine the mechanisms of folate recovery, the subcellular distribution of FBP and ³H-folate in rabbit renal proximal tubules (PT) was examined using analytical cell fractionation techniques. Tubules contain 3.41 +/- 0.32 picomoles FBP/mg protein (X +/- S.D.; n = 5). Postnuclear supernates (PNS) of PT were layered atop Percoll-sucrose gradients, centrifuged, fractions collected and assayed for various marker enzymes and FBP. Pooled fractions from such gradients were subsequently treated with digitonin and centrifuged in a stoichiometric manner with the activity of the microvillar enzyme, alanylaminopeptidase (AAP); excess FBP distributed with more buoyant particles. Infusion of ³H-folate into rabbit kidneys followed by tubule isolation and fractionation revealed a time dependent shift in distribution of radiolabel from the AAP-rich gradient fractions to a region containing more buoyant particles; radiolabel was not associated with lysosomal markers. EM-radioautography revealed grains over intracellular vesicles. These results are consistent with the hypothesis that folate is recovered by a process involving receptor-mediated endocytosis or transcytosis.

  10. Role of NH2-terminal hydrophobic motif in the subcellular localization of ATP-binding cassette protein subfamily D: Common features in eukaryotic organisms

    International Nuclear Information System (INIS)

    Lee, Asaka; Asahina, Kota; Okamoto, Takumi; Kawaguchi, Kosuke; Kostsin, Dzmitry G.; Kashiwayama, Yoshinori; Takanashi, Kojiro; Yazaki, Kazufumi; Imanaka, Tsuneo; Morita, Masashi

    2014-01-01

    Highlights: • ABCD proteins are classified by the presence or absence of an NH2-terminal hydrophobic segment. • The ABCD proteins with the segment are targeted to peroxisomes. • The ABCD proteins without the segment are targeted to the endoplasmic reticulum. • The role of the segment in organelle targeting is conserved in eukaryotic organisms. - Abstract: In mammals, four ATP-binding cassette (ABC) proteins belonging to subfamily D have been identified. ABCD1–3 possess the NH2-terminal hydrophobic region and are targeted to peroxisomes, while ABCD4, lacking the region, is targeted to the endoplasmic reticulum (ER). Based on hydropathy plot analysis, we found that several eukaryotes have ABCD protein homologs lacking the NH2-terminal hydrophobic segment (H0 motif). To investigate whether the role of the NH2-terminal H0 motif in subcellular localization is conserved across species, we expressed ABCD proteins from several species (metazoan, plant and fungi) in fusion with GFP in CHO cells and examined their subcellular localization. ABCD proteins possessing the NH2-terminal H0 motif were localized to peroxisomes, while ABCD proteins lacking this region lost this capacity. In addition, the deletion of the NH2-terminal H0 motif of ABCD proteins resulted in their localization to the ER. These results suggest that the role of the NH2-terminal H0 motif in organelle targeting is widely conserved in living organisms.

  11. Subcellular distribution of styrene oxide in rat liver

    International Nuclear Information System (INIS)

    Pacifici, G.M.; Cuoci, L.; Rane, A.

    1984-01-01

    The subcellular distribution of (³H)-styrene-7,8-oxide was studied in the rat liver. The compound was added to liver homogenate to give final concentrations of 2 × 10⁻⁵, 2 × 10⁻⁴ and 2 × 10⁻³ M. Subcellular fractions were obtained by differential centrifugation. Most of the styrene oxide (59-88%) was associated with the cytosolic fraction. Less than 15 percent of the compound was retrieved in each of the nuclear, mitochondrial and microsomal fractions. A considerable percentage of radioactivity was found unextractable with the organic solvents, suggesting that styrene oxide reacted with the endogenous compounds. The intracellular distribution of this epoxide was also studied in the perfused rat liver. Comparable results with those previously described were obtained. The binding of styrene oxide to the cytosolic protein was investigated by equilibrium dialysis and ultrafiltration. Only a small percentage of the compound was bound to protein.

  12. Imaging Subcellular Structures in the Living Zebrafish Embryo.

    Science.gov (United States)

    Engerer, Peter; Plucinska, Gabriela; Thong, Rachel; Trovò, Laura; Paquet, Dominik; Godinho, Leanne

    2016-04-02

    In vivo imaging provides unprecedented access to the dynamic behavior of cellular and subcellular structures in their natural context. Performing such imaging experiments in higher vertebrates such as mammals generally requires surgical access to the system under study. The optical accessibility of embryonic and larval zebrafish allows such invasive procedures to be circumvented and permits imaging in the intact organism. Indeed the zebrafish is now a well-established model to visualize dynamic cellular behaviors using in vivo microscopy in a wide range of developmental contexts from proliferation to migration and differentiation. A more recent development is the increasing use of zebrafish to study subcellular events including mitochondrial trafficking and centrosome dynamics. The relative ease with which these subcellular structures can be genetically labeled by fluorescent proteins and the use of light microscopy techniques to image them is transforming the zebrafish into an in vivo model of cell biology. Here we describe methods to generate genetic constructs that fluorescently label organelles, highlighting mitochondria and centrosomes as specific examples. We use the bipartite Gal4-UAS system in multiple configurations to restrict expression to specific cell-types and provide protocols to generate transiently expressing and stable transgenic fish. Finally, we provide guidelines for choosing light microscopy methods that are most suitable for imaging subcellular dynamics.

  13. Probability weighted ensemble transfer learning for predicting interactions between HIV-1 and human proteins.

    Directory of Open Access Journals (Sweden)

    Suyu Mei

    Reconstruction of host-pathogen protein interaction networks is of great significance to reveal the underlying microbial pathogenesis. However, the current experimentally-derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for host-pathogen protein interaction network reconstruction. In this work, we are motivated to address the three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that only the homolog knowledge is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability with a less demanding data constraint. As regards negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Last, we conduct analysis of overlapped predictions between our model and the existing models, and apply the model to novel host-pathogen PPIs recognition for further biological research.
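The idea of weighting each classifier's probability output by its ROC-AUC can be sketched as below. The abstract does not give the exact weighting formula, so the normalized-AUC weighted average used here is an assumption, and the function names and threshold are hypothetical.

```python
def weighted_ensemble(probabilities, aucs):
    """Combine per-classifier positive-class probabilities into one score.

    Each classifier's probability is weighted by its ROC-AUC, and the
    weights are normalized to sum to one (an assumed scheme, not the
    formula from the paper).
    """
    if not aucs or len(probabilities) != len(aucs):
        raise ValueError("need one AUC per classifier probability")
    total = sum(aucs)
    if total <= 0:
        raise ValueError("AUC values must sum to a positive number")
    return sum(p * a for p, a in zip(probabilities, aucs)) / total


def predict_interaction(probabilities, aucs, threshold=0.5):
    """Return True when the weighted score reaches the decision threshold."""
    return weighted_ensemble(probabilities, aucs) >= threshold


if __name__ == "__main__":
    # Two hypothetical classifiers: the more reliable one (AUC 0.8) votes 0.9.
    print(weighted_ensemble([0.9, 0.3], [0.8, 0.6]))
    print(predict_interaction([0.9, 0.3], [0.8, 0.6]))
```

With these illustrative numbers the combined score sits closer to the vote of the classifier with the higher AUC, which is the intended effect of measuring the importance of each knowledge source by its classifier's performance.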

  14. ARAMEMNON, a novel database for Arabidopsis integral membrane proteins

    DEFF Research Database (Denmark)

    Schwacke, Rainer; Schneider, Anja; van der Graaff, Eric

    2003-01-01

    spans and are possibly linked to transport functions. The ARAMEMNON DB enables direct comparison of the predictions of seven different TM span computation programs and the predictions of subcellular localization by eight signal peptide recognition programs. A special function displays the proteins...

  15. Computational prediction of protein-protein interactions in Leishmania predicted proteomes.

    Directory of Open Access Journals (Sweden)

    Antonio M Rezende

    The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and chemotherapy is highly toxic. Therefore, it is clear that only interdisciplinary, integrated studies will succeed in finding new targets for the development of vaccines and drugs. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (Interolog Mapping), and developed a dedicated combined system score to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information related to immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding and usefulness of the predicted networks that can be valuable to select new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing the PPIs for drug or vaccine purposes, along with multiple alignments of the predicted PPIs, was performed, revealing patterns associated with protein turnover. In addition, around 50% of hypothetical proteins present in the networks
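Interolog mapping, the prediction approach named in the abstract above, transfers a known interaction between two template proteins to their orthologs in the target organism. The sketch below illustrates only that core step; the identifiers, the one-to-one ortholog dictionary, and the function name are hypothetical, and the combined confidence score described by the authors is not modelled.

```python
from itertools import combinations


def interolog_map(template_interactions, orthologs):
    """Predict target-organism interactions from template interactions.

    template_interactions: iterable of (template_a, template_b) pairs
        known to interact in a reference database.
    orthologs: dict mapping each target protein to its template ortholog
        (assumed one-to-one here for simplicity).
    Returns a set of (target_a, target_b) pairs, sorted within each pair.
    """
    known = {frozenset(pair) for pair in template_interactions}
    predicted = set()
    for a, b in combinations(sorted(orthologs), 2):
        if frozenset((orthologs[a], orthologs[b])) in known:
            predicted.add((a, b))
    return predicted


if __name__ == "__main__":
    # Hypothetical template interactions and ortholog assignments.
    template = [("T1", "T2"), ("T2", "T3")]
    ortholog_of = {"LmA": "T1", "LmB": "T2", "LmC": "T4"}
    print(interolog_map(template, ortholog_of))
```

In practice the ortholog assignments would come from sequence comparison against an interaction database, and each transferred interaction would carry a confidence score; both are omitted from this sketch.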

  16. Pharmacologic modulation of protein kinase C isozymes: the role of RACKs and subcellular localisation.

    Science.gov (United States)

    Csukai, M; Mochly-Rosen, D

    1999-04-01

    Protein kinase C (PKC) isozymes are highly homologous kinases and several different isozymes can be present in a cell. Each isozyme is likely to mediate unique functions, but pharmacological tools to explore their isozyme-specific roles have not been available until recently. In this review, we describe the development and application of isozyme-selective inhibitors of PKC. The identification of these inhibitors stems from the observation that PKC isozymes are each localised to unique subcellular locations following activation. Inhibitors of this isozyme-unique localisation have been shown to act as selective inhibitors of the functions of individual isozymes. The identification of isozyme-specific inhibitors should allow the exploration of individual PKC isozyme function in a wide range of cell systems. Copyright 1999 The Italian Pharmacological Society.

  17. Subcellular localization and logistics of integral membrane protein biogenesis in Escherichia coli.

    Science.gov (United States)

    Bogdanov, Mikhail; Aboulwafa, Mohammad; Saier, Milton H

    2013-01-01

    Transporters catalyze entry and exit of molecules into and out of cells and organelles, and protein-lipid interactions influence their activities. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) catalyzes transport-coupled sugar phosphorylation as well as nonvectorial sugar phosphorylation in the cytoplasm. The vectorial process is much more sensitive to the lipid environment than the nonvectorial process. Moreover, cytoplasmic micellar forms of these enzyme-porters have been identified, and non-PTS permeases have similarly been shown to exist in 'soluble' forms. The latter porters exhibit lipid-dependent activities and can adopt altered topologies by simply changing the lipid composition. Finally, intracellular membranes and vesicles exist in Escherichia coli leading to the following unanswered questions: (1) what determines whether a PTS permease catalyzes vectorial or nonvectorial sugar phosphorylation? (2) How do phospholipids influence relative amounts of the plasma membrane, intracellular membrane, inner membrane-derived vesicles and cytoplasmic micelles? (3) What regulates the route(s) of permease insertion and transfer into and between the different subcellular sites? (4) Do these various membranous forms have distinct physiological functions? (5) What methods should be utilized to study the biogenesis and interconversion of these membranous structures? While research concerning these questions is still in its infancy, answers will greatly enhance our understanding of protein-lipid interactions and how they control the activities, conformations, cellular locations and biogenesis of integral membrane proteins. Copyright © 2013 S. Karger AG, Basel.

  18. Functionality of system components: Conservation of protein function in protein feature space

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Ussery, David; Brunak, Søren

    2003-01-01

    Many protein features useful for prediction of protein function can be predicted from sequence, including posttranslational modifications, subcellular localization, and physical/chemical properties. We show here that such protein features are more conserved among orthologs than paralogs, indicating they are crucial for protein function and thus subject to selective pressure. This means that a function prediction method based on sequence-derived features may be able to discriminate between proteins with different function even when they have highly similar structure. Also, such a method is likely to perform well on organisms other than the one on which it was trained. We evaluate the performance of such a method, ProtFun, which relies on protein features as its sole input, and show that the method gives similar performance for most eukaryotes and performs much better than anticipated on archaea...

  19. Detection and subcellular localization of dehydrin-like proteins in quinoa (Chenopodium quinoa Willd.) embryos.

    Science.gov (United States)

    Carjuzaa, P; Castellión, M; Distéfano, A J; del Vas, M; Maldonado, S

    2008-01-01

    The aim of this study was to characterize the dehydrin content in mature embryos of two quinoa cultivars, Sajama and Baer La Unión. Cultivar Sajama grows at 3600-4000 m altitude and is adapted to the very arid conditions characteristic of the salty soils of the Bolivian Altiplano, with less than 250 mm of annual rain and a minimum temperature of -1 °C. Cultivar Baer La Unión grows in sea-level regions of central Chile and is adapted to more humid conditions (800 to 1500 mm of annual rain), fertile soils, and temperatures above 5 °C. Western blot analysis of embryo tissues from plants growing under controlled greenhouse conditions clearly revealed the presence of several dehydrin bands (at molecular masses of approximately 30, 32, 50, and 55 kDa), which were common to both cultivars, although the amount of the 30 and 32 kDa bands differed. Nevertheless, when grains originated from their respective natural environments, three extra bands (at molecular masses of approximately 34, 38, and 40 kDa), which were hardly visible in Sajama, and another weak band (at a molecular mass of approximately 28 kDa) were evident in Baer La Unión. In situ immunolocalization microscopy detected dehydrin-like proteins in all axis and cotyledon tissues. At the subcellular level, dehydrins were detected in the plasma membrane, cytoplasm and nucleus. In the cytoplasm, dehydrins were found associated with mitochondria, rough endoplasmic reticulum cisternae, and proplastid membranes. The presence of dehydrins was also recognized in the matrix of protein bodies. In the nucleus, dehydrins were associated with the euchromatin. Upon examining dehydrin composition and subcellular localization in two quinoa cultivars belonging to highly contrasting environments, we conclude that most dehydrins detected here were constitutive components of the quinoa seed developmental program, but some of them (especially the 34, 38, and 40 kDa bands) may reflect quantitative molecular differences

  20. Analysis of potato virus X replicase and TGBp3 subcellular locations

    International Nuclear Information System (INIS)

    Bamunusinghe, Devinka; Hemenway, Cynthia L.; Nelson, Richard S.; Sanderfoot, Anton A.; Ye, Chang M.; Silva, Muniwarage A.T.; Payton, M.; Verchot-Lubicz, Jeanmarie

    2009-01-01

    Potato virus X (PVX) infection leads to certain cytopathological modifications of the host endomembrane system. The subcellular location of the PVX replicase was previously unknown while the PVX TGBp3 protein was previously reported to reside in the ER. Using PVX infectious clones expressing the green fluorescent protein reporter, and antisera detecting the PVX replicase and host membrane markers, we examined the subcellular distribution of the PVX replicase in relation to the TGBp3. Confocal and electron microscopic observations revealed that the replicase localizes in membrane bound structures that derive from the ER. A subset of TGBp3 resides in the ER at the same location as the replicase. Sucrose gradient fractionation showed that the PVX replicase and TGBp3 proteins co-fractionate with ER marker proteins. This localization represents a region where both proteins may be synthesized and/or function. There is no evidence to indicate that either PVX protein moves into the Golgi apparatus. Cerulenin, a drug that inhibits de novo membrane synthesis, also inhibited PVX replication. These combined data indicate that PVX replication relies on ER-derived membrane recruitment and membrane proliferation.

  1. Analysis of the subcellular localization of the human histone methyltransferase SETDB1

    Energy Technology Data Exchange (ETDEWEB)

    Tachibana, Keisuke, E-mail: nya@phs.osaka-u.ac.jp [Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871 (Japan); Gotoh, Eiko; Kawamata, Natsuko [Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871 (Japan); Ishimoto, Kenji [Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871 (Japan); Laboratory for System Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Uchihara, Yoshie [Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871 (Japan); Iwanari, Hiroko [Department of Quantitative Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Sugiyama, Akira; Kawamura, Takeshi [Radioisotope Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo, Tokyo 113-0032 (Japan); Mochizuki, Yasuhiro [Department of Quantitative Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Tanaka, Toshiya [Laboratory for System Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Sakai, Juro [Division of Metabolic Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Hamakubo, Takao [Department of Quantitative Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); Kodama, Tatsuhiko [Laboratory for System Biology and Medicine, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904 (Japan); and others

    2015-10-02

    SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase that methylates lysine 9 on histone H3. Although it is important to know the localization of proteins to elucidate their physiological function, little is known of the subcellular localization of human SETDB1. In the present study, to investigate the subcellular localization of hSETDB1, we established a human cell line constitutively expressing enhanced green fluorescent protein fused to hSETDB1. We then generated a monoclonal antibody against the hSETDB1 protein. Expression of both exogenous and endogenous hSETDB1 was observed mainly in the cytoplasm of various human cell lines. Combined treatment with the nuclear export inhibitor leptomycin B and the proteasome inhibitor MG132 led to the accumulation of hSETDB1 in the nucleus. These findings suggest that hSETDB1, localized in the nucleus, might undergo degradation by the proteasome and be exported to the cytosol, resulting in its detection mainly in the cytosol. - Highlights: • Endogenous human SETDB1 was localized mainly in the cytoplasm. • Combined treatment with LMB and MG132 led to accumulation of human SETDB1 in the nucleus. • HeLa cells expressing EGFP-hSETDB1 are useful for subcellular localization analyses.

  2. Analysis of the subcellular localization of the human histone methyltransferase SETDB1

    International Nuclear Information System (INIS)

    Tachibana, Keisuke; Gotoh, Eiko; Kawamata, Natsuko; Ishimoto, Kenji; Uchihara, Yoshie; Iwanari, Hiroko; Sugiyama, Akira; Kawamura, Takeshi; Mochizuki, Yasuhiro; Tanaka, Toshiya; Sakai, Juro; Hamakubo, Takao; Kodama, Tatsuhiko

    2015-01-01

    SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase that methylates lysine 9 on histone H3. Although it is important to know the localization of proteins to elucidate their physiological function, little is known of the subcellular localization of human SETDB1. In the present study, to investigate the subcellular localization of hSETDB1, we established a human cell line constitutively expressing enhanced green fluorescent protein fused to hSETDB1. We then generated a monoclonal antibody against the hSETDB1 protein. Expression of both exogenous and endogenous hSETDB1 was observed mainly in the cytoplasm of various human cell lines. Combined treatment with the nuclear export inhibitor leptomycin B and the proteasome inhibitor MG132 led to the accumulation of hSETDB1 in the nucleus. These findings suggest that hSETDB1, localized in the nucleus, might undergo degradation by the proteasome and be exported to the cytosol, resulting in its detection mainly in the cytosol. - Highlights: • Endogenous human SETDB1 was localized mainly in the cytoplasm. • Combined treatment with LMB and MG132 led to accumulation of human SETDB1 in the nucleus. • HeLa cells expressing EGFP-hSETDB1 are useful for subcellular localization analyses.

  3. Subcellular localization of ammonium transporters in Dictyostelium discoideum

    Directory of Open Access Journals (Sweden)

    Davis Carter T

    2008-12-01

    Background: With the exception of vertebrates, most organisms have plasma membrane associated ammonium transporters which primarily serve to import a source of nitrogen for nutritional purposes. Dictyostelium discoideum has three ammonium transporters, Amts A, B and C. Our present work used fluorescent fusion proteins to determine the cellular localization of the Amts and tested the hypothesis that the transporters mediate removal of ammonia generated endogenously from the elevated protein catabolism common to many protists. Results: Using RFP and YFP fusion constructs driven by the actin 15 promoter, we found that the three ammonium transporters were localized on the plasma membrane and on the membranes of subcellular organelles. AmtA and AmtB were localized on the membranes of endolysosomes and phagosomes, with AmtB further localized on the membranes of contractile vacuoles. AmtC also was localized on subcellular organelles when it was stabilized by coexpression with either the AmtA or AmtB fusion transporter. The three ammonium transporters exported ammonia linearly with regard to time during the first 18 hours of the developmental program as revealed by reduced export in the null strains. The fluorescently tagged transporters rescued export when expressed in the null strains, and thus they were functional transporters. Conclusion: Unlike ammonium transporters in most organisms, which import NH3/NH4+ as a nitrogen source, those of Dictyostelium export ammonia/ammonium as a waste product from extensive catabolism of exogenously derived and endogenous proteins. Localization on proteolytic organelles and on the neutral contractile vacuole suggests that Dictyostelium ammonium transporters may have unique subcellular functions and play a role in the maintenance of intracellular ammonium distribution. A lack of correlation between the null strain phenotypes and ammonia excretion properties of the ammonium transporters suggests that it is not

  4. Subcellular localization analysis of the closely related Fps/Fes and Fer protein-tyrosine kinases suggests a distinct role for Fps/Fes in vesicular trafficking.

    Science.gov (United States)

    Zirngibl, R; Schulze, D; Mirski, S E; Cole, S P; Greer, P A

    2001-05-15

    The subcellular localizations of the Fps/Fes and closely related Fer cytoplasmic tyrosine kinases were studied using green fluorescent protein (GFP) fusions and confocal fluorescence microscopy. In contrast to previous reports, neither kinase localized to the nucleus. Fer was diffusely cytoplasmic throughout the cell cycle. Fps/Fes also displayed a diffuse cytoplasmic localization, but in addition it showed distinct accumulations in cytoplasmic vesicles as well as in a perinuclear region consistent with the Golgi. This localization was very similar to that of TGN38, a known marker of the trans Golgi. The localization of Fps/Fes and TGN38 were both perturbed by brefeldin A, a fungal metabolite that disrupts the Golgi apparatus. Fps/Fes was also found to colocalize to various extents with several Rab proteins, which are members of the monomeric G-protein superfamily involved in vesicular transport between specific subcellular compartments. Using Rabs that are involved in endocytosis (Rab5B and Rab7) or exocytosis (Rab1A and Rab3A), we showed that Fps/Fes is localized in both pathways. These results suggest that Fps/Fes may play a general role in the regulation of vesicular trafficking. Copyright 2001 Academic Press.

  5. Cloud prediction of protein structure and function with PredictProtein for Debian.

    Science.gov (United States)

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  6. Autophagosome Proteins LC3A, LC3B and LC3C Have Distinct Subcellular Distribution Kinetics and Expression in Cancer Cell Lines.

    Directory of Open Access Journals (Sweden)

    Michael I Koukourakis

    LC3s (MAP1-LC3A, B and C) are structural proteins of autophagosomal membranes, widely used as biomarkers of autophagy. Whether these three LC3 proteins have a similar biological role in autophagy remains obscure. We examine in parallel the subcellular expression patterns of the three LC3 proteins in a panel of human cancer cell lines, as well as in normal MRC5 fibroblasts and HUVEC, using confocal microscopy and western blot analysis of cell fractions. In the cytoplasm, there was a minimal co-localization between LC3A, B and C staining, suggesting that the relevant autophagosomes are formed by only one out of the three LC3 proteins. LC3A showed a perinuclear and nuclear localization, while LC3B was equally distributed throughout the cytoplasm and localized in the nucleolar regions. LC3C was located in the cytoplasm and strongly in the nuclei (excluding nucleoli), where it extensively co-localized with LC3A and the Beclin-1 autophagy initiating protein. Beclin-1 is known to contain a nuclear trafficking signal. Blocking nuclear export function by Leptomycin B resulted in nuclear accumulation of all LC3 and Beclin-1 proteins, while Ivermectin, which blocks nuclear import, showed reduction of accumulation, but not in all cell lines. Since endogenous LC3 proteins are used as major markers of autophagy in clinical studies and cell lines, it is essential to check the specificity of the antibodies used, as the kinetics of these molecules are not identical and they may have distinct biological roles. The distinct subcellular expression patterns of LC3s provide a basis for further studies.

  7. Primary structure and subcellular localization of two fimbrial subunit-like proteins involved in the biosynthesis of K99 fibrillae.

    Science.gov (United States)

    Roosendaal, E; Jacobs, A A; Rathman, P; Sondermeyer, C; Stegehuis, F; Oudega, B; de Graaf, F K

    1987-09-01

    Analysis of the nucleotide sequence of the distal part of the fan gene cluster encoding the proteins involved in the biosynthesis of the fibrillar adhesin, K99, revealed the presence of two structural genes, fanG and fanH. The amino acid sequence of the gene products (FanG and FanH) showed significant homology to the amino acid sequence of the fibrillar subunit protein (FanC). Introduction of a site-specific frameshift mutation in fanG or fanH resulted in a simultaneous decrease in fibrillae production and adhesive capacity. Analysis of subcellular fractions showed that, in contrast to the K99 fibrillar subunit (FanC), both the FanH and the FanG protein were loosely associated with the outer membrane, possibly on the periplasmic side, but were not components of the fimbriae themselves.

  8. Subcellular Nanoparticle Distribution from Light Transmission Spectroscopy

    Science.gov (United States)

    Deatsch, Alison; Sun, Nan; Johnson, Jeffrey; Stack, Sharon; Tanner, Carol; Ruggiero, Steven

    We have measured the particle-size distribution (PSD) of subcellular structures in plant and animal cells. We have employed a new technique developed by our group, Light Transmission Spectroscopy, combined with cell fractionation, to accurately measure PSDs over a wide size range: from 10 nm to 3000 nm, which includes objects from the size of individual proteins to organelles. To date our experiments have included cultured human oral cells and spinach cells. These results show a power-law dependence of particle density on particle diameter, implying a universality of the packing distribution. We discuss modeling the cell as a self-similar (fractal) body comprised of spheres on all size scales. The goal of this work is to obtain a better understanding of the fundamental nature of particle packing within cells in order to enrich our knowledge of the structure, function, and interactions of sub-cellular nanostructures across cell types.
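    As a rough illustration of what a power-law dependence of particle density on diameter means in practice, the sketch below fits n(d) = A * d^b by least squares in log-log space. The function name, the synthetic values, and the exponent of -3 are all assumptions chosen for the example; none of them comes from the record above.

```python
import math

def fit_power_law(diameters, densities):
    """Least-squares fit of density = A * diameter**b in log-log space.

    Returns (A, b). Inputs must be positive and of equal length.
    """
    xs = [math.log(d) for d in diameters]
    ys = [math.log(n) for n in densities]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # exponent b
    intercept = mean_y - slope * mean_x    # log(A)
    return math.exp(intercept), slope

# Synthetic, hypothetical data spanning 10 nm to 3000 nm; NOT measured values.
demo_diameters_nm = [10, 30, 100, 300, 1000, 3000]
demo_densities = [1e6 * d ** -3 for d in demo_diameters_nm]

prefactor, exponent = fit_power_law(demo_diameters_nm, demo_densities)
print(f"fitted exponent: {exponent:.3f}")
```

    On real measurements the points would scatter around the line, and a straight line on a log-log plot over a limited range is only suggestive of a power law, not proof of one.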

  9. CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources

    Directory of Open Access Journals (Sweden)

    Lucchetti-Miganeh Céline

    2010-03-01

    Background: The functions of proteins are strongly related to their localization in cell compartments (for example the cytoplasm or membranes), but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative approach is in silico prediction, based on features of the protein primary sequences. However, biologists are confronted with a very large number of computational tools that use different methods that address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output; this is a painstaking task. Therefore, we developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes. Description: The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags and may be browsed using numerous graphical and text displays. Conclusions: With its new functionalities, CoBaltDB is a novel powerful platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten.

  10. Subcellular Redox Targeting: Bridging in Vitro and in Vivo Chemical Biology.

    Science.gov (United States)

    Long, Marcus J C; Poganik, Jesse R; Ghosh, Souradyuti; Aye, Yimon

    2017-03-17

    Networks of redox sensor proteins within discrete microdomains regulate the flow of redox signaling. Yet, the inherent reactivity of redox signals complicates the study of specific redox events and pathways by traditional methods. Herein, we review designer chemistries capable of measuring flux and/or mimicking subcellular redox signaling at the cellular and organismal level. Such efforts have begun to decipher the logic underlying organelle-, site-, and target-specific redox signaling in vitro and in vivo. These data highlight chemical biology as a perfect gateway to interrogate how nature choreographs subcellular redox chemistry to drive precision redox biology.

  11. Automated Learning of Subcellular Variation among Punctate Protein Patterns and a Generative Model of Their Relation to Microtubules.

    Directory of Open Access Journals (Sweden)

    Gregory R Johnson

    2015-12-01

    Characterizing the spatial distribution of proteins directly from microscopy images is a difficult problem with numerous applications in cell biology (e.g. identifying motor-related proteins) and clinical research (e.g. identification of cancer biomarkers). Here we describe the design of a system that provides automated analysis of punctate protein patterns in microscope images, including quantification of their relationships to microtubules. We constructed the system using confocal immunofluorescence microscopy images from the Human Protein Atlas project for 11 punctate proteins in three cultured cell lines. These proteins have previously been characterized as being primarily located in punctate structures, but their images had all been annotated by visual examination as being simply "vesicular". We were able to show that these patterns could be distinguished from each other with high accuracy, and we were able to assign to one of these subclasses hundreds of proteins whose subcellular localization had not previously been well defined. In addition to providing these novel annotations, we built a generative approach to modeling of punctate distributions that captures the essential characteristics of the distinct patterns. Such models are expected to be valuable for representing and summarizing each pattern and for constructing systems biology simulations of cell behaviors.

  12. Prediction of Protein-Protein Interactions Related to Protein Complexes Based on Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Peng Liu

    2015-01-01

    A method for predicting protein-protein interactions based on detected protein complexes is proposed to repair deficient interactions derived from high-throughput biological experiments. Protein complexes are pruned and decomposed into small parts based on the adaptive k-cores method to predict protein-protein interactions associated with the complexes. The proposed method is adaptive to protein complexes with different structure, number, and size of nodes in a protein-protein interaction network. Based on different complex sets detected by various algorithms, we can obtain different prediction sets of protein-protein interactions. The reliability of the predicted interaction sets is proved by using estimations with statistical tests and direct confirmation of the biological data. In comparison with the approaches which predict the interactions based on the cliques, the overlap of the predictions is small. Similarly, the overlaps among the predicted sets of interactions derived from various complex sets are also small. Thus, every predicted set of interactions may complement and improve the quality of the original network data. Meanwhile, the predictions from the proposed method replenish protein-protein interactions associated with protein complexes using only the network topology.
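    The abstract above does not spell out the adaptive k-cores procedure, so the following is only a minimal sketch of the plain k-core idea it builds on: repeatedly prune nodes with fewer than k neighbours, then list the node pairs inside the remaining dense core that have no recorded edge as candidate missing interactions. The function names, the fixed k, and the toy complex are assumptions made for illustration and are not the authors' method.

```python
from collections import defaultdict

def k_core(edges, k):
    """Return the nodes of the k-core: the maximal subgraph in which
    every node has at least k neighbours inside the subgraph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)
    alive = set(adjacency)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(adjacency[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive

def candidate_interactions(edges, k):
    """List unconnected node pairs inside the k-core, sorted by name."""
    core = sorted(k_core(edges, k))
    known = {frozenset(edge) for edge in edges}
    return [
        (a, b)
        for i, a in enumerate(core)
        for b in core[i + 1:]
        if frozenset((a, b)) not in known
    ]

# Hypothetical toy complex: A-D are densely linked, E hangs off A.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "D"), ("E", "A")]

print(sorted(k_core(toy_edges, 2)))
print(candidate_interactions(toy_edges, 2))
```

    In this toy graph, pruning with k = 2 drops the pendant node E and leaves C-D as the only unconnected pair in the core. An adaptive scheme would choose k per complex rather than fixing it globally.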

  13. Information assessment on predicting protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Gerstein Mark

    2004-10-01

    Background: Identifying protein-protein interactions is fundamental for understanding the molecular machinery of the cell. Proteome-wide studies of protein-protein interactions are of significant value, but the high-throughput experimental technologies suffer from high rates of both false positive and false negative predictions. In addition to high-throughput experimental data, many diverse types of genomic data can help predict protein-protein interactions, such as mRNA expression, localization, essentiality, and functional annotation. Evaluations of the information contributions from different evidences help to establish more parsimonious models with comparable or better prediction accuracy, and to obtain biological insights into the relationships between protein-protein interactions and other genomic information. Results: Our assessment is based on the genomic features used in a Bayesian network approach to predict protein-protein interactions genome-wide in yeast. In the special case, when one does not have any missing information about any of the features, our analysis shows that there is a larger information contribution from the functional classification than from expression correlations or essentiality. We also show that in this case alternative models, such as logistic regression and random forest, may be more effective than Bayesian networks for predicting interactions. Conclusions: In the restricted problem posed by the complete-information subset, we identified the MIPS and Gene Ontology (GO) functional similarity datasets as the dominant information contributors for predicting the protein-protein interactions under the framework proposed by Jansen et al. Random forests based on the MIPS and GO information alone can give highly accurate classifications. In this particular subset of complete information, adding other genomic data does little to improve predictions. We also found that the data discretizations used in the

  14. Sub-cellular localisation studies may spuriously detect the Yes-associated protein, YAP, in nucleoli leading to potentially invalid conclusions of its function.

    Science.gov (United States)

    Finch, Megan L; Passman, Adam M; Strauss, Robyn P; Yeoh, George C; Callus, Bernard A

    2015-01-01

    The Yes-associated protein (YAP) is a potent transcriptional co-activator that functions as a nuclear effector of the Hippo signaling pathway. YAP is oncogenic and its activity is linked to its cellular abundance and nuclear localisation. Activation of the Hippo pathway restricts YAP nuclear entry via its phosphorylation by Lats kinases and consequent cytoplasmic retention bound to 14-3-3 proteins. We examined YAP expression in liver progenitor cells (LPCs) and surprisingly found that transformed LPCs did not show an increase in YAP abundance compared to the non-transformed LPCs from which they were derived. We then sought to ascertain whether nuclear YAP was more abundant in transformed LPCs. We used an antibody that we confirmed was specific for YAP by immunoblotting to determine YAP's sub-cellular localisation by immunofluorescence. This antibody showed diffuse staining for YAP within the cytosol and nuclei, but, noticeably, it showed intense staining of the nucleoli of LPCs. This staining was non-specific, as shRNA treatment of cells abolished YAP expression to undetectable levels by Western blot yet the nucleolar staining remained. Similar spurious YAP nucleolar staining was also seen in mouse embryonic fibroblasts and mouse liver tissue, indicating that this antibody is unsuitable for immunological applications to determine YAP sub-cellular localisation in mouse cells or tissues. Interestingly nucleolar staining was not evident in D645 cells suggesting the antibody may be suitable for use in human cells. Given the large body of published work on YAP in recent years, many of which utilise this antibody, this study raises concerns regarding its use for determining sub-cellular localisation. From a broader perspective, it serves as a timely reminder of the need to perform appropriate controls to ensure the validity of published data.

  15. Protein Structure Prediction by Protein Threading

    Science.gov (United States)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  16. Subcellular proteomic characterization of the high-temperature stress response of the cyanobacterium Spirulina platensis

    Directory of Open Access Journals (Sweden)

    Cheevadhanarak Supapon

    2009-09-01

    The present study examined the changes in protein expression in Spirulina platensis upon exposure to high temperature, with the changes in expression analyzed at the subcellular level. In addition, the transcriptional expression level of some differentially expressed proteins, the expression pattern clustering, and the protein-protein interaction network were analyzed. The results obtained from differential expression analysis revealed up-regulation of proteins involved in two-component response systems, DNA damage and repair systems, molecular chaperones, known stress-related proteins, and proteins involved in other biological processes, such as capsule formation and unsaturated fatty acid biosynthesis. The clustering of all differentially expressed proteins in the three cellular compartments showed: (i) the majority of the proteins in all fractions were sustained tolerance proteins, suggesting the roles of these proteins in the tolerance to high temperature stress, and (ii) the level of resistance proteins in the photosynthetic membrane was 2-fold higher than the level in the two other fractions, correlating with the rapid inactivation of the photosynthetic system in response to high temperature. Subcellular communication among the three cellular compartments via protein-protein interactions was clearly shown by the PPI network analysis. Furthermore, this analysis also showed a connection between temperature stress and nitrogen and ammonia assimilation.

  17. Dynamic changes to survivin subcellular localization are initiated by DNA damage

    Directory of Open Access Journals (Sweden)

    Maritess Gay Asumen

    2010-07-01

    Maritess Gay Asumen [1], Tochukwu V Ifeacho [2], Luke Cockerham [3], Christina Pfandl [4], Nathan R Wall [3]. [1] Touro University’s College of Osteopathic Medicine, Vallejo, CA, USA; [2] University of Southern California, Los Angeles, CA, USA; [3] Center for Health Disparities Research and Molecular Medicine, Loma Linda University, CA, USA; [4] Green Mountain Antibodies, Burlington, VT, USA. Abstract: Subcellular distribution of the apoptosis inhibitor survivin and its ability to relocalize as a result of cell cycle phase or therapeutic insult has led to the hypothesis that these subcellular pools may coincide with different survivin functions. The PIK kinases (ATM, ATR and DNA-PK) phosphorylate a variety of effector substrates that propagate DNA damage signals, resulting in various biological outputs. Here we demonstrate that subcellular repartitioning of survivin in MCF-7 cells as a result of UV light-mediated DNA damage is dependent upon DNA damage-sensing proteins, as treatment with the pan PIK kinase inhibitor wortmannin repartitioned survivin in the mitochondria and diminished it from the cytosol and nucleus. Mitochondrial redistribution of survivin, such as was recorded after wortmannin treatment, occurred in cells lacking any one of the three DNA damage sensing protein kinases: DNA-PK, ATM or ATR. However, failed survivin redistribution from the mitochondria in response to low-dose UV occurred only in the cells lacking ATM, implying that ATM may be the primary kinase involved in this process. Taken together, these data indicate that survivin’s subcellular distribution is a dynamic physiological process that appears responsive to UV light-initiated DNA damage and that its distribution may be responsible for its multifunctionality. Keywords: survivin, PIK kinases, ATM, ATR, DNA-PK

  18. Subcellular partitioning of metals in Aporrectodea caliginosa along a gradient of metal exposure in 31 field-contaminated soils

    Energy Technology Data Exchange (ETDEWEB)

    Beaumelle, Léa [INRA, UR 251 PESSAC, 78026 Versailles Cedex (France); Gimbert, Frédéric [Laboratoire Chrono-Environnement, UMR 6249 University of Franche-Comté/CNRS Usc INRA, 16 route de Gray, 25030 Besançon Cedex (France); Hedde, Mickaël [INRA, UR 251 PESSAC, 78026 Versailles Cedex (France); Guérin, Annie [INRA, US 0010 LAS Laboratoire d' analyses des sols, 273 rue de Cambrai, 62000 Arras (France); Lamy, Isabelle, E-mail: lamy@versailles.inra.fr [INRA, UR 251 PESSAC, 78026 Versailles Cedex (France)

    2015-07-01

    Subcellular fractionation of metals in organisms was proposed as a better way to characterize metal bioaccumulation. Here we report the impact of a laboratory exposure to a wide range of field-metal contaminated soils on the subcellular partitioning of metals in the earthworm Aporrectodea caliginosa. Moderately contaminated soils were chosen to create a gradient of soil metal availability, covering ranges of both soil metal contents and of several soil parameters. Following exposure, Cd, Pb and Zn concentrations were determined both in total earthworm body and in three subcellular compartments: cytosolic, granular and debris fractions. Three distinct proxies of soil metal availability were investigated: CaCl2-extractable content, dissolved content predicted by a semi-mechanistic model, and free ion concentration predicted by a geochemical speciation model. Subcellular partitionings of Cd and Pb were modified along the gradient of metal exposure, while stable Zn partitioning reflected regulation processes. Cd subcellular distribution responded more strongly to increasing soil Cd concentration than the total internal content, whereas Pb subcellular distribution and total internal content were similarly affected. Free ion concentrations were better descriptors of Cd and Pb subcellular distribution than CaCl2-extractable and dissolved metal concentrations. However, free ion concentrations and soil total metal contents were equivalent descriptors of the subcellular partitioning of Cd and Pb because they were highly correlated. Considering soils with low contamination, our results raise the question of the added value of three proxies of metal availability compared to soil total metal content in the assessment of metal bioavailability to earthworms. - Highlights: • Earthworms were exposed to a wide panel of historically contaminated soils. • Subcellular partitioning of Cd, Pb and Zn was investigated in earthworms. • Three proxies of soil metal availability were

  19. Accumulation, subcellular distribution and toxicity of inorganic mercury and methylmercury in marine phytoplankton

    Energy Technology Data Exchange (ETDEWEB)

    Wu Yun [Division of Life Science, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Kowloon (Hong Kong); Wang Wenxiong, E-mail: wwang@ust.hk [Division of Life Science, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Kowloon (Hong Kong)

    2011-10-15

    We examined the accumulation, subcellular distribution, and toxicity of Hg(II) and MeHg in three marine phytoplankton (the diatom Thalassiosira pseudonana, the green alga Chlorella autotrophica, and the flagellate Isochrysis galbana). For MeHg, the inter-species toxic difference could be best interpreted by the total cellular or intracellular accumulation. For Hg(II), both I. galbana and T. pseudonana exhibited similar sensitivity, but they each accumulated a different level of Hg(II). A higher percentage of Hg(II) was bound to the cellular debris fraction in T. pseudonana than in I. galbana, implying that the cellular debris may play an important role in Hg(II) detoxification. Furthermore, heat-stable proteins were a major binding pool for MeHg, while the cellular debris was an important binding pool for Hg(II). Elucidating the different subcellular fates of Hg(II) and MeHg may help us understand their toxicity in marine phytoplankton at the bottom of aquatic food chains. - Highlights: > The inter-species toxic difference of methylmercury in marine phytoplankton can be explained by its total cellular or intracellular accumulation. > The inter-species toxic difference of inorganic mercury in marine phytoplankton can be explained by its subcellular distribution. > Heat-stable protein was a major binding pool for MeHg, while the cellular debris was an important binding pool for Hg(II). - The inter-species difference in methylmercury and inorganic mercury toxicity in phytoplankton can be explained by cellular accumulation and subcellular distribution.

  20. Accumulation, subcellular distribution and toxicity of inorganic mercury and methylmercury in marine phytoplankton

    International Nuclear Information System (INIS)

    Wu Yun; Wang Wenxiong

    2011-01-01

    We examined the accumulation, subcellular distribution, and toxicity of Hg(II) and MeHg in three marine phytoplankton (the diatom Thalassiosira pseudonana, the green alga Chlorella autotrophica, and the flagellate Isochrysis galbana). For MeHg, the inter-species toxic difference could be best interpreted by the total cellular or intracellular accumulation. For Hg(II), both I. galbana and T. pseudonana exhibited similar sensitivity, but they each accumulated a different level of Hg(II). A higher percentage of Hg(II) was bound to the cellular debris fraction in T. pseudonana than in I. galbana, implying that the cellular debris may play an important role in Hg(II) detoxification. Furthermore, heat-stable proteins were a major binding pool for MeHg, while the cellular debris was an important binding pool for Hg(II). Elucidating the different subcellular fates of Hg(II) and MeHg may help us understand their toxicity in marine phytoplankton at the bottom of aquatic food chains. - Highlights: → The inter-species toxic difference of methylmercury in marine phytoplankton can be explained by its total cellular or intracellular accumulation. → The inter-species toxic difference of inorganic mercury in marine phytoplankton can be explained by its subcellular distribution. → Heat-stable protein was a major binding pool for MeHg, while the cellular debris was an important binding pool for Hg(II). - The inter-species difference in methylmercury and inorganic mercury toxicity in phytoplankton can be explained by cellular accumulation and subcellular distribution.

  1. Subcellular localization of acyl-CoA binding protein in Aspergillus oryzae is regulated by autophagy machinery.

    Science.gov (United States)

    Kawaguchi, Kouhei; Kikuma, Takashi; Higuchi, Yujiro; Takegawa, Kaoru; Kitamoto, Katsuhiko

    2016-11-04

    In eukaryotic cells, acyl-CoA binding protein (ACBP) is important for cellular activities, such as lipid metabolism. In the industrially important fungus Aspergillus oryzae, the ACBP, known as AoACBP, has been biochemically characterized, but its physiological function is not known. In the present study, although we could not find any phenotype of AoACBP disruptants under normal growth conditions, we examined the subcellular localization of AoACBP to understand its physiological function. Using an enhanced green fluorescent protein (EGFP)-tagged AoACBP construct, we showed that AoACBP localized to punctate structures in the cytoplasm, some of which moved inside the cells in a microtubule-dependent manner. Further microscopic analyses showed that AoACBP-EGFP co-localized with the autophagy marker protein AoAtg8 tagged with red fluorescent protein (mDsRed). Expression of AoACBP-EGFP in disruptants of autophagy-related genes revealed aggregation of AoACBP-EGFP fluorescence in the cytoplasm of Aoatg1, Aoatg4 and Aoatg8 disruptant cells. However, in cells harboring disruption of Aoatg15, which encodes a lipase for autophagic bodies, puncta of AoACBP-EGFP fluorescence accumulated in vacuoles, indicating that AoACBP is transported to vacuoles via the autophagy machinery. Collectively, these results suggest the existence of a regulatory mechanism between AoACBP localization and autophagy.

  2. Selenium assimilation and loss by an insect predator and its relationship to Se subcellular partitioning in two prey types

    Energy Technology Data Exchange (ETDEWEB)

    Dubois, Maitee [Institut national de la recherche scientifique - Eau, Terre et Environnement, Universite du Quebec, Quebec City, Quebec, G1K 9A9 (Canada)]; Hare, Landis [Institut national de la recherche scientifique - Eau, Terre et Environnement, Universite du Quebec, Quebec City, Quebec, G1K 9A9 (Canada)], E-mail: landis@ete.inrs.ca

    2009-03-15

    Subcellular selenium (Se) distributions in the oligochaete Tubifex tubifex and in the insect Chironomus riparius did not vary with Se exposure duration, which was consistent with the observations that the duration of prey Se exposure had little influence on either Se assimilation or loss by a predatory insect (the alderfly Sialis velata). However, these two prey types differed in how Se was distributed in their cells. Overall, the predator assimilated a mean of 66% of the Se present in its prey, which was similar to the mean percentage of Se in prey cells (62%) that was theoretically available for uptake (that is, Se in the protein and organelle fractions). Likewise, data for cadmium, nickel and thallium suggest that predictions of trace element transfer between prey and predator are facilitated by considering the subcellular partitioning of these contaminants in prey cells. - Selenium assimilation by a predatory aquatic insect depends on Se availability in the cells of its prey.

  3. Selenium assimilation and loss by an insect predator and its relationship to Se subcellular partitioning in two prey types

    International Nuclear Information System (INIS)

    Dubois, Maitee; Hare, Landis

    2009-01-01

    Subcellular selenium (Se) distributions in the oligochaete Tubifex tubifex and in the insect Chironomus riparius did not vary with Se exposure duration, which was consistent with the observations that the duration of prey Se exposure had little influence on either Se assimilation or loss by a predatory insect (the alderfly Sialis velata). However, these two prey types differed in how Se was distributed in their cells. Overall, the predator assimilated a mean of 66% of the Se present in its prey, which was similar to the mean percentage of Se in prey cells (62%) that was theoretically available for uptake (that is, Se in the protein and organelle fractions). Likewise, data for cadmium, nickel and thallium suggest that predictions of trace element transfer between prey and predator are facilitated by considering the subcellular partitioning of these contaminants in prey cells. - Selenium assimilation by a predatory aquatic insect depends on Se availability in the cells of its prey

  4. Subcellular distribution of curium in beagle liver

    International Nuclear Information System (INIS)

    Bruenger, F.W.; Grube, B.J.; Atherton, D.R.; Taylor, G.N.; Stevens, W.

    1976-01-01

    The subcellular distribution of curium (243,244Cm) was studied in canine liver from 2 hr to 47 days after injection of 3 μCi 243,244Cm/kg of body weight. The pattern of distribution for Cm was similar to other trivalent actinide elements studied previously (Am, Cf). Initially (2 hr), most of the nuclide was found in the cytosol and at least 90 percent was protein bound. About 70 percent of the Cm was bound to ferritin, approximately 5 percent was associated with a protein of MW approximately 200,000, and approximately 25 percent was found in the low-molecular-weight region (approximately 5000). The decrease in the Cm content of cytosol, nuclei, and microsomes coincided with an increase in the amount associated with mitochondria and lysosomes. The concentration of the Cm in the mitochondrial fraction was higher than it was in the lysosomal fraction at each time studied. In the mitochondrial fraction approximately 30 percent of the Cm was bound to membranous or granular material, and 70 percent was found in the soluble fraction. The Cm concentration initially associated with cell nuclei was high but had diminished to 20 percent of the 2 hr concentration by 20 days post injection (PI). The subcellular distribution of Cm in the liver of a dog which had received the same dose and was terminated because of severe liver damage was studied at 384 days PI. The liver weighed 130 g and contained approximately 30 percent of the injected Cm. In contrast, a normal liver weighs 280 g and at 2 hr PI contains approximately 40 percent of the injected dose. The subcellular distribution of Cm in this severely damaged liver differed from the pattern observed at earlier times after injection. The relative concentration of Cm in the cytosol was doubled; it was higher in the nuclei-debris fraction; and it was lower in the mitochondrial and lysosomal fractions when compared to earlier times

  5. Subcellular interactions of dietary cadmium, copper and zinc in rainbow trout (Oncorhynchus mykiss)

    International Nuclear Information System (INIS)

    Kamunde, Collins; MacPhail, Ruth

    2011-01-01

    Highlights: Interactions of Cu, Cd and Zn were studied at the subcellular level in rainbow trout. Metals accumulated in the liver were predominantly metabolically active. Cd, Cu and Zn exhibited both competitive and cooperative interactions. The metal–metal interactions altered subcellular metals partitioning. - Abstract: Interactions of Cu, Cd and Zn were studied at the subcellular level in juvenile rainbow trout (Oncorhynchus mykiss) fed diets containing (μg/g) 500 Cu, 1000 Zn and 500 Cd singly and as a ternary mixture for 28 days. Livers were harvested and submitted to differential centrifugation to isolate components of metabolically active metal pool (MAP: heat-denaturable proteins (HDP), organelles, nuclei) and metabolically detoxified metal pool (MDP: heat stable proteins (HSP), NaOH-resistant granules). Results indicated that Cd accumulation was enhanced in all the subcellular compartments, albeit at different time points, in fish exposed to the metals mixture relative to those exposed to Cd alone, whereas Cu alone exposure increased Cd partitioning. Exposure to the metals mixture reduced (HDP) and enhanced (HSP, nuclei and granules) Cu accumulation while exposure to Zn alone enhanced Cu concentration in all the fractions analyzed without altering proportional distribution in MAP and MDP. Although subcellular Zn accumulation was less pronounced than that of either Cu or Cd, concentrations of Zn were enhanced in HDP, nuclei and granules from fish exposed to the metals mixture relative to those exposed to Zn alone. Cadmium alone exposure mobilized Zn and Cu from the nuclei and increased Zn accumulation in organelles and Cu in granules, while Cu alone exposure stimulated Zn accumulation in HSP, HDP and organelles. Interestingly, Cd alone exposure increased the partitioning of the three metals in MDP indicative of enhanced detoxification. Generally the accumulated metals were predominantly metabolically active: Cd, 67–83%; Cu, 68–79% and Zn, 60–76

  6. Subcellular interactions of dietary cadmium, copper and zinc in rainbow trout (Oncorhynchus mykiss)

    Energy Technology Data Exchange (ETDEWEB)

    Kamunde, Collins, E-mail: ckamunde@upei.ca [Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, 550 University Avenue, Charlottetown, PE, C1A 4P3 (Canada)]; MacPhail, Ruth [Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, 550 University Avenue, Charlottetown, PE, C1A 4P3 (Canada)]

    2011-10-15

    Highlights: Interactions of Cu, Cd and Zn were studied at the subcellular level in rainbow trout. Metals accumulated in the liver were predominantly metabolically active. Cd, Cu and Zn exhibited both competitive and cooperative interactions. The metal-metal interactions altered subcellular metals partitioning. - Abstract: Interactions of Cu, Cd and Zn were studied at the subcellular level in juvenile rainbow trout (Oncorhynchus mykiss) fed diets containing (μg/g) 500 Cu, 1000 Zn and 500 Cd singly and as a ternary mixture for 28 days. Livers were harvested and submitted to differential centrifugation to isolate components of metabolically active metal pool (MAP: heat-denaturable proteins (HDP), organelles, nuclei) and metabolically detoxified metal pool (MDP: heat stable proteins (HSP), NaOH-resistant granules). Results indicated that Cd accumulation was enhanced in all the subcellular compartments, albeit at different time points, in fish exposed to the metals mixture relative to those exposed to Cd alone, whereas Cu alone exposure increased Cd partitioning. Exposure to the metals mixture reduced (HDP) and enhanced (HSP, nuclei and granules) Cu accumulation while exposure to Zn alone enhanced Cu concentration in all the fractions analyzed without altering proportional distribution in MAP and MDP. Although subcellular Zn accumulation was less pronounced than that of either Cu or Cd, concentrations of Zn were enhanced in HDP, nuclei and granules from fish exposed to the metals mixture relative to those exposed to Zn alone. Cadmium alone exposure mobilized Zn and Cu from the nuclei and increased Zn accumulation in organelles and Cu in granules, while Cu alone exposure stimulated Zn accumulation in HSP, HDP and organelles. Interestingly, Cd alone exposure increased the partitioning of the three metals in MDP indicative of enhanced detoxification. Generally the accumulated metals were predominantly metabolically active: Cd, 67-83%; Cu, 68-79% and Zn, 60-76%. Taken
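
    The MAP/MDP split described in this abstract comes down to summing the metal held in each subcellular fraction and expressing each pool as a percentage of the total. The Python sketch below illustrates that arithmetic only. The function name, the dictionary keys, and the input numbers are hypothetical and are not data from the study.

```python
# Fractions assigned to each pool, following the grouping given in the abstract.
MAP_FRACTIONS = ("HDP", "organelles", "nuclei")  # metabolically active pool
MDP_FRACTIONS = ("HSP", "granules")              # metabolically detoxified pool


def pool_percentages(burden: dict[str, float]) -> tuple[float, float]:
    """Return (percent in MAP, percent in MDP) for one metal.

    `burden` maps each subcellular fraction to the amount of metal it holds,
    in any consistent unit. The layout is an illustrative assumption.
    """
    map_total = sum(burden[name] for name in MAP_FRACTIONS)
    mdp_total = sum(burden[name] for name in MDP_FRACTIONS)
    total = map_total + mdp_total
    if total <= 0:
        raise ValueError("total metal burden must be positive")
    return 100.0 * map_total / total, 100.0 * mdp_total / total


# Hypothetical burdens, chosen only to make the arithmetic easy to follow.
example = {"HDP": 30.0, "organelles": 25.0, "nuclei": 20.0, "HSP": 15.0, "granules": 10.0}
map_pct, mdp_pct = pool_percentages(example)
print(f"MAP: {map_pct:.1f}%  MDP: {mdp_pct:.1f}%")
```

    With real data, the same calculation would be repeated per metal and per treatment so that shifts between the two pools can be compared.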

  7. Cell segmentation in time-lapse fluorescence microscopy with temporally varying sub-cellular fusion protein patterns.

    Science.gov (United States)

    Bunyak, Filiz; Palaniappan, Kannappan; Chagin, Vadim; Cardoso, M

    2009-01-01

    Fluorescently tagged proteins such as GFP-PCNA produce rich dynamically varying textural patterns of foci distributed in the nucleus. This enables the behavioral study of sub-cellular structures during different phases of the cell cycle. The varying punctate patterns of fluorescence, drastic changes in SNR, shape and position during mitosis and abundance of touching cells, however, require more sophisticated algorithms for reliable automatic cell segmentation and lineage analysis. Since the cell nuclei are non-uniform in appearance, a distribution-based modeling of foreground classes is essential. The recently proposed graph partitioning active contours (GPAC) algorithm supports region descriptors and flexible distance metrics. We extend GPAC for fluorescence-based cell segmentation using regional density functions and dramatically improve its efficiency for segmentation from O(N^4) to O(N^2), for an image with N^2 pixels, making it practical and scalable for high throughput microscopy imaging studies.
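
    For an N-by-N image, the improvement reported in this abstract is a move from a cost that is quadratic in the number of pixels to one that is linear in it. The Python sketch below compares the nominal operation counts for the two growth rates. Constant factors are ignored, and the image side lengths are illustrative assumptions, not values from the paper.

```python
def nominal_operations(side: int) -> tuple[int, int]:
    """Return nominal operation counts (side**4, side**2) for a side x side image.

    Constant factors are ignored, so these are growth rates, not runtimes.
    """
    if side <= 0:
        raise ValueError("side must be positive")
    return side ** 4, side ** 2


def nominal_ratio(side: int) -> int:
    """Ratio between the two nominal counts; algebraically equal to side**2."""
    quartic, quadratic = nominal_operations(side)
    return quartic // quadratic


# Illustrative image sizes (assumed, not taken from the paper).
for side in (256, 512, 1024):
    quartic, quadratic = nominal_operations(side)
    print(f"{side}x{side}: N^4 = {quartic:.2e}, N^2 = {quadratic:.2e}, "
          f"ratio = {nominal_ratio(side):,}")
```

    The ratio grows with the pixel count itself, which is why the difference matters most for the large images typical of high-throughput microscopy.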

  8. Studies on proinsulin and proglucagon biosynthesis and conversion at the subcellular level: I. Fractionation procedure and characterization of the subcellular fractions

    Science.gov (United States)

    Noe, BD; Baste, CA; Bauer, GE

    1977-01-01

    Anglerfish islets were homogenized in 0.25 M sucrose and separated into seven separate subcellular fractions by differential and discontinuous density gradient centrifugation. The objective was to isolate microsomes and secretory granules in a highly purified state. The fractions were characterized by electron microscopy and chemical analyses. Each fraction was assayed for its content of protein, RNA, DNA, immunoreactive insulin (IRI), and immunoreactive glucagon (IRG). Ultrastructural examination showed that two of the seven subcellular fractions contain primarily mitochondria, and that two others consist almost exclusively of secretory granules. A fifth fraction contains rough and smooth microsomal vesicles. The remaining two fractions are the cell supernate and the nuclei and cell debris. The content of DNA and RNA in all fractions is consistent with the observed ultrastructure. More than 82 percent of the total cellular IRI and 89 percent of the total cellular IRG are found in the fractions of secretory granules. The combined fractions of secretory granules and microsomes consistently yield >93 percent of the total IRG. These results indicate that the fractionation procedure employed yields fractions of microsomes and secretory granules that contain nearly all the immunoassayable insulin and glucagon found in whole islet tissue. These fractions are thus considered suitable for study of proinsulin and proglucagon biosynthesis and their metabolic conversion at the subcellular level. PMID:328517

  9. An assessment on epitope prediction methods for protozoa genomes

    Directory of Open Access Journals (Sweden)

    Resende Daniela M

    2012-11-01

    of 0.77. For T CD8+ epitope predictors, the combined prediction of NetCTL and NetMHC reached an AUC value of 0.64. Finally, regarding the subcellular localization prediction, the best performance is achieved when the combined prediction of Sigcleave, TargetP and WoLF PSORT is used. Conclusions: Our study indicates that the combination of B-cell epitope predictors is the best tool for predicting epitopes on protozoan parasite proteins. Regarding subcellular localization, the best result was obtained when the predictions of the three algorithms were combined. The developed pipeline is available from the authors upon request.

  10. Quantitative protein localization signatures reveal an association between spatial and functional divergences of proteins.

    Science.gov (United States)

    Loo, Lit-Hsin; Laksameethanasan, Danai; Tung, Yi-Ling

    2014-03-01

    Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
    PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein

  11. Abnormal subcellular distribution of GLUT4 protein in obese and insulin-treated diabetic female dogs

    Directory of Open Access Journals (Sweden)

    A.M. Vargas

    2004-07-01

    The GLUT4 transporter plays a key role in insulin-induced glucose uptake, which is impaired in insulin resistance. The objective of the present study was to investigate the tissue content and the subcellular distribution of GLUT4 protein in 4- to 12-year-old control, obese and insulin-treated diabetic mongrel female dogs (4 animals per group). The parametrial white adipose tissue was sampled and processed to obtain both plasma membrane and microsome subcellular fractions for GLUT4 analysis by Western blotting. There was no significant difference in glycemia and insulinemia between control and obese animals. Diabetic dogs showed hyperglycemia (369.9 ± 89.9 mg/dl). Compared to control, the plasma membrane GLUT4, reported per g tissue, was reduced by 55% (P < 0.01) in obese dogs, and increased by 30% (P < 0.05) in diabetic dogs, and the microsomal GLUT4 was increased by ~45% (P < 0.001) in both obese and diabetic animals. Considering the sum of GLUT4 measured in plasma membrane and microsome as total cellular GLUT4, percent GLUT4 present in plasma membrane was reduced by ~65% (P < 0.001) in obese compared to control and diabetic animals. Since insulin stimulates GLUT4 translocation to the plasma membrane, percent GLUT4 in plasma membrane was divided by the insulinemia at the time of tissue removal and was found to be reduced by 75% (P < 0.01) in obese compared to control dogs. We conclude that the insulin-stimulated translocation of GLUT4 to the cell surface is reduced in obese female dogs. This probably contributes to insulin resistance, which plays an important role in glucose homeostasis in dogs.
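
    The two derived quantities in this abstract, percent GLUT4 in the plasma membrane and that percentage divided by insulinemia, can be written out as a short calculation. The Python sketch below shows the arithmetic as the abstract describes it. The function names, units, and input values are hypothetical and are not measurements from the study.

```python
def percent_in_plasma_membrane(pm_glut4: float, microsome_glut4: float) -> float:
    """Percent of total cellular GLUT4 found in the plasma membrane.

    Total cellular GLUT4 is taken as plasma membrane plus microsome content,
    as stated in the abstract. Inputs share one arbitrary unit.
    """
    total = pm_glut4 + microsome_glut4
    if total <= 0:
        raise ValueError("total GLUT4 must be positive")
    return 100.0 * pm_glut4 / total


def insulin_normalized_index(pm_glut4: float, microsome_glut4: float,
                             insulinemia: float) -> float:
    """Percent GLUT4 in plasma membrane divided by insulinemia at sampling."""
    if insulinemia <= 0:
        raise ValueError("insulinemia must be positive")
    return percent_in_plasma_membrane(pm_glut4, microsome_glut4) / insulinemia


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(percent_in_plasma_membrane(30.0, 70.0))
print(insulin_normalized_index(30.0, 70.0, 10.0))
```

    Normalizing by insulinemia expresses how much GLUT4 reaches the membrane per unit of circulating insulin, which is the comparison the abstract draws between obese and control dogs.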

  12. SPiCE: A web-based tool for sequence-based protein classification and exploration

    NARCIS (Netherlands)

    Van den Berg, B.A.; Reinders, M.J.; Roubos, J.A.; De Ridder, D.

    2014-01-01

    Background Amino acid sequences and features extracted from such sequences have been used to predict many protein properties, such as subcellular localization or solubility, using classifier algorithms. Although software tools are available for both feature extraction and classifier construction,

  13. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.

  14. Prion subcellular fractionation reveals infectivity spectrum, with a high titre-low PrPres level disparity

    Directory of Open Access Journals (Sweden)

    Lewis Victoria

    2012-04-01

    Background: Prion disease transmission and pathogenesis are linked to misfolded, typically protease-resistant (PrPres) conformers of the normal cellular prion protein (PrPC), with the former posited to be the principal constituent of the infectious 'prion'. Unexplained discrepancies observed between detectable PrPres and infectivity levels exemplify the complexity in deciphering the exact biophysical nature of prions and those host cell factors, if any, which contribute to transmission efficiency. In order to improve our understanding of these important issues, this study utilized a bioassay-validated cell culture model of prion infection to investigate discordance between PrPres levels and infectivity titres at a subcellular resolution. Findings: Subcellular fractions enriched in lipid rafts or endoplasmic reticulum/mitochondrial marker proteins were equally highly efficient at prion transmission, despite lipid raft fractions containing up to eight times the levels of detectable PrPres. Brain homogenate infectivity was not differentially enhanced by subcellular fraction-specific co-factors, and proteinase K pre-treatment of selected fractions modestly, but equally, reduced infectivity. Only lipid raft associated infectivity was enhanced by sonication. Conclusions: This study authenticates a subcellular disparity in PrPres and infectivity levels, and eliminates simultaneous divergence of prion strains as the explanation for this phenomenon. On balance, the results align best with the concept that transmission efficiency is influenced more by intrinsic characteristics of the infectious prion, rather than cellular microenvironment conditions or absolute PrPres levels.

  15. Different protein-protein interface patterns predicted by different machine learning methods.

    Science.gov (United States)

    Wang, Wei; Yang, Yongxiao; Yin, Jianxin; Gong, Xinqi

    2017-11-22

    Different types of protein-protein interactions make different protein-protein interface patterns. Different machine learning methods are suitable to deal with different types of data. Is it likewise the case that different interface patterns are better predicted by different machine learning methods? Here, four different machine learning methods were employed to predict protein-protein interface residue pairs on different interface patterns. The performances of the methods for different types of proteins are different, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We made use of ANOVA and variable selection to support our result. Our proposed methods, which take advantage of different single methods, also gave a good prediction result compared to single methods. In addition to the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.

  16. Targeting a heterologous protein to multiple plant organelles via rationally designed 5′ mRNA tags

    NARCIS (Netherlands)

    Voges, M.J.; Silver, P.A.; Way, J.C.; Mattozzi, M.D.

    2013-01-01

    Background Plant bioengineers require simple genetic devices for predictable localization of heterologous proteins to multiple subcellular compartments. Results We designed novel hybrid signal sequences for multiple-compartment localization and characterize their function when fused to GFP in

  17. Capillary electrophoretic analysis reveals subcellular binding between individual mitochondria and cytoskeleton

    Science.gov (United States)

    Kostal, Vratislav; Arriaga, Edgar A.

    2011-01-01

    Interactions between the cytoskeleton and mitochondria are essential for normal cellular function. An assessment of such interactions is commonly based on bulk analysis of mitochondrial and cytoskeletal markers present in a given sample, which assumes complete binding between these two organelle types. Such measurements are biased because they rarely account for non-bound ‘free’ subcellular species. Here we report on the use of capillary electrophoresis with dual laser-induced fluorescence detection (CE-LIF) to identify, classify, count and quantify properties of individual binding events of mitochondria and cytoskeleton. Mitochondria were fluorescently labeled with DsRed2 while F-actin, a major cytoskeletal component, was fluorescently labeled with Alexa488-phalloidin. In a typical subcellular fraction of L6 myoblasts, 79% of mitochondrial events did not have detectable levels of F-actin, while the rest had on average ~2 zeptomole F-actin, which theoretically represents a ~2.5-μm-long network of actin filaments per event. Trypsin treatment of L6 subcellular fractions prior to analysis decreased the fraction of mitochondrial events with detectable levels of F-actin, which is expected from digestion of cytoskeletal proteins on the surface of mitochondria. The electrophoretic mobility distributions of the individual events were also used to further distinguish cytoskeleton-bound from cytoskeleton-free mitochondrial events. The CE-LIF approach described here could be further developed to explore cytoskeleton interactions with other subcellular structures, the effects of cytoskeleton destabilizing drugs, and the progression of viral infections. PMID:21309532

  18. Parasites modify sub-cellular partitioning of metals in the gut of fish

    Energy Technology Data Exchange (ETDEWEB)

    Oyoo-Okoth, Elijah, E-mail: elijaoyoo2009@gmail.com [Division of Environmental Health, School of Environmental Studies, Moi University, P.O. Box 3900, Eldoret (Kenya); Department of Aquatic Ecology and Ecotoxicology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 9424/1090 GE (Netherlands)]; Admiraal, Wim [Department of Aquatic Ecology and Ecotoxicology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 9424/1090 GE (Netherlands)]; Osano, Odipo [Division of Environmental Health, School of Environmental Studies, Moi University, P.O. Box 3900, Eldoret (Kenya)]; Kraak, Michiel H.S. [Department of Aquatic Ecology and Ecotoxicology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 9424/1090 GE (Netherlands)]; Gichuki, John; Ogwai, Caleb [Kenya Marine and Fisheries Research Institute, P.O. Box 1881, Kisumu (Kenya)]

    2012-01-15

    Infestation of fish by parasites may influence metal accumulation patterns in the host. However, the subcellular mechanisms of these processes have rarely been studied. Therefore, this study determined how a cyprinid fish (Rastrineobola argentea) partitioned four metals (Cd, Cr, Zn and Cu) in the subcellular fractions of the gut in the presence of an endoparasite (Ligula intestinalis). The fish were sampled at four sites in Lake Victoria, Kenya, differing in metal contamination. Accumulation of Cd, Cr and Zn was higher in the whole body and in the gut of parasitized fish compared to non-parasitized fish, while Cu was depleted in parasitized fish. Generally, for both non-parasitized and parasitized fish, Cd, Cr and Zn partitioned in the cytosolic fractions and Cu in the particulate fraction. Metal concentrations in organelles within the particulate fractions of the non-parasitized fish were statistically similar except for Cd in the lysosome, while in the parasitized fish, Cd, Cr and Zn were accumulated more by the lysosome and microsomes. In the cytosolic fractions, the non-parasitized fish accumulated Cd, Cr and Zn in the heat stable proteins (HSP), while in the parasitized fish the metals were accumulated in the heat denatured proteins (HDP). In contrast, Cu accumulated in the HSP in parasitized fish. The present study revealed specific binding of metals to potentially sensitive sub-cellular fractions in fish in the presence of parasites, suggesting interference with metal detoxification, and potentially affecting the health status of fish hosts in Lake Victoria.

  19. Prediction of Protein Configurational Entropy (Popcoen).

    Science.gov (United States)

    Goethe, Martin; Gleixner, Jan; Fita, Ignacio; Rubi, J Miguel

    2018-03-13

    A knowledge-based method for configurational entropy prediction of proteins is presented; this methodology is extremely fast, compared to previous approaches, because it does not involve any type of configurational sampling. Instead, the configurational entropy of a query fold is estimated by evaluating an artificial neural network, which was trained on molecular-dynamics simulations of ∼1000 proteins. The predicted entropy can be incorporated into a large class of protein software based on cost-function minimization/evaluation, in which configurational entropy is currently neglected for performance reasons. Software of this type is used for all major protein tasks such as structure prediction, protein design, NMR and X-ray refinement, docking, and mutation effect prediction. Integrating the predicted entropy can yield a significant accuracy increase, as we show by example for native-state identification with the prominent protein software FoldX. The method has been termed Popcoen for Prediction of Protein Configurational Entropy. An implementation is freely available at http://fmc.ub.edu/popcoen/ .

  20. Bioaccumulation and subcellular partitioning of zinc in rainbow trout (Oncorhynchus mykiss): Cross-talk between waterborne and dietary uptake

    International Nuclear Information System (INIS)

    Sappal, Ravinder; Burka, John; Dawson, Susan; Kamunde, Collins

    2009-01-01

    Zinc homeostasis was studied at the tissue and gill subcellular levels in rainbow trout (Oncorhynchus mykiss) following waterborne and dietary exposures, singly and in combination. Juvenile rainbow trout were exposed to 150 or 600 μg l⁻¹ waterborne Zn, 1500 or 4500 μg g⁻¹ dietary Zn, and a combination of 150 μg l⁻¹ waterborne and 1500 μg g⁻¹ dietary Zn for 40 days. Accumulation of Zn in tissues and gill subcellular fractions was measured. At the tissue level, the carcass acted as the main Zn depot containing 84-90% of whole body Zn burden whereas the gill held 4-6%. At the subcellular level, the majority of gill Zn was bioavailable with the estimated metabolically active pool being 81-90%. Interestingly, the nuclei-cellular debris fraction bound the highest amount (40%) of the gill Zn burden. There was low partitioning of Zn into the detoxified pool (10-19%) suggesting that sequestration and chelation are not major mechanisms of cellular Zn homeostasis in rainbow trout. Further, the subcellular partitioning of Zn did not conform to the spill-over model of metal toxicity because Zn binding was indiscriminate irrespective of exposure concentration and duration. The contribution of the branchial and gastrointestinal uptake pathways to Zn accumulation depended on the tissue. Specifically, in plasma, blood cells, and gill, uptake from water was dominant whereas both pathways appeared to contribute equally to Zn accumulation in the carcass. Subcellularly, additive uptake from the two pathways was observed in the heat-stable proteins (HSP) fraction. Toxicologically, Zn exposure caused minimal adverse effects manifested by a transitory inhibition of protein synthesis in gills in the waterborne exposure. Overall, subcellular fractionation appears to have value in the quest for a better understanding of Zn homeostasis and interactions between branchial and gastrointestinal uptake pathways.

  1. Bioaccumulation and subcellular partitioning of zinc in rainbow trout (Oncorhynchus mykiss): Cross-talk between waterborne and dietary uptake

    Energy Technology Data Exchange (ETDEWEB)

    Sappal, Ravinder; Burka, John; Dawson, Susan [Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE C1A 4P3 (Canada); Kamunde, Collins [Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE C1A 4P3 (Canada)], E-mail: ckamunde@upei.ca

    2009-03-09

    Zinc homeostasis was studied at the tissue and gill subcellular levels in rainbow trout (Oncorhynchus mykiss) following waterborne and dietary exposures, singly and in combination. Juvenile rainbow trout were exposed to 150 or 600 μg l⁻¹ waterborne Zn, 1500 or 4500 μg g⁻¹ dietary Zn, and a combination of 150 μg l⁻¹ waterborne and 1500 μg g⁻¹ dietary Zn for 40 days. Accumulation of Zn in tissues and gill subcellular fractions was measured. At the tissue level, the carcass acted as the main Zn depot containing 84-90% of whole body Zn burden whereas the gill held 4-6%. At the subcellular level, the majority of gill Zn was bioavailable with the estimated metabolically active pool being 81-90%. Interestingly, the nuclei-cellular debris fraction bound the highest amount (40%) of the gill Zn burden. There was low partitioning of Zn into the detoxified pool (10-19%) suggesting that sequestration and chelation are not major mechanisms of cellular Zn homeostasis in rainbow trout. Further, the subcellular partitioning of Zn did not conform to the spill-over model of metal toxicity because Zn binding was indiscriminate irrespective of exposure concentration and duration. The contribution of the branchial and gastrointestinal uptake pathways to Zn accumulation depended on the tissue. Specifically, in plasma, blood cells, and gill, uptake from water was dominant whereas both pathways appeared to contribute equally to Zn accumulation in the carcass. Subcellularly, additive uptake from the two pathways was observed in the heat-stable proteins (HSP) fraction. Toxicologically, Zn exposure caused minimal adverse effects manifested by a transitory inhibition of protein synthesis in gills in the waterborne exposure. Overall, subcellular fractionation appears to have value in the quest for a better understanding of Zn homeostasis and interactions between branchial and gastrointestinal uptake pathways.

  2. Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

    Science.gov (United States)

    Li, Zhan-Chao; Zhou, Xi-Bin; Dai, Zong; Zou, Xiao-Yong

    2009-07-01

    Prior knowledge of a protein's structural class can provide useful information about its overall structure, so quick and accurate determination of structural class by computational methods is very important in protein science. One key to such methods is an accurate representation of the protein sample. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combined continuous wavelet transform (CWT) with principal component analysis (PCA) was introduced for the prediction of protein structural classes. Firstly, a digital signal was obtained by mapping each amino acid according to various physicochemical properties. Secondly, CWT was used to extract a new feature vector based on the wavelet power spectrum (WPS), which contains more abundant sequence-order information in the frequency and time domains, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was formed to represent the primary sequence by coupling the AAC vector with a set of new WPS feature vectors in an orthogonal space obtained by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality was improved, and the current approach to protein representation may serve as a useful complementary vehicle in classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein types and protein secondary structure.
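    The feature-extraction steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the Kyte-Doolittle hydropathy scale as the single physicochemical property, a Ricker (Mexican hat) wavelet computed by direct convolution, and three arbitrary scales; the abstract names none of these, and the PCA step is left out to keep the sketch dependency-free. The function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (assumed property; the abstract does not name one).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def ricker(t, scale):
    """Unnormalized Ricker (Mexican hat) wavelet evaluated at offset t."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def wavelet_power(signal, scale):
    """Mean squared wavelet coefficient of the signal at one scale."""
    half = int(4 * scale)  # truncate the wavelet support at +/- 4 scales
    n = len(signal)
    total = 0.0
    for i in range(n):
        coeff = 0.0
        for k in range(-half, half + 1):
            j = i + k
            if 0 <= j < n:
                coeff += signal[j] * ricker(k, scale)
        coeff /= math.sqrt(scale)
        total += coeff * coeff
    return total / n


def feature_vector(seq, scales=(1, 2, 4)):
    """20 composition values followed by one wavelet-power value per scale."""
    signal = [HYDROPATHY[a] for a in seq]
    return amino_acid_composition(seq) + [wavelet_power(signal, s) for s in scales]


features = feature_vector("MKTAYIAKQR")  # toy sequence, for illustration only
print(len(features))
```

    In a fuller pipeline the wavelet-power components would presumably be decorrelated by PCA before being joined to the composition vector, as the abstract describes.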

  3. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  4. Protein function prediction using neighbor relativity in protein-protein interaction network.

    Science.gov (United States)

    Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir

    2013-04-01

    There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict function for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that use interaction networks. Copyright © 2012 Elsevier Ltd. All rights reserved.
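    The general idea of weighting each neighbor's annotations and keeping the top-scoring functions can be sketched as below. The weight used here (one plus the number of common neighbors) is only a stand-in assumption; the abstract does not give the NRC formula, which also involves distance and path counts. The function name and the toy network are hypothetical.

```python
from collections import defaultdict


def predict_functions(graph, annotations, protein, top_k=2):
    """Rank candidate functions for `protein` by summed neighbor weights.

    graph: dict mapping each protein to the set of its interaction partners.
    annotations: dict mapping annotated proteins to sets of function labels.
    """
    scores = defaultdict(float)
    for neighbor in graph[protein]:
        # Stand-in weight (assumption): neighbors sharing more partners count more.
        common = len(graph[protein] & graph[neighbor])
        weight = 1.0 + common
        for function in annotations.get(neighbor, ()):
            scores[function] += weight
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]


# Toy network: P1 is the un-annotated protein.
graph = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P1"},
}
annotations = {
    "P2": {"kinase"},
    "P3": {"kinase", "transport"},
    "P4": {"binding"},
}
print(predict_functions(graph, annotations, "P1"))  # ['kinase', 'transport']
```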

  5. Mutational analyses of the signals involved in the subcellular location of DSCR1

    Directory of Open Access Journals (Sweden)

    Henrique-Silva Flávio

    2002-09-01

    Background: Down syndrome is the most frequent genetic disorder in humans. Rare cases involving partial trisomy of chromosome 21 allowed a small chromosomal region common to all carriers, called Down Syndrome Critical Region (DSCR), to be determined. The DSCR1 gene was identified in this region and is expressed preferentially in the brain, heart and skeletal muscle. Recent studies have shown that DSCR1 belongs to a family of proteins that binds and inhibits calcineurin, a serine-threonine phosphatase. The work reported on herein consisted of a study of the subcellular location of DSCR1 and DSCR1-mutated forms by fusion with a green fluorescent protein, using various cell lines, including human. Results: The protein's location was preferentially nuclear, independently of the isoform, cell line and insertion in the GFP's N- or C-terminal. A segment in the C-terminal, which is important in the location of the protein, was identified by deletion. On the other hand, site-directed mutational analyses have indicated the involvement of some serine and threonine residues in this event. Conclusion: In this paper, we discuss the identification of amino acids which can be important for subcellular location of DSCR1. The involvement of residues that are prone to phosphorylation suggests that the location and function of DSCR1 may be regulated by kinases and/or phosphatases.

  6. Protein complex prediction based on k-connected subgraphs in protein interaction network

    Directory of Open Access Journals (Sweden)

    Habibi Mahnaz

    2010-09-01

    Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interaction data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on connectivity number on subgraphs. We evaluate CFA using several protein interaction networks on reference protein complexes in two benchmark data sets (MIPS and Aloy), containing 1142 and 61 known complexes respectively. We compare CFA to some existing protein complex prediction methods (CMC, MCL, PCP and RNSC) in terms of recall and precision. We show that CFA predicts more complexes correctly at a competitive level of precision. Conclusions: Many real complexes with different connectivity levels in a protein interaction network can be predicted based on connectivity number. Our CFA program and results are freely available from http://www.bioinf.cs.ipm.ir/softwares/cfa/CFA.rar.
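    The recall and precision comparison mentioned in this abstract can be illustrated with a small sketch. The matching criterion below (an overlap score of |A∩B|²/(|A|·|B|) with a cutoff of 0.2) is an assumption chosen for illustration; the abstract does not state how a predicted complex is matched to a reference complex. Function names and the toy data are hypothetical.

```python
def overlap(a, b):
    """Overlap score between two protein sets: |A & B|^2 / (|A| * |B|)."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def precision_recall(predicted, reference, threshold=0.2):
    """Precision: share of predicted complexes matching some reference complex.
    Recall: share of reference complexes matched by some predicted complex.
    """
    matched_predicted = sum(
        1 for p in predicted if any(overlap(p, r) >= threshold for r in reference)
    )
    matched_reference = sum(
        1 for r in reference if any(overlap(p, r) >= threshold for p in predicted)
    )
    precision = matched_predicted / len(predicted) if predicted else 0.0
    recall = matched_reference / len(reference) if reference else 0.0
    return precision, recall


# Toy data: one predicted complex matches a reference complex, one does not.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"E", "F"}]
print(precision_recall(predicted, reference))  # (0.5, 0.5)
```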

  7. Tau regulates the subcellular localization of calmodulin

    Energy Technology Data Exchange (ETDEWEB)

    Barreda, Elena Gomez de [Centro de Biologia Molecular 'Severo Ochoa', CSIC/UAM, Universidad Autonoma de Madrid, Cantoblanco, 28049 Madrid (Spain)]; Avila, Jesus, E-mail: javila@cbm.uam.es [Centro de Biologia Molecular 'Severo Ochoa', CSIC/UAM, Universidad Autonoma de Madrid, Cantoblanco, 28049 Madrid (Spain); CIBER de Enfermedades Neurodegenerativas, 28031 Madrid (Spain)]

    2011-05-13

    Highlights: → In this work we have tried to explain how a cytoplasmic protein could regulate a cell nuclear function. We have tested the role of a cytoplasmic protein (tau) in regulating the expression of calbindin gene. We found that calmodulin, a tau-binding protein with nuclear and cytoplasmic localization, increases its nuclear localization in the absence of tau. Since nuclear calmodulin regulates calbindin expression, a decrease in nuclear calmodulin, due to the presence of tau that retains it at the cytoplasm, results in a change in calbindin expression. -- Abstract: Lack of tau expression in neuronal cells results in a change in the expression of few genes. However, little is known about how tau regulates gene expression. Here we show that the presence of tau could alter the subcellular localization of calmodulin, a protein that could be located at the cytoplasm or in the nucleus. Nuclear calmodulin binds to co-transcription factors, regulating the expression of genes like calbindin. In this work, we have found that in neurons containing tau, a higher proportion of calmodulin is present in the cytoplasm compared with neurons lacking tau and that an increase in cytoplasmic calmodulin correlates with a higher expression of calbindin.

  8. Tau regulates the subcellular localization of calmodulin

    International Nuclear Information System (INIS)

    Barreda, Elena Gomez de; Avila, Jesus

    2011-01-01

    Highlights: → In this work we have tried to explain how a cytoplasmic protein could regulate a cell nuclear function. We have tested the role of a cytoplasmic protein (tau) in regulating the expression of calbindin gene. We found that calmodulin, a tau-binding protein with nuclear and cytoplasmic localization, increases its nuclear localization in the absence of tau. Since nuclear calmodulin regulates calbindin expression, a decrease in nuclear calmodulin, due to the presence of tau that retains it at the cytoplasm, results in a change in calbindin expression. -- Abstract: Lack of tau expression in neuronal cells results in a change in the expression of few genes. However, little is known about how tau regulates gene expression. Here we show that the presence of tau could alter the subcellular localization of calmodulin, a protein that could be located at the cytoplasm or in the nucleus. Nuclear calmodulin binds to co-transcription factors, regulating the expression of genes like calbindin. In this work, we have found that in neurons containing tau, a higher proportion of calmodulin is present in the cytoplasm compared with neurons lacking tau and that an increase in cytoplasmic calmodulin correlates with a higher expression of calbindin.

  9. Nucleolin modulates the subcellular localization of GDNF-inducible zinc finger protein 1 and its roles in transcription and cell proliferation

    International Nuclear Information System (INIS)

    Dambara, Atsushi; Morinaga, Takatoshi; Fukuda, Naoyuki; Yamakawa, Yoshinori; Kato, Takuya; Enomoto, Atsushi; Asai, Naoya; Murakumo, Yoshiki; Matsuo, Seiichi; Takahashi, Masahide

    2007-01-01

    GZF1 is a zinc finger protein induced by glial cell-line-derived neurotrophic factor (GDNF). It is a sequence-specific transcriptional repressor with a BTB/POZ (Broad complex, Tramtrack, Bric a brac/Poxvirus and zinc finger) domain and ten zinc finger motifs. In the present study, we used immunoprecipitation and mass spectrometry to identify nucleolin as a GZF1-binding protein. Deletion analysis revealed that zinc finger motifs 1-4 of GZF1 mediate its association with nucleolin. When zinc fingers 1-4 were deleted from GZF1 or nucleolin expression was knocked down by short interference RNA (siRNA), nuclear localization of GZF1 was impaired. These results suggest that nucleolin is involved in the proper subcellular distribution of GZF1. In addition, overexpression of nucleolin moderately inhibited the transcriptional repressive activity of GZF1 whereas knockdown of nucleolin expression by siRNA enhanced its activity. Thus, the repressive activity of GZF1 is modulated by the level at which nucleolin is expressed. Finally, we found that knockdown of GZF1 and nucleolin expression markedly impaired cell proliferation. These findings suggest that the physiological functions of GZF1 may be regulated by the protein's association with nucleolin

  10. Optogenetic Tools for Subcellular Applications in Neuroscience.

    Science.gov (United States)

    Rost, Benjamin R; Schneider-Warme, Franziska; Schmitz, Dietmar; Hegemann, Peter

    2017-11-01

    The ability to study cellular physiology using photosensitive, genetically encoded molecules has profoundly transformed neuroscience. The modern optogenetic toolbox includes fluorescent sensors to visualize signaling events in living cells and optogenetic actuators enabling manipulation of numerous cellular activities. Most optogenetic tools are not targeted to specific subcellular compartments but are localized with limited discrimination throughout the cell. Therefore, optogenetic activation often does not reflect context-dependent effects of highly localized intracellular signaling events. Subcellular targeting is required to achieve more specific optogenetic readouts and photomanipulation. Here we first provide a detailed overview of the available optogenetic tools with a focus on optogenetic actuators. Second, we review established strategies for targeting these tools to specific subcellular compartments. Finally, we discuss useful tools and targeting strategies that are currently missing from the optogenetics repertoire and provide suggestions for novel subcellular optogenetic applications. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Identification and subcellular localization of porcine deltacoronavirus accessory protein NS6

    International Nuclear Information System (INIS)

    Fang, Puxian; Fang, Liurong; Liu, Xiaorong; Hong, Yingying; Wang, Yongle; Dong, Nan; Ma, Panpan; Bi, Jing; Wang, Dang; Xiao, Shaobo

    2016-01-01

    Porcine deltacoronavirus (PDCoV) is an emerging swine enteric coronavirus. Accessory proteins are genus-specific for coronavirus, and two putative accessory proteins, NS6 and NS7, are predicted to be encoded by PDCoV; however, this remains to be confirmed experimentally. Here, we identified the leader-body junction sites of NS6 subgenomic RNA (sgRNA) and found that the actual transcription regulatory sequence (TRS) utilized by NS6 is non-canonical and is located upstream of the predicted TRS. Using the purified NS6 from an Escherichia coli expression system, we obtained two anti-NS6 monoclonal antibodies that could detect the predicted NS6 in cells infected with PDCoV or transfected with NS6-expressing plasmids. Further studies revealed that NS6 is always localized in the cytoplasm of PDCoV-infected cells, mainly co-localizing with the endoplasmic reticulum (ER) and ER-Golgi intermediate compartments, as well as partially with the Golgi apparatus. Together, our results identify the NS6 sgRNA and demonstrate its expression in PDCoV-infected cells. -- Highlights: •The leader-body fusion site of NS6 sgRNA is identified. •NS6 sgRNA uses a non-canonical transcription regulatory sequence (TRS). •NS6 can be expressed in PDCoV-infected cell. •NS6 predominantly localize to the ER complex and ER-Golgi intermediate compartment.

  12. Identification and subcellular localization of porcine deltacoronavirus accessory protein NS6

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Puxian; Fang, Liurong; Liu, Xiaorong; Hong, Yingying; Wang, Yongle; Dong, Nan; Ma, Panpan [State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070 (China); The Cooperative Innovation Center for Sustainable Pig Production, Wuhan 430070 (China); Bi, Jing [State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070 (China); Department of Immunology and Aetology, College of Basic Medicine, Hubei University of Chinese Medicine, Wuhan 430065 (China); Wang, Dang [State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070 (China); The Cooperative Innovation Center for Sustainable Pig Production, Wuhan 430070 (China); Xiao, Shaobo, E-mail: vet@mail.hzau.edu.cn [State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070 (China); The Cooperative Innovation Center for Sustainable Pig Production, Wuhan 430070 (China)

    2016-12-15

    Porcine deltacoronavirus (PDCoV) is an emerging swine enteric coronavirus. Accessory proteins are genus-specific for coronavirus, and two putative accessory proteins, NS6 and NS7, are predicted to be encoded by PDCoV; however, this remains to be confirmed experimentally. Here, we identified the leader-body junction sites of NS6 subgenomic RNA (sgRNA) and found that the actual transcription regulatory sequence (TRS) utilized by NS6 is non-canonical and is located upstream of the predicted TRS. Using the purified NS6 from an Escherichia coli expression system, we obtained two anti-NS6 monoclonal antibodies that could detect the predicted NS6 in cells infected with PDCoV or transfected with NS6-expressing plasmids. Further studies revealed that NS6 is always localized in the cytoplasm of PDCoV-infected cells, mainly co-localizing with the endoplasmic reticulum (ER) and ER-Golgi intermediate compartments, as well as partially with the Golgi apparatus. Together, our results identify the NS6 sgRNA and demonstrate its expression in PDCoV-infected cells. -- Highlights: •The leader-body fusion site of NS6 sgRNA is identified. •NS6 sgRNA uses a non-canonical transcription regulatory sequence (TRS). •NS6 can be expressed in PDCoV-infected cell. •NS6 predominantly localize to the ER complex and ER-Golgi intermediate compartment.

  13. Protein complex prediction based on k-connected subgraphs in protein interaction network

    OpenAIRE

    Habibi, Mahnaz; Eslahchi, Changiz; Wong, Limsoon

    2010-01-01

    Abstract Background Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results We propose a more appropriate protein complex prediction method, CFA, that is based on ...

  14. Loss of Subcellular Lipid Transport Due to ARV1 Deficiency Disrupts Organelle Homeostasis and Activates the Unfolded Protein Response

    Science.gov (United States)

    Shechtman, Caryn F.; Henneberry, Annette L.; Seimon, Tracie A.; Tinkelenberg, Arthur H.; Wilcox, Lisa J.; Lee, Eunjee; Fazlollahi, Mina; Munkacsi, Andrew B.; Bussemaker, Harmen J.; Tabas, Ira; Sturley, Stephen L.

    2011-01-01

    The ARV1-encoded protein mediates sterol transport from the endoplasmic reticulum (ER) to the plasma membrane. Yeast ARV1 mutants accumulate multiple lipids in the ER and are sensitive to pharmacological modulators of both sterol and sphingolipid metabolism. Using fluorescent and electron microscopy, we demonstrate sterol accumulation, subcellular membrane expansion, elevated lipid droplet formation, and vacuolar fragmentation in ARV1 mutants. Motif-based regression analysis of ARV1 deletion transcription profiles indicates activation of Hac1p, an integral component of the unfolded protein response (UPR). Accordingly, we show constitutive splicing of HAC1 transcripts, induction of a UPR reporter, and elevated expression of UPR targets in ARV1 mutants. IRE1, encoding the unfolded protein sensor in the ER lumen, exhibits a lethal genetic interaction with ARV1, indicating a viability requirement for the UPR in cells lacking ARV1. Surprisingly, ARV1 mutants expressing a variant of Ire1p defective in sensing unfolded proteins are viable. Moreover, these strains also exhibit constitutive HAC1 splicing that interacts with DTT-mediated perturbation of protein folding. These data suggest that a component of UPR induction in arv1Δ strains is distinct from protein misfolding. Decreased ARV1 expression in murine macrophages also results in UPR induction, particularly up-regulation of activating transcription factor-4, CHOP (C/EBP homologous protein), and apoptosis. Cholesterol loading or inhibition of cholesterol esterification further elevated CHOP expression in ARV1 knockdown cells. Thus, loss or down-regulation of ARV1 disturbs membrane and lipid homeostasis, resulting in a disruption of ER integrity, one consequence of which is induction of the UPR. PMID:21266578

  15. False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Directory of Open Access Journals (Sweden)

    Lin Yen-Han

    2007-07-01

    Background: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge numbers of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (by average of four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

  16. Cloning, characterization and sub-cellular localization of gamma subunit of T-complex protein-1 (chaperonin) from Leishmania donovani

    Energy Technology Data Exchange (ETDEWEB)

    Bhaskar,; Kumari, Neeti [Division of Biochemistry, CSIR-Central Drug Research Institute, Chattar Manzil Palace, PO Box 173, Lucknow (India); Goyal, Neena, E-mail: neenacdri@yahoo.com [Division of Biochemistry, CSIR-Central Drug Research Institute, Chattar Manzil Palace, PO Box 173, Lucknow (India)

    2012-12-07

    Highlights: ► The study presents cloning and characterization of TCP1γ gene from L. donovani. ► TCP1γ is a subunit of T-complex protein-1 (TCP1), a chaperonin class of protein. ► LdTCPγ exhibited differential expression in different stages of promastigotes. ► LdTCPγ co-localized with actin, a cytoskeleton protein. ► The data suggests that this gene may have a role in differentiation/biogenesis. ► First report on this chaperonin in Leishmania. -- Abstract: T-complex protein-1 (TCP1) complex, a chaperonin class of protein, ubiquitous in all genera of life, is involved in intracellular assembly and folding of various proteins. The gamma subunit of TCP1 complex (TCP1γ), plays a pivotal role in the folding and assembly of cytoskeleton protein(s) as an individual or complexed with other subunits. Here, we report for the first time cloning, characterization and expression of the TCP1γ of Leishmania donovani (LdTCP1γ), the causative agent of Indian Kala-azar. Primary sequence analysis of LdTCP1γ revealed the presence of all the characteristic features of TCP1γ. However, leishmanial TCP1γ represents a distinct kinetoplastid group, clustered in a separate branch of the phylogenic tree. LdTCP1γ exhibited differential expression in different stages of promastigotes. The non-dividing stationary phase promastigotes exhibited 2.5-fold less expression of LdTCP1γ as compared to rapidly dividing log phase parasites. The sub-cellular distribution of LdTCP1γ was studied in log phase promastigotes by employing indirect immunofluorescence microscopy. The protein was present not only in cytoplasm but it was also localized in nucleus, peri-nuclear region, flagella, flagellar pocket and apical region. Co-localization of LdTCP1γ with actin suggests

  17. Cloning, characterization and sub-cellular localization of gamma subunit of T-complex protein-1 (chaperonin) from Leishmania donovani

    International Nuclear Information System (INIS)

    Bhaskar,; Kumari, Neeti; Goyal, Neena

    2012-01-01

    Highlights: ► The study presents cloning and characterization of TCP1γ gene from L. donovani. ► TCP1γ is a subunit of T-complex protein-1 (TCP1), a chaperonin class of protein. ► LdTCPγ exhibited differential expression in different stages of promastigotes. ► LdTCPγ co-localized with actin, a cytoskeleton protein. ► The data suggests that this gene may have a role in differentiation/biogenesis. ► First report on this chaperonin in Leishmania. -- Abstract: T-complex protein-1 (TCP1) complex, a chaperonin class of protein, ubiquitous in all genera of life, is involved in intracellular assembly and folding of various proteins. The gamma subunit of TCP1 complex (TCP1γ), plays a pivotal role in the folding and assembly of cytoskeleton protein(s) as an individual or complexed with other subunits. Here, we report for the first time cloning, characterization and expression of the TCP1γ of Leishmania donovani (LdTCP1γ), the causative agent of Indian Kala-azar. Primary sequence analysis of LdTCP1γ revealed the presence of all the characteristic features of TCP1γ. However, leishmanial TCP1γ represents a distinct kinetoplastid group, clustered in a separate branch of the phylogenic tree. LdTCP1γ exhibited differential expression in different stages of promastigotes. The non-dividing stationary phase promastigotes exhibited 2.5-fold less expression of LdTCP1γ as compared to rapidly dividing log phase parasites. The sub-cellular distribution of LdTCP1γ was studied in log phase promastigotes by employing indirect immunofluorescence microscopy. The protein was present not only in cytoplasm but it was also localized in nucleus, peri-nuclear region, flagella, flagellar pocket and apical region. Co-localization of LdTCP1γ with actin suggests that this gene may have a role in maintaining the structural dynamics of cytoskeleton of parasite.

  18. Protein-Protein Interactions Prediction Based on Iterative Clique Extension with Gene Ontology Filtering

    Directory of Open Access Journals (Sweden)

    Lei Yang

    2014-01-01

    Full Text Available Cliques (maximal complete subnets) in protein-protein interaction (PPI) networks are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPIs complement the deficiencies of data from biological experiments. However, clique-based prediction methods depend only on the topology of the network. The false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method combining clique-based prediction and gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.

  19. Role of the EHD2 unstructured loop in dimerization, protein binding and subcellular localization.

    Directory of Open Access Journals (Sweden)

    Kriti Bahl

    Full Text Available The C-terminal Eps 15 Homology Domain proteins (EHD1-4) play important roles in regulating endocytic trafficking. EHD2 is the only family member whose crystal structure has been solved, and it contains an unstructured loop consisting of two proline-phenylalanine (PF) motifs: KPFRKLNPF. In contrast, despite EHD2 having nearly 70% amino acid identity with its paralogs, EHD1, EHD3 and EHD4, the latter proteins contain a single KPF or RPF motif, but no NPF motif. In this study, we sought to define the precise role of each PF motif in EHD2's homo-dimerization, binding with the protein partners, and subcellular localization. To test the role of the NPF motif, we generated an EHD2 NPF-to-NAF mutant to mimic the homologous sequences of EHD1 and EHD3. We demonstrated that this mutant lost both its ability to dimerize and bind to Syndapin2. However, it continued to localize primarily to the cytosolic face of the plasma membrane. On the other hand, EHD2 NPF-to-APA mutants displayed normal dimerization and Syndapin2 binding, but exhibited markedly increased nuclear localization and reduced association with the plasma membrane. We then hypothesized that the single PF motif of EHD1 (that aligns with the KPF of EHD2) might be responsible for both binding and localization functions of EHD1. Indeed, the EHD1 RPF motif was required for dimerization, interaction with MICAL-L1 and Syndapin2, as well as localization to tubular recycling endosomes. Moreover, recycling assays demonstrated that EHD1 RPF-to-APA was incapable of supporting normal receptor recycling. Overall, our data suggest that the EHD2 NPF phenylalanine residue is crucial for EHD2 localization to the plasma membrane, whereas the proline residue is essential for EHD2 dimerization and binding. These studies support the recently proposed model in which the EHD2 N-terminal region may regulate the availability of the unstructured loop for interactions with neighboring EHD2 dimers, thus promoting

  20. Specific primary sequence requirements for Aurora B kinase-mediated phosphorylation and subcellular localization of TMAP during mitosis.

    Science.gov (United States)

    Kim, Hyun-Jun; Kwon, Hye-Rim; Bae, Chang-Dae; Park, Joobae; Hong, Kyung U

    2010-05-15

    During mitosis, regulation of protein structures and functions by phosphorylation plays critical roles in orchestrating a series of complex events essential for the cell division process. Tumor-associated microtubule-associated protein (TMAP), also known as cytoskeleton-associated protein 2 (CKAP2), is a novel player in spindle assembly and chromosome segregation. We have previously reported that TMAP is phosphorylated at multiple residues specifically during mitosis. However, the mechanisms and functional importance of phosphorylation at most of the sites identified are currently unknown. Here, we report that TMAP is a novel substrate of the Aurora B kinase. Ser627 of TMAP was specifically phosphorylated by Aurora B both in vitro and in vivo. Ser627 and neighboring conserved residues were strictly required for efficient phosphorylation of TMAP by Aurora B, as even minor amino acid substitutions of the phosphorylation motif significantly diminished the efficiency of the substrate phosphorylation. Nearly all mutations at the phosphorylation motif had dramatic effects on the subcellular localization of TMAP. Instead of being localized to the chromosome region during late mitosis, the mutants remained associated with microtubules and centrosomes throughout mitosis. However, the changes in the subcellular localization of these mutants could not be completely explained by the phosphorylation status on Ser627. Our findings suggest that the motif surrounding Ser627 ((625) RRSRRL (630)) is a critical part of a functionally important sequence motif which not only governs the kinase-substrate recognition, but also regulates the subcellular localization of TMAP during mitosis.

  1. The cellular and subcellular localization of zinc transporter 7 in the mouse spinal cord

    Science.gov (United States)

    The present work addresses the cellular and subcellular localization of the zinc transporter 7 (ZNT7, SLC30a7) protein and the distribution of zinc ions (Zn2+) in the mouse spinal cord. Our results indicated that the ZNT7 immunoreactive neurons were widely distributed in the Rexed’s laminae of the g...

  2. Application of Machine Learning Approaches for Protein-protein Interactions Prediction.

    Science.gov (United States)

    Zhang, Mengying; Su, Qiang; Lu, Yi; Zhao, Manman; Niu, Bing

    2017-01-01

    Proteomics endeavors to study the structures, functions and interactions of proteins. Information on protein-protein interactions (PPIs) helps to improve our knowledge of the functions and the 3D structures of proteins. Thus determining the PPIs is essential for the study of proteomics. In this review, to study the application of machine learning in predicting PPIs, some machine learning approaches such as support vector machines (SVMs), artificial neural networks (ANNs) and random forests (RFs) were selected, and examples of their applications to PPIs are listed. SVM and RF are two commonly used methods. Nowadays, more researchers predict PPIs by combining more than two methods. This review presents the application of machine learning approaches in predicting PPIs. Many examples of success in identification and prediction in the area of PPI prediction are discussed, and PPI research is still in progress.

  3. The subcellular localization of IGFBP5 affects its cell growth and migration functions in breast cancer

    International Nuclear Information System (INIS)

    Akkiprik, Mustafa; Hu, Limei; Sahin, Aysegul; Hao, Xishan; Zhang, Wei

    2009-01-01

    Insulin-like growth factor binding protein 5 (IGFBP5) has been shown to be associated with breast cancer metastasis in clinical marker studies. However, a major difficulty in understanding how IGFBP5 functions in this capacity is the paradoxical observation that ectopic overexpression of IGFBP5 in breast cancer cell lines results in suppressed cellular proliferation. In cancer tissues, IGFBP5 resides mainly in the cytoplasm; however, in transfected cells, IGFBP5 is mainly located in the nucleus. We hypothesized that subcellular localization of IGFBP5 affects its functions in host cells. To test this hypothesis, we generated wild-type and mutant IGFBP5 expression constructs. The mutation occurs within the nuclear localization sequence (NLS) of the protein and is generated by site-directed mutagenesis using the wild-type IGFBP5 expression construct as a template. Next, we transfected each expression construct into MDA-MB-435 breast cancer cells to establish stable clones overexpressing either wild-type or mutant IGFBP5. Functional analysis revealed that cells overexpressing wild-type IGFBP5 had significantly lower cell growth rate and motility than the vector-transfected cells, whereas cells overexpressing mutant IGFBP5 demonstrated a significantly higher ability to proliferate and migrate. To illustrate the subcellular localization of the proteins, we generated wild-type and mutant IGFBP5-pDsRed fluorescence fusion constructs. Fluorescence microscopy imaging revealed that mutation of the NLS in IGFBP5 switched the accumulation of the protein from the nucleus to the cytoplasm. Together, these findings imply that the mutant form of IGFBP5 increases proliferation and motility of breast cancer cells and that mutation of the NLS in IGFBP5 results in localization of IGFBP5 in the cytoplasm, suggesting that subcellular localization of IGFBP5 affects its cell growth and migration functions in the breast cancer cells.

  4. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    Directory of Open Access Journals (Sweden)

    Dobbs Drena

    2011-06-01

    Full Text Available Abstract Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (non-partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of

  5. GOASVM: a subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou's pseudo-amino acid composition.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2013-04-21

    Prediction of protein subcellular localization is an important yet challenging problem. Recently, several computational methods based on Gene Ontology (GO) have been proposed to tackle this problem and have demonstrated superiority over methods based on other features. Existing GO-based methods, however, do not fully use the GO information. This paper proposes an efficient GO method called GOASVM that exploits the information from the GO term frequencies and distant homologs to represent a protein in the general form of Chou's pseudo-amino acid composition. The method first selects a subset of relevant GO terms to form a GO vector space. Then for each protein, the method uses the accession number (AC) of the protein or the ACs of its homologs to find the number of occurrences of the selected GO terms in the Gene Ontology annotation (GOA) database as a means to construct GO vectors for support vector machines (SVMs) classification. With the advantages of GO term frequencies and a new strategy to incorporate useful homologous information, GOASVM can achieve a prediction accuracy of 72.2% on a new independent test set comprising novel proteins that were added to Swiss-Prot six years later than the creation date of the training set. GOASVM and Supplementary materials are available online at http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/GOASVM.html. Copyright © 2013 Elsevier Ltd. All rights reserved.
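
    The term-frequency step described above can be illustrated with a minimal sketch. It assumes the GO vector space is an ordered list of selected GO terms and that the GOA lookup for a protein (or its homologs) has already returned a list of GO term hits; the function name and the toy data are hypothetical and are not taken from GOASVM.

```python
from collections import Counter


def go_frequency_vector(annotation_hits, selected_terms):
    """Count how often each selected GO term occurs among the annotation hits."""
    counts = Counter(annotation_hits)
    # Terms outside the selected vector space are ignored; absent terms count as 0.
    return [counts[term] for term in selected_terms]


# Hypothetical vector space (nucleus, cytoplasm, plasma membrane) and hits.
selected_terms = ["GO:0005634", "GO:0005737", "GO:0005886"]
hits = ["GO:0005634", "GO:0005737", "GO:0005634", "GO:0016020"]

vector = go_frequency_vector(hits, selected_terms)
print(vector)  # [2, 1, 0]
```

    Vectors of this kind could then be passed to any SVM implementation; the classifier itself is left out of the sketch.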

  6. Subcellular localization of skeletal muscle lipid droplets and PLIN family proteins OXPAT and ADRP at rest and following contraction in rat soleus muscle.

    Science.gov (United States)

    MacPherson, Rebecca E K; Herbst, Eric A F; Reynolds, Erica J; Vandenboom, Rene; Roy, Brian D; Peters, Sandra J

    2012-01-01

    Skeletal muscle lipid droplet-associated proteins (PLINs) are thought to regulate lipolysis through protein-protein interactions on the lipid droplet surface. In adipocytes, PLIN2 [adipocyte differentiation-related protein (ADRP)] is found only on lipid droplets, while PLIN5 (OXPAT, expressed only in oxidative tissues) is found both on and off the lipid droplet and may be recruited to lipid droplet membranes when needed. Our purpose was to determine whether PLIN5 is recruited to lipid droplets with contraction and to investigate the myocellular location and colocalization of lipid droplets, PLIN2, and PLIN5. Rat solei were isolated, and following a 30-min equilibration period, they were assigned to one of two groups: 1) 30 min of resting incubation and 2) 30 min of stimulation (n = 10 each). Immunofluorescence microscopy was used to determine subcellular content, distribution, and colocalization of lipid droplets, PLIN2, and PLIN5. There was a main effect for lower lipid and PLIN2 content in stimulated compared with rested muscles (P muscles (P = 0.001, r(2) = 0.99) and linearly in stimulated muscles (slope = -0.0023 ± 0.0006, P muscles (P contraction in isolated skeletal muscle.

  7. Development of a high-throughput method for the systematic identification of human proteins nuclear translocation potential

    Directory of Open Access Journals (Sweden)

    Kawai Jun

    2009-09-01

    Full Text Available Abstract Background: Important clues to the function of novel and uncharacterized proteins can be obtained by identifying their ability to translocate into the nucleus. In addition, a comprehensive definition of the nuclear proteome undoubtedly represents a key step toward a better understanding of the biology of this organelle. Although several high-throughput experimental methods have been developed to explore the sub-cellular localization of proteins, these methods tend to focus on the predominant localizations of gene products and may fail to provide a complete catalog of proteins that are able to transiently locate into the nucleus. Results: We have developed a method for examining the nuclear localization potential of human gene products at the proteome scale by adapting a mammalian two-hybrid system we have previously developed. Our system is composed of three constructs co-transfected into a mammalian cell line. First, it contains a PCR construct encoding a fusion protein composed of a tested protein, the PDZ-protein TIP-1, and the transactivation domain of TNNC2 (referred to as the ACT construct). Second, our system contains a PCR construct encoding a fusion protein composed of the DNA binding domain of GAL4 and the PDZ binding domain of rhotekin (referred to as the BIND construct). Third, a GAL4-responsive luciferase reporter is used to detect the reconstitution of a transcriptionally active BIND-ACT complex through the interaction of TIP-1 and rhotekin, which indicates the ability of the tested protein to translocate into the nucleus. We validated our method in a small-scale feasibility study by comparing it to green fluorescent protein (GFP) fusion-based sub-cellular localization assays, sequence-based computational prediction of protein sub-cellular localization, and current sub-cellular localization data available from the literature for 22 gene products. Conclusion: Our reporter-based system can rapidly screen gene products for their ability

  8. Toxicological relationships between proteins obtained from protein target predictions of large toxicity databases

    International Nuclear Information System (INIS)

    Nigsch, Florian; Mitchell, John B.O.

    2008-01-01

    The combination of models for protein target prediction with large databases containing toxicological information for individual molecules allows the derivation of 'toxicological' profiles, i.e., to what extent molecules of known toxicity are predicted to interact with a set of protein targets. To predict protein targets of drug-like and toxic molecules, we built a computational multiclass model using the Winnow algorithm based on a dataset of protein targets derived from the MDL Drug Data Report. A 15-fold Monte Carlo cross-validation using 50% of each class for training, and the remaining 50% for testing, provided an assessment of the accuracy of that model. We retained the 3 top-ranking predictions and found that in 82% of all cases the correct target was predicted within these three predictions. The first prediction was the correct one in almost 70% of cases. A model built on the whole protein target dataset was then used to predict the protein targets for 150 000 molecules from the MDL Toxicity Database. We analysed the frequency of the predictions across the panel of protein targets for experimentally determined toxicity classes of all molecules. This allowed us to identify clusters of proteins related by their toxicological profiles, as well as toxicities that are related. Literature-based evidence is provided for some specific clusters to show the relevance of the relationships identified.
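
    The evaluation protocol in this abstract (repeated random 50/50 splits within each class, scored by whether the correct target is among the top three predictions) can be sketched as follows. This is only an illustration under stated assumptions: the function names and toy labels are hypothetical, and the Winnow classifier itself is not reproduced.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, rng):
    """Randomly assign half of each class to training and the rest to testing."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train.extend(shuffled[:half])
        test.extend(shuffled[half:])
    return sorted(train), sorted(test)


def monte_carlo_splits(labels, repeats=15, seed=0):
    """Return `repeats` independent stratified 50/50 splits."""
    rng = random.Random(seed)
    return [stratified_half_split(labels, rng) for _ in range(repeats)]


def top_k_accuracy(ranked_predictions, true_labels, k=3):
    """Fraction of cases whose true label is among the first k ranked predictions."""
    hits = sum(true in ranked[:k] for ranked, true in zip(ranked_predictions, true_labels))
    return hits / len(true_labels)


# Toy data: two hypothetical target classes with four molecules each.
labels = ["kinase"] * 4 + ["protease"] * 4
splits = monte_carlo_splits(labels)

# Hypothetical ranked outputs for two test molecules.
ranked = [["kinase", "protease", "gpcr"],
          ["gpcr", "ion_channel", "nuclear_receptor", "protease"]]
score = top_k_accuracy(ranked, ["kinase", "protease"], k=3)
```

    In a full evaluation, a model would be trained on each training half and the top-k score averaged over the 15 repeats.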

  9. HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Xiaomin Wang

    2011-01-01

    Full Text Available With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interests gradually shift to systematic analysis of these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has a relatively high F-measure and exhibits better performance compared with some other methods.
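
    The "highest k-core" concept mentioned above can be illustrated with a short sketch of the standard k-core peeling procedure. This is not the HKC algorithm: the cohesion measure and the handling of overlapping clusters are not described in the abstract and are omitted. The graph representation (a dict of neighbor sets) and the toy network are assumptions made for illustration.

```python
def k_core(adjacency, k):
    """Return the nodes of the k-core by repeatedly dropping nodes of degree < k."""
    nodes = set(adjacency)
    changed = True
    while changed:
        changed = False
        for node in list(nodes):
            degree = sum(1 for neighbor in adjacency[node] if neighbor in nodes)
            if degree < k:
                nodes.discard(node)
                changed = True
    return nodes


def highest_k_core(adjacency):
    """Return (k, nodes) for the largest k whose k-core is non-empty."""
    k, core = 0, set(adjacency)
    while True:
        candidate = k_core(adjacency, k + 1)
        if not candidate:
            return k, core
        k, core = k + 1, candidate


# Toy PPI network: a 4-clique (A, B, C, D) with one pendant protein E.
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
k, core = highest_k_core(network)
print(k, sorted(core))  # 3 ['A', 'B', 'C', 'D']
```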

  10. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    Science.gov (United States)

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on
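
    As a quick arithmetic check, the reported number of predicted PPIs matches the number of unordered pairs of distinct chains among 7528 chains. Reading "all possible combinations" this way is an inference from the two figures, not something the abstract states explicitly.

```python
from math import comb

chains = 7528
# Unordered pairs of distinct chains: n * (n - 1) / 2
pairs = comb(chains, 2)
print(pairs)  # 28331628
```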

  11. In vivo subcellular localization of Mal de Rio Cuarto virus (MRCV) non-structural proteins in insect cells reveals their putative functions

    Energy Technology Data Exchange (ETDEWEB)

    Maroniche, Guillermo A.; Mongelli, Vanesa C.; Llauger, Gabriela; Alfonso, Victoria; Taboga, Oscar [Instituto de Biotecnologia, CICVyA, Instituto Nacional de Tecnologia Agropecuaria (IB-INTA), Las cabanas y Los Reseros s/n. Hurlingham Cp 1686, Buenos Aires (Argentina); Vas, Mariana del, E-mail: mdelvas@cnia.inta.gov.ar [Instituto de Biotecnologia, CICVyA, Instituto Nacional de Tecnologia Agropecuaria (IB-INTA), Las cabanas y Los Reseros s/n. Hurlingham Cp 1686, Buenos Aires (Argentina)

    2012-09-01

    The in vivo subcellular localization of Mal de Rio Cuarto virus (MRCV, Fijivirus, Reoviridae) non-structural proteins fused to GFP was analyzed by confocal microscopy. P5-1 showed a cytoplasmic vesicular-like distribution that was lost upon deleting its PDZ binding TKF motif, suggesting that P5-1 interacts with cellular PDZ proteins. P5-2 located at the nucleus and its nuclear import was affected by the deletion of its basic C-termini. P7-1 and P7-2 also entered the nucleus and therefore, along with P5-2, could function as regulators of host gene expression. P6 located in the cytoplasm and in perinuclear cloud-like inclusions, was driven to P9-1 viroplasm-like structures and co-localized with P7-2, P10 and α-tubulin, suggesting its involvement in viroplasm formation and viral intracellular movement. Finally, P9-2 was N-glycosylated and located at the plasma membrane in association with filopodia-like protrusions containing actin, suggesting a possible role in virus cell-to-cell movement and spread.

  12. Computational prediction of protein hot spot residues.

    Science.gov (United States)

    Morrow, John Kenneth; Zhang, Shuxing

    2012-01-01

    Most biological processes involve multiple proteins interacting with each other. It has been recently discovered that certain residues in these protein-protein interactions, which are called hot spots, contribute more significantly to binding affinity than others. Hot spot residues have unique and diverse energetic properties that make them challenging yet important targets in the modulation of protein-protein complexes. Design of therapeutic agents that interact with hot spot residues has proven to be a valid methodology in disrupting unwanted protein-protein interactions. Using biological methods to determine which residues are hot spots can be costly and time consuming. Recent advances in computational approaches to predict hot spots have incorporated a myriad of features, and have shown increasing predictive successes. Here we review the state of knowledge around protein-protein interactions, hot spots, and give an overview of multiple in silico prediction techniques of hot spot residues.

  13. Prediction of protein-protein interactions between viruses and human by an SVM model

    Directory of Open Access Journals (Sweden)

    Cui Guangyu

    2012-05-01

    Full Text Available Abstract Background: Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for the interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results: We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM) model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV) and hepatitis C virus (HCV), our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO) annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions: Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1) it enables a prediction model to achieve a better performance than other representations, (2) it generates feature vectors of fixed length regardless of the sequence length, and (3) the same representation is applicable to different types of proteins.
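
    The fixed-length triplet representation described above can be sketched as follows. The details are assumptions made for illustration, since the abstract does not specify them: overlapping windows of three residues, the 20 standard amino acids (giving 20^3 = 8000 features), and normalization by the number of valid windows. All names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIPLETS = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
TRIPLET_INDEX = {triplet: i for i, triplet in enumerate(TRIPLETS)}


def triplet_frequency_vector(sequence):
    """Relative frequency of each overlapping amino acid triplet (8000 values)."""
    vector = [0.0] * len(TRIPLETS)
    windows = [sequence[i:i + 3] for i in range(len(sequence) - 2)]
    # Windows containing non-standard residues are skipped (an assumption).
    valid = [w for w in windows if w in TRIPLET_INDEX]
    for window in valid:
        vector[TRIPLET_INDEX[window]] += 1.0
    if valid:
        vector = [value / len(valid) for value in vector]
    return vector


# Toy sequence: six windows, of which "MKV" occurs twice.
vec = triplet_frequency_vector("MKVLAMKV")
print(len(vec), round(vec[TRIPLET_INDEX["MKV"]], 3))  # 8000 0.333
```

    Because the vector length depends only on the alphabet, sequences of any length map to the same feature space, which is the property the abstract highlights.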

  14. Prediction and characterization of protein-protein interaction networks in swine

    Directory of Open Access Journals (Sweden)

    Wang Fen

    2012-01-01

    Full Text Available Abstract Background: Studying the large-scale protein-protein interaction (PPI) network is important in understanding biological processes. The current research presents the first PPI map of swine, which aims to give new insights into understanding their biological processes. Results: We used three methods, Interolog-based prediction of porcine PPI network, domain-motif interactions from structural topology-based prediction of porcine PPI network and motif-motif interactions from structural topology-based prediction of porcine PPI network, to predict porcine protein interactions among 25,767 porcine proteins. We predicted 20,213, 331,484, and 218,705 porcine PPIs respectively, merged the three results into 567,441 PPIs, constructed four PPI networks, and analyzed the topological properties of the porcine PPI networks. Our predictions were validated with Pfam domain annotations and GO annotations. Averages of 70, 10,495, and 863 interactions were related to the Pfam domain-interacting pairs in the iPfam database. For comparison, randomized networks were generated, and averages of only 4.24, 66.79, and 44.26 interactions were associated with Pfam domain-interacting pairs in the iPfam database. In GO annotations, we found 52.68%, 75.54%, and 27.20% of the predicted PPIs sharing GO terms, respectively. However, none of the 10,000 randomized networks reached 52.68%, 75.54%, or 27.20% of PPI pairs sharing GO terms. Finally, we determined the accuracy and precision of the methods. The methods yielded accuracies of 0.92, 0.53, and 0.50 at precisions of about 0.93, 0.74, and 0.75, respectively. Conclusion: The results reveal that the predicted PPI networks are considerably reliable. The present research is an important pioneering work on protein function research. The porcine PPI data set, the confidence score of each interaction and a list of related data are available at (http://pppid.biositemap.com/).

  15. Subcellular distribution and chemical forms of cadmium in Phytolacca americana L

    Energy Technology Data Exchange (ETDEWEB)

    Fu Xiaoping; Dou Changming [Ministry of Agriculture Key Laboratory of Non-point Source Pollution Control, Institute of Environmental Science and Technology, Zhejiang University, Hangzhou 310029 (China); Chen Yingxu, E-mail: yingxu_chen@hotmail.com [Ministry of Agriculture Key Laboratory of Non-point Source Pollution Control, Institute of Environmental Science and Technology, Zhejiang University, Hangzhou 310029 (China); Chen Xincai; Shi Jiyan; Yu Mingge; Xu Jie [Ministry of Agriculture Key Laboratory of Non-point Source Pollution Control, Institute of Environmental Science and Technology, Zhejiang University, Hangzhou 310029 (China)

    2011-02-15

    Phytolacca americana L. (pokeweed) is a promising species for Cd phytoextraction with large biomass and fast growth rate. To further understand the mechanisms involved in Cd tolerance and detoxification, the present study investigated subcellular distribution and chemical forms of Cd in pokeweed. Subcellular fractionation of Cd-containing tissues indicated that in both roots and leaves, the majority of the element was located in the soluble fraction and cell walls. Meanwhile, Cd taken up by pokeweed existed in different chemical forms. Results showed that the greatest amount of Cd was found in the extraction of 80% ethanol in roots, followed by 1 M NaCl, d-H2O and 2% HAc, while in leaves and stems, most of the Cd was extracted by 1 M NaCl, and the subdominant amount of Cd was extracted by 80% ethanol. It could be suggested that Cd compartmentation with organo-ligands in the vacuole or integration with pectates and proteins in the cell wall might be responsible for the adaptation of pokeweed to Cd stress.

  16. Evaluation on subcellular partitioning and biodynamics of pulse copper toxicity in tilapia reveals impacts of a major environmental disturbance.

    Science.gov (United States)

    Ju, Yun-Ru; Yang, Ying-Fei; Tsai, Jeng-Wei; Cheng, Yi-Hsien; Chen, Wei-Yu; Liao, Chung-Min

    2017-07-01

    Fluctuation exposure of trace metal copper (Cu) is ubiquitous in aquatic environments. The purpose of this study was to investigate the impacts of chronically pulsed exposure on biodynamics and subcellular partitioning of Cu in freshwater tilapia (Oreochromis mossambicus). Long-term 28-day pulsed Cu exposure experiments were performed to explore subcellular partitioning and toxicokinetics/toxicodynamics of Cu in tilapia. Subcellular partitioning linking with a metal influx scheme was used to estimate detoxification and elimination rates. A biotic ligand model-based damage assessment model was used to take into account environmental effects and biological mechanisms of Cu toxicity. We demonstrated that the probability causing 50% of susceptibility risk in response to pulse Cu exposure in generic Taiwan aquaculture ponds was ~33% of Cu in adverse physiologically associated, metabolically active pool, implicating no significant susceptibility risk for tilapia. We suggest that our integrated ecotoxicological models linking chronic exposure measurements with subcellular partitioning can facilitate a risk assessment framework that provides a predictive tool for preventive susceptibility reduction strategies for freshwater fish exposed to pulse metal stressors.

  17. Cellular and Subcellular Immunohistochemical Localization and Quantification of Cadmium Ions in Wheat (Triticum aestivum).

    Directory of Open Access Journals (Sweden)

    Wei Gao

    The distribution of metallic ions in plant tissues is associated with their toxicity and is important for understanding mechanisms of toxicity tolerance. A quantitative histochemical method can help advance knowledge of cellular and subcellular localization and distribution of heavy metals in plant tissues. An immunohistochemical (IHC) imaging method for cadmium ions (Cd2+) was developed for the first time for the wheat Triticum aestivum grown in Cd2+-fortified soils. Also, 1-(4-Isothiocyanobenzyl)-ethylenediamine-N,N,N,N-tetraacetic acid (ITCB-EDTA) was used to chelate the mobile Cd2+. The ITCB-EDTA/Cd2+ complex was fixed with proteins in situ via the isothiocyano group. A new Cd2+-EDTA specific monoclonal antibody, 4F3B6D9A1, was used to locate the Cd2+-EDTA protein complex. After staining, the fluorescence intensities of sections of Cd2+-positive roots were compared with those of Cd2+-negative roots under a laser confocal scanning microscope, and the location of colloidal gold particles was determined with a transmission electron microscope. The results enable quantification of the Cd2+ content in plant tissues and illustrate Cd2+ translocation and cellular and subcellular responses of T. aestivum to Cd2+ stress. Compared to the conventional metal-S coprecipitation histochemical method, this new IHC method is quantitative, more specific and has less background interference. The subcellular location of Cd2+ was also confirmed with energy-dispersive X-ray microanalysis. The IHC method is suitable for locating and quantifying Cd2+ in plant tissues and can be extended to other heavy metallic ions.

  18. Cellular and Subcellular Immunohistochemical Localization and Quantification of Cadmium Ions in Wheat (Triticum aestivum).

    Science.gov (United States)

    Gao, Wei; Nan, Tiegui; Tan, Guiyu; Zhao, Hongwei; Tan, Weiming; Meng, Fanyun; Li, Zhaohu; Li, Qing X; Wang, Baomin

    2015-01-01

    The distribution of metallic ions in plant tissues is associated with their toxicity and is important for understanding mechanisms of toxicity tolerance. A quantitative histochemical method can help advance knowledge of cellular and subcellular localization and distribution of heavy metals in plant tissues. An immunohistochemical (IHC) imaging method for cadmium ions (Cd2+) was developed for the first time for the wheat Triticum aestivum grown in Cd2+-fortified soils. Also, 1-(4-Isothiocyanobenzyl)-ethylenediamine-N,N,N,N-tetraacetic acid (ITCB-EDTA) was used to chelate the mobile Cd2+. The ITCB-EDTA/Cd2+ complex was fixed with proteins in situ via the isothiocyano group. A new Cd2+-EDTA specific monoclonal antibody, 4F3B6D9A1, was used to locate the Cd2+-EDTA protein complex. After staining, the fluorescence intensities of sections of Cd2+-positive roots were compared with those of Cd2+-negative roots under a laser confocal scanning microscope, and the location of colloidal gold particles was determined with a transmission electron microscope. The results enable quantification of the Cd2+ content in plant tissues and illustrate Cd2+ translocation and cellular and subcellular responses of T. aestivum to Cd2+ stress. Compared to the conventional metal-S coprecipitation histochemical method, this new IHC method is quantitative, more specific and has less background interference. The subcellular location of Cd2+ was also confirmed with energy-dispersive X-ray microanalysis. The IHC method is suitable for locating and quantifying Cd2+ in plant tissues and can be extended to other heavy metallic ions.

  19. Subcellular Location of PKA Controls Striatal Plasticity: Stochastic Simulations in Spiny Dendrites

    Science.gov (United States)

    Oliveira, Rodrigo F.; Kim, MyungSook; Blackwell, Kim T.

    2012-01-01

    Dopamine release in the striatum has been implicated in various forms of reward dependent learning. Dopamine leads to production of cAMP and activation of protein kinase A (PKA), which are involved in striatal synaptic plasticity and learning. PKA and its protein targets are not diffusely located throughout the neuron, but are confined to various subcellular compartments by anchoring molecules such as A-Kinase Anchoring Proteins (AKAPs). Experiments have shown that blocking the interaction of PKA with AKAPs disrupts its subcellular location and prevents LTP in the hippocampus and striatum; however, these experiments have not revealed whether the critical function of anchoring is to locate PKA near the cAMP that activates it or near its targets, such as AMPA receptors located in the post-synaptic density. We have developed a large scale stochastic reaction-diffusion model of signaling pathways in a medium spiny projection neuron dendrite with spines, based on published biochemical measurements, to investigate this question and to evaluate whether dopamine signaling exhibits spatial specificity post-synaptically. The model was stimulated with dopamine pulses mimicking those recorded in response to reward. Simulations show that PKA colocalization with adenylate cyclase, either in the spine head or in the dendrite, leads to greater phosphorylation of DARPP-32 Thr34 and AMPA receptor GluA1 Ser845 than when PKA is anchored away from adenylate cyclase. Simulations further demonstrate that though cAMP exhibits a strong spatial gradient, diffusible DARPP-32 facilitates the spread of PKA activity, suggesting that additional inactivation mechanisms are required to produce spatial specificity of PKA activity. PMID:22346744

  20. Refining intra-protein contact prediction by graph analysis

    Directory of Open Access Journals (Sweden)

    Eyal Eran

    2007-05-01

    Background: Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence. Results: We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-contact prediction method by means of pair-to-pair substitution matrix (P2PConPred) was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%. Conclusion: Using a simple approach, we increased the contact prediction accuracy of a basic method by 1.5 times. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.

  1. A domain-based approach to predict protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Resat Haluk

    2007-06-01

    Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for the understanding of biological processes at the whole cell level. The determination of the protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses the information about the domain-domain interactions to predict the interactions between proteins. Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. Obtained domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion: We envision the DomainGA as a first step of a multiple tier approach to constructing organism specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed

  2. Protein-protein interaction site predictions with minimum covariance determinant and Mahalanobis distance.

    Science.gov (United States)

    Qiu, Zhijun; Zhou, Bo; Yuan, Jiangfeng

    2017-11-21

    Protein-protein interaction site (PPIS) prediction must deal with the diversity of interaction sites that limits their prediction accuracy. Use of proteins with unknown or unidentified interactions can also lead to missing interfaces. Such data errors are often brought into the training dataset. In response to these two problems, we used the minimum covariance determinant (MCD) method to refine the training data to build a predictor with better performance, exploiting its ability to remove outliers. In order to predict test data in practice, a method based on Mahalanobis distance was devised to select proper test data as input for the predictor. With leave-one-out validation and an independent test, after the Mahalanobis distance screening, our method achieved higher performance according to the Matthews correlation coefficient (MCC), although only a part of the test data could be predicted. These results indicate that data refinement is an efficient approach to improve protein-protein interaction site prediction. By further optimizing our method, we hope to develop predictors with better performance and a wider range of application. Copyright © 2017 Elsevier Ltd. All rights reserved.
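
    The screening step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the plain sample mean and covariance where the abstract uses a robust MCD estimate, and the function names and the threshold are hypothetical choices made only for illustration.

```python
import math

def mean_and_covariance(rows):
    """Sample mean vector and sample covariance matrix (n - 1 denominator).
    Assumption: the plain estimate stands in for the robust MCD estimate."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis(x, mean, cov):
    """Mahalanobis distance sqrt(diff^T * cov^-1 * diff) of x from the mean."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    weighted = solve(cov, diff)
    return math.sqrt(max(0.0, sum(a * b for a, b in zip(diff, weighted))))

def screen(queries, train, threshold):
    """Keep only the query vectors that lie within `threshold` of the training data."""
    mean, cov = mean_and_covariance(train)
    return [q for q in queries if mahalanobis(q, mean, cov) <= threshold]
```

    Queries that fall far from the training distribution are dropped rather than predicted, which matches the abstract's remark that only part of the test data could be predicted.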

  3. Unique Protein Signature of Circulating Microparticles in Systemic Lupus Erythematosus

    DEFF Research Database (Denmark)

    Østergaard, Ole; Nielsen, Christoffer; Iversen, Line V

    2013-01-01

    To characterize the unique qualities of proteins associated with circulating subcellular material in systemic lupus erythematosus (SLE) patients compared with healthy controls and patients with other chronic autoimmune diseases.

  4. Deep learning methods for protein torsion angle prediction.

    Science.gov (United States)

    Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

    2017-09-18

    Deep learning is one of the most powerful machine learning methods that has achieved the state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN) and a deep restricted Boltzmann machine (DRBM), as well as a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of phi angle is comparable to the existing methods, but the MAE of psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
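
    The mean absolute error quoted above is computed over angles, which are periodic. The abstract does not say how wrap-around is handled, so the sketch below assumes the common convention of taking the shorter way around the circle; the function names are hypothetical.

```python
def angular_error(predicted_deg, true_deg):
    """Absolute difference between two angles in degrees, taken the short way
    around the circle (assumed convention; the abstract does not state one)."""
    diff = abs(predicted_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def mean_absolute_error(predicted, true):
    """Mean absolute angular error over paired lists of angles in degrees."""
    if len(predicted) != len(true) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(angular_error(p, t) for p, t in zip(predicted, true)) / len(predicted)
```

    With this convention a prediction of 175° for a true angle of -175° counts as a 10° error rather than 350°.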

  5. Subcellular site and nature of intracellular cadmium in plants

    International Nuclear Information System (INIS)

    Wagner, G.J.

    1979-01-01

    The mechanisms underlying heavy metal accumulation, toxicity, and tolerance in higher plants are poorly understood. Since subcellular processes are undoubtedly involved in all these phenomena, it is of interest to study the extent, subcellular site and nature of intracellularly accumulated cadmium in higher plants. Whole plants supplied 109CdCl2 or 112CdSO4 accumulated Cd into roots and aerial tissues. Preparation of protoplasts from aerial tissues followed by subcellular fractionation of the protoplasts to obtain intact vacuoles, chloroplasts and cytosol revealed the presence of Cd in the cytosol but not in vacuoles or chloroplasts. No evidence was obtained for the production of volatile Cd complexes in tobacco

  6. An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions.

    Science.gov (United States)

    Deng, Xin; Gumm, Jordan; Karki, Suman; Eickholt, Jesse; Cheng, Jianlin

    2015-07-07

    Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.

  7. An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions

    Directory of Open Access Journals (Sweden)

    Xin Deng

    2015-07-01

    Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.

  8. Early subcellular partitioning of cadmium in gill and liver of rainbow trout (Oncorhynchus mykiss) following low-to-near-lethal waterborne cadmium exposure

    Energy Technology Data Exchange (ETDEWEB)

    Kamunde, Collins [Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, 550 University Avenue, Charlottetown, PE, C1A 4P3 (Canada)], E-mail: ckamunde@upei.ca

    2009-03-09

    Non-essential metals such as cadmium (Cd) accumulated in animal cells are envisaged to partition into potentially metal-sensitive compartments when detoxification capacity is exceeded. An understanding of intracellular metal partitioning is therefore important in delineation of the toxicologically relevant metal fraction for accurate tissue residue-based assessment of toxicity. In the present study, the early intracellular Cd accumulation was studied to test the prediction that it conforms to the spillover model of metal toxicity. Juvenile rainbow trout (10-15 g) were exposed for 96 h to three doses of cadmium (5, 25 and 50 μg/l) and a control (nominal 0 μg/l Cd) in hard water followed by measurement of the changes in intracellular Cd concentrations in the gill and liver, and carcass calcium (Ca) levels. There were dose-dependent increases in Cd concentration in both organs but the accumulation pattern over time was linear in the liver and biphasic in the gill. The Cd accumulation was associated with carcass Ca loss after 48 h. Comparatively, the gill accumulated 2-4x more Cd than the liver and generally the subcellular compartments reflected the organ-level patterns of accumulation. For the gill the rank of Cd accumulation in subcellular fractions was: heat-stable proteins (HSP) > heat-labile proteins (HLP) > nuclei > microsomes-lysosomes (ML) ≥ mitochondria > resistant fraction while for the liver it was HSP > HLP > ML > mitochondria > nuclei > resistant fraction. Contrary to the spillover hypothesis there was no exposure concentration or internal accumulation at which Cd was not found in potentially metal-sensitive compartments. The proportion of Cd bound to the metabolically active pool (MAP) increased while that bound to the metabolically detoxified pool (MDP) decreased in gills of Cd-exposed fish but remained unchanged in the liver. Because the Cd concentration increased in all subcellular compartments while their contribution to the total increased

  9. Early subcellular partitioning of cadmium in gill and liver of rainbow trout (Oncorhynchus mykiss) following low-to-near-lethal waterborne cadmium exposure

    International Nuclear Information System (INIS)

    Kamunde, Collins

    2009-01-01

    Non-essential metals such as cadmium (Cd) accumulated in animal cells are envisaged to partition into potentially metal-sensitive compartments when detoxification capacity is exceeded. An understanding of intracellular metal partitioning is therefore important in delineation of the toxicologically relevant metal fraction for accurate tissue residue-based assessment of toxicity. In the present study, the early intracellular Cd accumulation was studied to test the prediction that it conforms to the spillover model of metal toxicity. Juvenile rainbow trout (10-15 g) were exposed for 96 h to three doses of cadmium (5, 25 and 50 μg/l) and a control (nominal 0 μg/l Cd) in hard water followed by measurement of the changes in intracellular Cd concentrations in the gill and liver, and carcass calcium (Ca) levels. There were dose-dependent increases in Cd concentration in both organs but the accumulation pattern over time was linear in the liver and biphasic in the gill. The Cd accumulation was associated with carcass Ca loss after 48 h. Comparatively, the gill accumulated 2-4x more Cd than the liver and generally the subcellular compartments reflected the organ-level patterns of accumulation. For the gill the rank of Cd accumulation in subcellular fractions was: heat-stable proteins (HSP) > heat-labile proteins (HLP) > nuclei > microsomes-lysosomes (ML) ≥ mitochondria > resistant fraction while for the liver it was HSP > HLP > ML > mitochondria > nuclei > resistant fraction. Contrary to the spillover hypothesis there was no exposure concentration or internal accumulation at which Cd was not found in potentially metal-sensitive compartments. The proportion of Cd bound to the metabolically active pool (MAP) increased while that bound to the metabolically detoxified pool (MDP) decreased in gills of Cd-exposed fish but remained unchanged in the liver. Because the Cd concentration increased in all subcellular compartments while their contribution to the total increased

  10. Sub-cellular force microscopy in single normal and cancer cells.

    Science.gov (United States)

    Babahosseini, H; Carmichael, B; Strobl, J S; Mahmoodi, S N; Agah, M

    2015-08-07

    This work investigates the biomechanical properties of sub-cellular structures of breast cells using atomic force microscopy (AFM). The cells are modeled as a triple-layered structure where the Generalized Maxwell model is applied to experimental data from AFM stress-relaxation tests to extract the elastic modulus, the apparent viscosity, and the relaxation time of sub-cellular structures. The triple-layered modeling results allow for determination and comparison of the biomechanical properties of the three major sub-cellular structures between normal and cancerous cells: the upper plasma membrane/actin cortex, the middle cytoplasm/nucleus, and the lower nuclear/integrin sub-domains. The results reveal that the sub-domains become stiffer and significantly more viscous with depth, regardless of cell type. In addition, there is a decreasing trend in the average elastic modulus and apparent viscosity of all corresponding sub-cellular structures from normal to cancerous cells, which becomes most remarkable in the deeper sub-domain. The modeling presented in this work constitutes a unique AFM-based experimental framework to study the biomechanics of sub-cellular structures. Copyright © 2015 Elsevier Inc. All rights reserved.
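
    For orientation, the textbook form of a Generalized Maxwell model can be sketched as a relaxation modulus built from an equilibrium spring in parallel with several spring-dashpot arms. The code below is that generic form only; it is not the authors' AFM fitting procedure, it ignores indenter geometry, and all names and parameter values are hypothetical.

```python
import math

def relaxation_modulus(t, e_inf, arms):
    """Relaxation modulus E(t) = E_inf + sum_i E_i * exp(-t / tau_i).
    `arms` is a list of (E_i, tau_i) pairs: arm modulus and relaxation time."""
    return e_inf + sum(e_i * math.exp(-t / tau_i) for e_i, tau_i in arms)

def arm_viscosities(arms):
    """Dashpot viscosity of each Maxwell arm, eta_i = E_i * tau_i."""
    return [e_i * tau_i for e_i, tau_i in arms]
```

    In this form, the modulus, viscosity, and relaxation time of each arm are linked by eta_i = E_i * tau_i, so fitting any two of them to a stress-relaxation curve fixes the third.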

  11. Working with Proteins in silico: A Review of Online Available Tools for Basic Identification of Proteins

    Directory of Open Access Journals (Sweden)

    Caner Yavuz

    2017-01-01

    The increase in bioinformatics tools available online for protein research creates an important opportunity for scientists to reveal characteristics of the protein of interest by starting only from the predicted or known amino acid sequence, without fully depending on experimental approaches. There are many sophisticated tools used for diverse purposes; however, there are not enough reviews covering the tips and tricks of selecting and using the correct tools, as the literature mainly promotes new ones. In this review, with the aim of providing young scientists with no specific experience in protein work a reliable starting point for in silico analysis of the protein of interest, we summarize tools for annotation, identification of motifs and domains, and determination of isoelectric point, molecular weight, subcellular localization, and post-translational modifications, focusing on the important points to be considered when selecting from the tools available online.

  12. Topology of membrane proteins-predictions, limitations and variations.

    Science.gov (United States)

    Tsirigos, Konstantinos D; Govindarajan, Sudha; Bassot, Claudio; Västermark, Åke; Lamb, John; Shu, Nanjiang; Elofsson, Arne

    2017-10-26

    Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Text mining improves prediction of protein functional sites.

    Directory of Open Access Journals (Sweden)

    Karin M Verspoor

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions.

  14. Text Mining Improves Prediction of Protein Functional Sites

    Science.gov (United States)

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
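
    As a rough illustration of what "extracting mentions of residues" from text can involve, the sketch below matches three-letter amino acid codes followed by a sequence position. It is an assumption-laden toy, not the LEAP-FS pipeline: the pattern, the accepted separators, and the function name are all hypothetical, and real text needs far broader coverage (one-letter codes, full residue names, mutation notation).

```python
import re

# Standard three-letter codes for the 20 common amino acids.
THREE_LETTER_CODES = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)

# A code, an optional hyphen or space, then the residue position.
RESIDUE_PATTERN = re.compile(r"\b(" + THREE_LETTER_CODES + r")[- ]?(\d+)\b")

def extract_residue_mentions(text):
    """Return (residue code, position) pairs in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in RESIDUE_PATTERN.finditer(text)]
```

    Each extracted pair could then be mapped onto a structure and compared with annotated functional sites, which is the kind of overlap the abstract reports.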

  15. Protein-protein interaction site predictions with three-dimensional probability distributions of interacting atoms on protein surfaces.

    Directory of Open Access Journals (Sweden)

    Ching-Tai Chen

    Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted

  16. Protein-Protein Interaction Site Predictions with Three-Dimensional Probability Distributions of Interacting Atoms on Protein Surfaces

    Science.gov (United States)

    Chen, Ching-Tai; Peng, Hung-Pin; Jian, Jhih-Wei; Tsai, Keng-Chang; Chang, Jeng-Yih; Yang, Ei-Wen; Chen, Jun-Bo; Ho, Shinn-Ying; Hsu, Wen-Lian; Yang, An-Suei

    2012-01-01

    Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. 
The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with
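
    The residue-level figures quoted in the two records above (Matthews correlation coefficient, accuracy, precision, sensitivity, specificity) all derive from a single confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def residue_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of residues: true/false positives/negatives.
    Any ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    total = tp + fp + tn + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, total),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = residue_metrics(tp=50, fp=10, tn=30, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

    Because interface residues are usually a minority of surface residues, the MCC is often the more informative single number: unlike accuracy, it stays low when a predictor simply labels everything as non-interface.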

  17. The role of water flow into subcellular organella in cell death

    International Nuclear Information System (INIS)

    Chiba-Kamoshida, Kaori

    2008-01-01

    The mitochondrion is a subcellular organelle that produces most of the energy necessary for living cells. Its structure, a double membrane consisting of inner and outer membranes, is closely related to its activity and to disease. Accurate regulation of membrane permeability plays an important role in homeostatic energy production, and abnormal membrane permeability can lead to cell death. Although even the transport of water molecules is regulated by a specific membrane protein, aquaporin, no method has been reported for monitoring water flow through the membrane. Neutron small-angle scattering allows measurements of biological materials and subcellular organelles such as mitochondria in solution, under experimental conditions that maintain the activity of the biological samples. An outstanding advantage of neutron spectroscopy is its ability to distinguish hydrogen spread over biomolecules from deuterium. To explore a new method for monitoring conformational change inside mitochondria, wide-range neutron small-angle scattering data were collected with two neutron spectrometers at JAEA JRR-3, SANS-J and PNO, covering not only the thickness of the double membrane but also the size of an isolated whole mitochondrion particle, ∼1 μm. Utilizing the high protein content (70%) of the mitochondrial inner membrane, a new attempt was begun to determine the structural change in the inner membrane caused by changes such as in oxygen and substrate concentration, and to examine the relationship between this structural change and water flow through the mitochondrial membrane. (author)

  18. Predicting and validating protein interactions using network structure.

    Directory of Open Access Journals (Sweden)

    Pao-Yang Chen

    2008-07-01

    Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.
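
    The idea of exploiting local clustering can be illustrated with a much simpler stand-in than the score described in the abstract. The sketch below ranks a candidate pair by how many triangles it would close (shared interaction partners), normalised by the size of the combined neighbourhood. It uses network structure only and ignores protein characteristics, so it is not the authors' triplet score; all function names and the toy network are assumptions made for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (protein, protein) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:              # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def triangle_score(adj, a, b):
    """Fraction of the pair's combined partners that both proteins share.

    Each shared partner c forms a triplet a-c-b that the candidate
    interaction a-b would close into a triangle.
    """
    partners_a = adj.get(a, set()) - {b}
    partners_b = adj.get(b, set()) - {a}
    union = partners_a | partners_b
    return len(partners_a & partners_b) / len(union) if union else 0.0


# Toy network (hypothetical protein labels).
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
score_ad = triangle_score(adj, "A", "D")   # A and D share partners B and C
score_ae = triangle_score(adj, "A", "E")   # A and E share no partner
```

    In this toy network the unobserved pair A-D scores higher than A-E, which matches the intuition that highly clustered regions of an interaction network are where missing interactions are most likely to be found.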

  19. The UL24 protein of herpes simplex virus 1 affects the sub-cellular distribution of viral glycoproteins involved in fusion

    Energy Technology Data Exchange (ETDEWEB)

    Ben Abdeljelil, Nawel; Rochette, Pierre-Alexandre; Pearson, Angela, E-mail: angela.pearson@iaf.inrs.ca

    2013-09-15

    Mutations in UL24 of herpes simplex virus type 1 can lead to a syncytial phenotype. We hypothesized that UL24 affects the sub-cellular distribution of viral glycoproteins involved in fusion. In non-immortalized human foreskin fibroblasts (HFFs) we detected viral glycoproteins B (gB), gD, gH and gL present in extended blotches throughout the cytoplasm with limited nuclear membrane staining; however, in HFFs infected with a UL24-deficient virus (UL24X), staining for the viral glycoproteins appeared as long, thin streaks running across the cell. Interestingly, there was a decrease in co-localized staining of gB and gD with F-actin at late times in UL24X-infected HFFs. Treatment with chemical agents that perturbed the actin cytoskeleton hindered the formation of UL24X-induced syncytia in these cells. These data support a model whereby the UL24 syncytial phenotype results from a mislocalization of viral glycoproteins late in infection. - Highlights: • UL24 affects the sub-cellular distribution of viral glycoproteins required for fusion. • Sub-cellular distribution of viral glycoproteins varies in cell-type dependent manner. • Drugs targeting actin microfilaments affect formation of UL24-related syncytia in HFFs.

  20. Biogenesis of the rat hepatocyte plasma membrane in vivo: comparison of the pathways taken by apical and basolateral proteins using subcellular fractionation

    International Nuclear Information System (INIS)

    Bartles, J.R.; Feracci, H.M.; Stieger, B.; Hubbard, A.L.

    1987-01-01

    We have used pulse-chase metabolic radiolabeling with L-[35S]methionine in conjunction with subcellular fractionation and specific protein immunoprecipitation techniques to compare the posttranslational transport pathways taken by endogenous domain-specific integral proteins of the rat hepatocyte plasma membrane in vivo. Our results suggest that both apical (HA 4, dipeptidylpeptidase IV, and aminopeptidase N) and basolateral (CE 9 and the asialoglycoprotein receptor [ASGP-R]) proteins reach the hepatocyte plasma membrane with similar kinetics. The mature molecular mass form of each of these proteins reaches its maximum specific radioactivity in a purified hepatocyte plasma membrane fraction after only 45 min of chase. However, at this time, the mature radiolabeled apical proteins are not associated with vesicles derived from the apical domain of the hepatocyte plasma membrane, but instead are associated with vesicles which, by several criteria, appear to be basolateral plasma membrane. These vesicles: (a) fractionate like basolateral plasma membrane in sucrose density gradients and in free-flow electrophoresis; (b) can be separated from the bulk of the likely organellar contaminants, including membranes derived from the late Golgi cisternae, transtubular network, and endosomes; (c) contain the proven basolateral constituents CE 9 and the ASGP-R, as judged by vesicle immunoadsorption using fixed Staphylococcus aureus cells and anti-ASGP-R antibodies; and (d) are oriented with their ectoplasmic surfaces facing outward, based on the results of vesicle immunoadsorption experiments using antibodies specific for the ectoplasmic domain of the ASGP-R. Only at times of chase greater than 45 min do significant amounts of the mature radiolabeled apical proteins arrive at the apical domain, and they do so at different rates

  1. Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization.

    Science.gov (United States)

    Wang, Hua; Huang, Heng; Ding, Chris; Nie, Feiping

    2013-04-01

    Protein interactions are central to all the biological processes and structural scaffolds in living organisms, because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Several high-throughput methods, for example, yeast two-hybrid system and mass spectrometry method, can help determine protein interactions, which, however, suffer from high false-positive rates. Moreover, many protein interactions predicted by one method are not supported by another. Therefore, computational methods are necessary and crucial to complete the interactome expeditiously. In this work, we formulate the problem of predicting protein interactions from a new mathematical perspective--sparse matrix completion, and propose a novel nonnegative matrix factorization (NMF)-based matrix completion approach to predict new protein interactions from existing protein interaction networks. Through using manifold regularization, we further develop our method to integrate different biological data sources, such as protein sequences, gene expressions, protein structure information, etc. Extensive experimental results on four species, Saccharomyces cerevisiae, Drosophila melanogaster, Homo sapiens, and Caenorhabditis elegans, have shown that our new methods outperform related state-of-the-art protein interaction prediction methods.
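
    To make the matrix-completion framing concrete, the sketch below factorises a small interaction matrix with plain two-factor NMF (multiplicative updates) and reads the reconstructed entries as interaction scores. This is a deliberately simplified stand-in: it is not the tri-factorization or the manifold regularization described in the abstract, it treats unobserved pairs as zeros, and the helper names, rank, and toy matrix are all assumptions.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def nmf(x, rank, iters=200, seed=0, eps=1e-9):
    """Factorise nonnegative X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy symmetric interaction matrix: two groups of three proteins each.
x = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
w, h = nmf(x, rank=2)
scores = matmul(w, h)   # scores[i][j] is the predicted strength for pair (i, j)
```

    Pairs that are zero in the input but receive a high reconstructed score are the candidate new interactions. A fuller treatment would mask unobserved entries rather than fit them as zeros, which is one of the gaps between this sketch and a real completion method.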

  2. Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

    Science.gov (United States)

    Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

    2017-05-25

    Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.

  3. The Subcellular Localization and Functional Analysis of Fibrillarin2, a Nucleolar Protein in Nicotiana benthamiana

    Directory of Open Access Journals (Sweden)

    Luping Zheng

    2016-01-01

    Nucleolar proteins play important roles in plant cytology, growth, and development. Fibrillarin2 is a nucleolar protein of Nicotiana benthamiana (N. benthamiana). Its cDNA was amplified by RT-PCR and inserted into expression vector pEarley101 labeled with yellow fluorescent protein (YFP). The fusion protein was localized in the nucleolus and Cajal body of leaf epidermal cells of N. benthamiana. The N. benthamiana fibrillarin2 (NbFib2) protein has three functional domains (i.e., glycine and arginine rich domain, RNA-binding domain, and α-helical domain) and a nuclear localization signal (NLS) in the C-terminus. The protein 3D structure analysis predicted that NbFib2 is an α/β protein. In addition, the virus induced gene silencing (VIGS) approach was used to determine the function of NbFib2. Our results showed that symptoms including growth retardation, organ deformation, chlorosis, and necrosis appeared in NbFib2-silenced N. benthamiana.

  4. Prediction of protein–protein interactions: unifying evolution and structure at protein interfaces

    International Nuclear Information System (INIS)

    Tuncbag, Nurcan; Gursoy, Attila; Keskin, Ozlem

    2011-01-01

    The vast majority of the chores in the living cell involve protein–protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein–protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations

  5. Construction of ontology augmented networks for protein complex prediction.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

    2013-01-01

    Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
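
    The abstract does not give the form of its unified distance measure, so the following is only one plausible way to blend the two signals it names: topological closeness in the interaction network and similarity of gene ontology annotations. Both are measured here with Jaccard similarity and mixed by a weight alpha; the formula, the function names, and the toy data are all assumptions, not the authors' definition.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unified_distance(p, q, neighbours, go_terms, alpha=0.5):
    """Distance in [0, 1]: small when p and q share partners and annotations.

    alpha weights network topology against annotation similarity.
    """
    # Closed neighbourhoods (each protein counted with its own partners),
    # so that two directly interacting proteins also overlap.
    topo = jaccard(neighbours[p] | {p}, neighbours[q] | {q})
    onto = jaccard(go_terms[p], go_terms[q])
    return 1.0 - (alpha * topo + (1.0 - alpha) * onto)


# Toy data with hypothetical protein names and annotation labels.
neighbours = {
    "P1": {"P2", "P3"}, "P2": {"P1", "P3"}, "P3": {"P1", "P2"},
    "P4": {"P5"}, "P5": {"P4"},
}
go_terms = {
    "P1": {"term_a", "term_b"}, "P2": {"term_a"}, "P3": {"term_a"},
    "P4": {"term_c"}, "P5": {"term_c"},
}
d_close = unified_distance("P1", "P2", neighbours, go_terms)
d_far = unified_distance("P1", "P4", neighbours, go_terms)
```

    Any distance-based clustering routine could then be run on these pairwise distances to propose complexes; the choice of alpha controls how far annotation similarity can compensate for missing or noisy interaction edges.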

  6. Roles for text mining in protein function prediction.

    Science.gov (United States)

    Verspoor, Karin M

    2014-01-01

    The Human Genome Project has provided science with a hugely valuable resource: the blueprints for life; the specification of all of the genes that make up a human. While the genes have all been identified and deciphered, it is proteins that are the workhorses of the human body: they are essential to virtually all cell functions and are the primary mechanism through which biological function is carried out. Hence in order to fully understand what happens at a molecular level in biological organisms, and eventually to enable development of treatments for diseases where some aspect of a biological system goes awry, we must understand the functions of proteins. However, experimental characterization of protein function cannot scale to the vast amount of DNA sequence data now available. Computational protein function prediction has therefore emerged as a problem at the forefront of modern biology (Radivojac et al., Nat Methods 10(13):221-227, 2013). Within the varied approaches to computational protein function prediction that have been explored, there are several that make use of biomedical literature mining. These methods take advantage of information in the published literature to associate specific proteins with specific protein functions. In this chapter, we introduce two main strategies for doing this: association of function terms, represented as Gene Ontology terms (Ashburner et al., Nat Genet 25(1):25-29, 2000), to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.

  7. The Puf family of RNA-binding proteins in plants: phylogeny, structural modeling, activity and subcellular localization

    Directory of Open Access Journals (Sweden)

    Tam Michael WC

    2010-03-01

    Background: Puf proteins have important roles in controlling gene expression at the post-transcriptional level by promoting RNA decay and repressing translation. The Pumilio homology domain (PUM-HD) is a conserved region within Puf proteins that binds to RNA with sequence specificity. Although Puf proteins have been well characterized in animal and fungal systems, little is known about the structural and functional characteristics of Puf-like proteins in plants. Results: The Arabidopsis and rice genomes code for 26 and 19 Puf-like proteins, respectively, each possessing eight or fewer Puf repeats in their PUM-HD. Key amino acids in the PUM-HD of several of these proteins are conserved with those of animal and fungal homologs, whereas other plant Puf proteins demonstrate extensive variability in these amino acids. Three-dimensional modeling revealed that the predicted structure of this domain in plant Puf proteins provides a suitable surface for binding RNA. Electrophoretic gel mobility shift experiments showed that the Arabidopsis AtPum2 PUM-HD binds with high affinity to BoxB of the Drosophila Nanos Response Element I (NRE1) RNA, whereas a point mutation in the core of the NRE1 resulted in a significant reduction in binding affinity. Transient expression of several of the Arabidopsis Puf proteins as fluorescent protein fusions revealed a dynamic, punctate cytoplasmic pattern of localization for most of these proteins. The presence of predicted nuclear export signals and accumulation of AtPuf proteins in the nucleus after treatment of cells with leptomycin B demonstrated that shuttling of these proteins between the cytosol and nucleus is common among these proteins. In addition to the cytoplasmically enriched AtPum proteins, two AtPum proteins showed nuclear targeting with enrichment in the nucleolus.
Conclusions: The Puf family of RNA-binding proteins in plants consists of a greater number of members than any other model species studied to

  8. EVA: continuous automatic evaluation of protein structure prediction servers.

    Science.gov (United States)

    Eyrich, V A; Martí-Renom, M A; Przybylski, D; Madhusudhan, M S; Fiser, A; Pazos, F; Valencia, A; Sali, A; Rost, B

    2001-12-01

    Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. http://cubic.bioc.columbia.edu/eva. eva@cubic.bioc.columbia.edu

  9. Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

    Directory of Open Access Journals (Sweden)

    Huiying Zhao

    As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.

  10. ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

    International Nuclear Information System (INIS)

    Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

    2007-01-01

    The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multiple user projects (saving the results on the server). Additionally the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors
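
    The splitting step described above can be sketched as follows: given a per-residue disorder score, sufficiently long disordered runs are treated as interdomain spacers and the ordered stretches between them are reported as candidate domains. How ProteinSplit actually computes its scores is not stated in the abstract, so the scores, thresholds, and function name here are hypothetical.

```python
def split_domains(sequence, disorder, threshold=0.5,
                  min_linker=3, min_domain=5):
    """Cut a sequence at disordered linkers.

    disorder[i] is a score for residue i (higher means more disordered).
    Returns a list of (start, end, subsequence) with end exclusive.
    """
    if len(sequence) != len(disorder):
        raise ValueError("sequence and disorder must have the same length")

    # 1. Collect runs of disordered residues long enough to act as spacers.
    linkers, run_start = [], None
    for i, score in enumerate(disorder):
        if score >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_linker:
                linkers.append((run_start, i))
            run_start = None
    if run_start is not None and len(disorder) - run_start >= min_linker:
        linkers.append((run_start, len(disorder)))

    # 2. Keep the ordered stretches between spacers that are long enough.
    domains, prev_end = [], 0
    for start, end in linkers:
        if start - prev_end >= min_domain:
            domains.append((prev_end, start, sequence[prev_end:start]))
        prev_end = end
    if len(sequence) - prev_end >= min_domain:
        domains.append((prev_end, len(sequence), sequence[prev_end:]))
    return domains


# Toy input: two ordered stretches joined by a four-residue disordered run.
seq = "A" * 8 + "G" * 4 + "L" * 6
scores = [0.1] * 8 + [0.9] * 4 + [0.2] * 6
domains = split_domains(seq, scores)
```

    Each returned subsequence could then be submitted separately to a fold-recognition method, which is the motivation the abstract gives for splitting multi-domain queries in the first place.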

  11. Biodynamics of copper oxide nanoparticles and copper ions in an oligochaete - Part II: Subcellular distribution following sediment exposure

    Energy Technology Data Exchange (ETDEWEB)

    Thit, Amalie, E-mail: athitj@ruc.dk [U.S. Geological Survey, 345 Middlefield Road, Menlo Park, CA 94025 (United States); Department of Science and Environment, Roskilde University, Universitetsvej 1, Roskilde DK-4000 (Denmark); Ramskov, Tina, E-mail: tramskov@hotmail.com [U.S. Geological Survey, 345 Middlefield Road, Menlo Park, CA 94025 (United States); Department of Science and Environment, Roskilde University, Universitetsvej 1, Roskilde DK-4000 (Denmark); Croteau, Marie-Noële, E-mail: mcroteau@usgs.gov [Department of Science and Environment, Roskilde University, Universitetsvej 1, Roskilde DK-4000 (Denmark); Selck, Henriette [U.S. Geological Survey, 345 Middlefield Road, Menlo Park, CA 94025 (United States); Department of Science and Environment, Roskilde University, Universitetsvej 1, Roskilde DK-4000 (Denmark)

    2016-11-15

    Highlights: • L. variegatus was exposed to sediment spiked with either aqueous Cu or nanoparticulate CuO. • Both aqueous and nanoparticulate Cu were marginally accumulated by L. variegatus. • Elimination of Cu accumulated from both forms was limited. • The subcellular distribution of accumulated Cu varied between Cu forms. • The use of a tracer, greater exposure concentration and duration are recommended. - Abstract: The use and likely incidental release of metal nanoparticles (NPs) is steadily increasing. Despite the increasing amount of published literature on metal NP toxicity in the aquatic environment, very little is known about the biological fate of NPs after sediment exposures. Here, we compare the bioavailability and subcellular distribution of copper oxide (CuO) NPs and aqueous Cu (Cu-Aq) in the sediment-dwelling worm Lumbriculus variegatus. Ten days (d) sediment exposure resulted in marginal Cu bioaccumulation in L. variegatus for both forms of Cu. Bioaccumulation was detected because isotopically enriched ⁶⁵Cu was used as a tracer. Neither burrowing behavior nor survival was affected by the exposure. Once incorporated into tissue, Cu loss was negligible over 10 d of elimination in clean sediment (Cu elimination rate constants were not different from zero). With the exception of day 10, differences in bioaccumulation and subcellular distribution between Cu forms were either not detectable or marginal. After 10 d of exposure to Cu-Aq, the accumulated Cu was primarily partitioned in the subcellular fraction containing metallothionein-like proteins (MTLP, ≈40%) and cellular debris (CD, ≈30%). Cu concentrations in these fractions were significantly higher than in controls. For worms exposed to CuO NPs for 10 d, most of the accumulated Cu was partitioned in the CD fraction (≈40%), which was the only subcellular fraction where the Cu concentration was significantly higher than for the control group. Our results indicate that L. variegatus

  12. Transcriptional Analysis and Subcellular Protein Localization Reveal Specific Features of the Essential WalKR System in Staphylococcus aureus.

    Directory of Open Access Journals (Sweden)

    Olivier Poupel

    The WalKR two-component system, controlling cell wall metabolism, is highly conserved among Bacilli and essential for cell viability. In Staphylococcus aureus, walR and walK are followed by three genes of unknown function: walH, walI and walJ. Sequence analysis and transcript mapping revealed a unique genetic structure for this locus in S. aureus: the last gene of the locus, walJ, is transcribed independently, whereas transcription of the tetra-cistronic walRKHI operon occurred from two independent promoters located upstream from walR. Protein topology analysis and protein-protein interactions in E. coli as well as subcellular localization in S. aureus allowed us to show that WalH and WalI are membrane-bound proteins, which associate with WalK to form a complex at the cell division septum. While these interactions suggest that WalH and WalI play a role in activity of the WalKR regulatory pathway, deletion of walH and/or walI did not have a major effect on genes whose expression is strongly dependent on WalKR or on associated phenotypes. No effect of WalH or WalI was seen on tightly controlled WalKR regulon genes such as sle1 or saouhsc_00773, which encodes a CHAP-domain amidase. Of the genes encoding the two major S. aureus autolysins, AtlA and Sle1, only transcription of atlA was increased in the ΔwalH or ΔwalI mutants. Likewise, bacterial autolysis was not increased in the absence of WalH and/or WalI and biofilm formation was lowered rather than increased. Our results suggest that contrary to their major role as WalK inhibitors in B. subtilis, the WalH and WalI proteins have evolved a different function in S. aureus, where they are more accessory. A phylogenomic analysis shows a striking conservation of the 5 gene wal cluster along the evolutionary history of Bacilli, supporting the key importance of this signal transduction system, and indicating that the walH and walI genes were lost in the ancestor of Streptococcaceae, leading to their

  13. Transcriptional Analysis and Subcellular Protein Localization Reveal Specific Features of the Essential WalKR System in Staphylococcus aureus.

    Science.gov (United States)

    Poupel, Olivier; Moyat, Mati; Groizeleau, Julie; Antunes, Luísa C S; Gribaldo, Simonetta; Msadek, Tarek; Dubrac, Sarah

    2016-01-01

    The WalKR two-component system, controlling cell wall metabolism, is highly conserved among Bacilli and essential for cell viability. In Staphylococcus aureus, walR and walK are followed by three genes of unknown function: walH, walI and walJ. Sequence analysis and transcript mapping revealed a unique genetic structure for this locus in S. aureus: the last gene of the locus, walJ, is transcribed independently, whereas transcription of the tetra-cistronic walRKHI operon occurred from two independent promoters located upstream from walR. Protein topology analysis and protein-protein interactions in E. coli as well as subcellular localization in S. aureus allowed us to show that WalH and WalI are membrane-bound proteins, which associate with WalK to form a complex at the cell division septum. While these interactions suggest that WalH and WalI play a role in activity of the WalKR regulatory pathway, deletion of walH and/or walI did not have a major effect on genes whose expression is strongly dependent on WalKR or on associated phenotypes. No effect of WalH or WalI was seen on tightly controlled WalKR regulon genes such as sle1 or saouhsc_00773, which encodes a CHAP-domain amidase. Of the genes encoding the two major S. aureus autolysins, AtlA and Sle1, only transcription of atlA was increased in the ΔwalH or ΔwalI mutants. Likewise, bacterial autolysis was not increased in the absence of WalH and/or WalI and biofilm formation was lowered rather than increased. Our results suggest that contrary to their major role as WalK inhibitors in B. subtilis, the WalH and WalI proteins have evolved a different function in S. aureus, where they are more accessory. A phylogenomic analysis shows a striking conservation of the 5 gene wal cluster along the evolutionary history of Bacilli, supporting the key importance of this signal transduction system, and indicating that the walH and walI genes were lost in the ancestor of Streptococcaceae, leading to their atypical 3 wal gene

  14. [Cloning, subcellular localization, and heterologous expression of ApNAC1 gene from Andrographis paniculata].

    Science.gov (United States)

    Wang, Jian; Qi, Meng-Die; Guo, Juan; Shen, Ye; Lin, Hui-Xin; Huang, Lu-Qi

    2017-03-01

    Andrographis paniculata has long been widely used as a medicinal herb in China, and andrographolide is its main medicinal constituent. To investigate the underlying andrographolide biosynthesis mechanisms, RNA-seq was performed on A. paniculata leaves treated with MeJA. In the A. paniculata transcriptomic data, the expression pattern of one member of the NAC transcription factor family (ApNAC1) matched andrographolide accumulation. The coding sequence of ApNAC1 was cloned by RT-PCR, and its GenBank accession number is KY196416. Bioinformatic analysis showed that the gene encodes a peptide of 323 amino acids, with a predicted relative molecular weight of 35.9 kDa and an isoelectric point of 6.14. To confirm the subcellular localization, ApNAC1-GFP was transiently expressed in A. paniculata protoplasts. The results indicated that ApNAC1 is a nucleus-localized protein. Real-time quantitative PCR revealed that the ApNAC1 gene is predominantly expressed in leaves. Compared with the control sample, its expression increased sharply with methyl jasmonate treatment. Based on its expression pattern, the ApNAC1 gene might be involved in andrographolide biosynthesis. ApNAC1 was heterologously expressed in Escherichia coli, and the recombinant protein was purified with Ni-NTA agarose. Further study will help us to understand the function of ApNAC1 in andrographolide biosynthesis. Copyright© by the Chinese Pharmaceutical Association.

  15. Nucleolar localization of cirhin, the protein mutated in North American Indian childhood cirrhosis

    International Nuclear Information System (INIS)

    Yu, Bin; Mitchell, Grant A.; Richter, Andrea

    2005-01-01

    Cirhin (NP_116219), the product of the CIRH1A gene, is mutated in North American Indian childhood cirrhosis (NAIC/CIRH1A, OMIM 604901), a severe autosomal recessive intrahepatic cholestasis. It is a 686-amino-acid WD40-repeat-containing protein of unknown function that is predicted to contain multiple targeting signals, including an N-terminal mitochondrial targeting signal, a C-terminal monopartite nuclear localization signal (NLS) and a bipartite nuclear localization signal (BNLS). We directly determined the subcellular localization of cirhin as a crucial first step in unraveling its biological function. Using EGFP and His-tagged cirhin fusion proteins expressed in HeLa and HepG2 cells, we show that cirhin is a nucleolar protein and that the R565W mutation, for which all NAIC patients are homozygous, has no effect on subcellular localization. Cirhin has an active C-terminal monopartite nuclear localization signal (NLS) and a unique nucleolar localization signal (NrLS) between residues 315 and 432. The nucleolus is not known to be important specifically for intrahepatic cholestasis. These observations provide a new dimension in the study of hereditary cholestasis.

  16. PSPP: a protein structure prediction pipeline for computing clusters.

    Directory of Open Access Journals (Sweden)

    Michael S Lee

    2009-07-01

    Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform

  17. Comparison of expressed human and mouse sodium/iodide symporters reveals differences in transport properties and subcellular localization

    Energy Technology Data Exchange (ETDEWEB)

    Dayem, M.; Basquin, C.; Navarro, V.; Carrier, P.; Marsault, R.; Lindenthal, S.; Pourcher, T. [Univ Nice Sophia Antipolis, Sch Med, CEA, DSV, iBEB, SBTN, TIRO, F-06107 Nice (France)]; Chang, P. [CNRS, UPMC Biol Dev, UMR 7009, F-06230 Villefranche Sur Mer (France)]; Huc, S.; Darrouzet, E. [CEA Valrho, DSV, iBEB, SBTN, F-30207 Bagnols Sur Ceze (France)]

    2008-07-01

    The active transport of iodide from the blood stream into thyroid follicular cells is mediated by the Na⁺/I⁻ symporter (NIS). We studied mouse NIS (mNIS) and found that it catalyzes iodide transport into transfected cells more efficiently than human NIS (hNIS). To further characterize this difference, we compared ¹²⁵I uptake in transiently transfected human embryonic kidney (HEK) 293 cells. We found that the Vmax for mNIS was four times higher than that for hNIS, and that the iodide transport constant (Km) was 2-5-fold lower for hNIS than mNIS. We also performed immunocytolocalization studies and observed that the subcellular distribution of the two orthologs differed. While the mouse protein was predominantly found at the plasma membrane, its human ortholog was intracellular in ~40% of the expressing cells. Using cell surface protein-labeling assays, we found that the plasma membrane localization frequency of the mouse protein was only 2-5-fold higher than that of the human protein, and therefore cannot alone account for the difference in the obtained Vmax values. We reasoned that the observed difference could also be caused by a higher turnover number for iodide transport in the mouse protein. We then expressed and analyzed chimeric proteins. The data obtained with these constructs suggest that the iodide recognition site could be located in the region extending from the N-terminus to transmembrane domain 8, and that the region between transmembrane domain 5 and the C-terminus could play a role in the subcellular localization of the protein. (authors)
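
    The Vmax and Km values discussed in this abstract are the two parameters of Michaelis-Menten-type saturation kinetics. As a minimal sketch, assuming iodide uptake follows the standard form v = Vmax·[S]/(Km + [S]) (the abstract does not state which fitting model was used), the Python function below shows how the two parameters shape the uptake rate. The function name and all numbers in the usage lines are hypothetical and are not data from the study.

```python
def uptake_rate(substrate, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S]).

    substrate, vmax and km are in arbitrary but consistent units.
    """
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


# Hypothetical illustration: two transporters with the same Km but a
# four-fold difference in Vmax, evaluated at the same substrate level.
low_vmax = uptake_rate(substrate=10.0, vmax=1.0, km=10.0)
high_vmax = uptake_rate(substrate=10.0, vmax=4.0, km=10.0)
```

    With Km held fixed, the rate scales linearly with Vmax at every substrate concentration, which is why a Vmax difference has to be explained by the number of active transporters at the membrane, by their turnover number, or by both.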

  18. Comparison of expressed human and mouse sodium/iodide symporters reveals differences in transport properties and subcellular localization

    International Nuclear Information System (INIS)

    Dayem, M.; Basquin, C.; Navarro, V.; Carrier, P.; Marsault, R.; Lindenthal, S.; Pourcher, T.; Chang, P.; Huc, S.; Darrouzet, E.

    2008-01-01

    The active transport of iodide from the blood stream into thyroid follicular cells is mediated by the Na⁺/I⁻ symporter (NIS). We studied mouse NIS (mNIS) and found that it catalyzes iodide transport into transfected cells more efficiently than human NIS (hNIS). To further characterize this difference, we compared ¹²⁵I uptake in transiently transfected human embryonic kidney (HEK) 293 cells. We found that the Vmax for mNIS was four times higher than that for hNIS, and that the iodide transport constant (Km) was 2-5-fold lower for hNIS than mNIS. We also performed immunocytolocalization studies and observed that the subcellular distribution of the two orthologs differed. While the mouse protein was predominantly found at the plasma membrane, its human ortholog was intracellular in ~40% of the expressing cells. Using cell surface protein-labeling assays, we found that the plasma membrane localization frequency of the mouse protein was only 2-5-fold higher than that of the human protein, and therefore cannot alone account for the difference in the obtained Vmax values. We reasoned that the observed difference could also be caused by a higher turnover number for iodide transport in the mouse protein. We then expressed and analyzed chimeric proteins. The data obtained with these constructs suggest that the iodide recognition site could be located in the region extending from the N-terminus to transmembrane domain 8, and that the region between transmembrane domain 5 and the C-terminus could play a role in the subcellular localization of the protein. (authors)

  19. Blind Test of Physics-Based Prediction of Protein Structures

    Science.gov (United States)

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  20. CNNcon: improved protein contact maps prediction using cascaded neural networks.

    Directory of Open Access Journals (Sweden)

    Wang Ding

    BACKGROUNDS: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determination of three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than that of determined structures. Protein modeling methods can possibly bridge this huge sequence-structure gap with the development of computational science. A grand challenge is to predict three-dimensional protein structure from its primary structure (residue sequence) alone. However, predicting residue contact maps is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment into structure alignment, which can ultimately improve template-based three-dimensional protein structure predictors greatly. METHODS: CNNcon, an improved contact map predictor based on multiple neural networks, using six sub-networks and one final cascade-network, was developed in this paper. Both the sub-networks and the final cascade-network were trained and tested with their corresponding data sets. For testing, the target protein was first coded and then input to its corresponding sub-networks for prediction. After that, the intermediate results were input to the cascade-network to finish the final prediction. RESULTS: CNNcon can accurately predict 58.86% of contacts on average at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the compared state-of-the-art predictors. In particular, the prediction accuracy remains steady as protein sequence length increases. This indicates that CNNcon overcomes the thin density problem, with which other current predictors have trouble. This advantage makes the method valuable for the prediction of long proteins. As a result, the effective
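
    The notion of a residue contact at an 8 Å cutoff can be made concrete with a short Python sketch. It assumes one representative atom per residue (for example Cα or Cβ; the abstract does not say which atom CNNcon uses) and scores predictions as the fraction of predicted pairs that are true contacts, which is one common reading of contact-prediction accuracy and not necessarily the paper's exact measure. The function names and the toy coordinates are hypothetical.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Symmetric boolean contact map from per-residue 3D coordinates.

    Residues i and j are in contact when their distance is <= cutoff (in
    angstroms) and they are at least min_separation apart in sequence.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts[i][j] = contacts[j][i] = True
    return contacts


def contact_accuracy(predicted_pairs, native_map):
    """Fraction of predicted (i, j) pairs that are contacts in native_map."""
    if not predicted_pairs:
        return 0.0
    hits = sum(1 for i, j in predicted_pairs if native_map[i][j])
    return hits / len(predicted_pairs)


# Toy chain of four residues placed along the x axis (hypothetical data).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
toy_accuracy = contact_accuracy([(0, 1), (0, 3)], toy_map)
```

    Raising min_separation excludes trivially close sequence neighbours, which is how local and non-local contacts are usually separated when predictors are compared.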

  1. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking

    DEFF Research Database (Denmark)

    Feliu, Elisenda; Aloy, Patrick; Oliva, Baldo

    2011-01-01

    Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-art benchmark dataset and on a decoy dataset of permanent interactions. Our... and with independence of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.

  2. Nuclear functions and subcellular trafficking mechanisms of the epidermal growth factor receptor family

    Science.gov (United States)

    2012-01-01

    Accumulating evidence suggests that various diseases, including many types of cancer, result from alteration of subcellular protein localization and compartmentalization. Therefore, it is worthwhile to expand our knowledge in subcellular trafficking of proteins, such as epidermal growth factor receptor (EGFR) and ErbB-2 of the receptor tyrosine kinases, which are highly expressed and activated in human malignancies and frequently correlated with poor prognosis. The well-characterized trafficking of cell surface EGFR is routed, via endocytosis and endosomal sorting, to either the lysosomes for degradation or back to the plasma membrane for recycling. A novel nuclear mode of EGFR signaling pathway has been gradually deciphered in which EGFR is shuttled from the cell surface to the nucleus after endocytosis, and there, it acts as a transcriptional regulator, transmits signals, and is involved in multiple biological functions, including cell proliferation, tumor progression, DNA repair and replication, and chemo- and radio-resistance. Internalized EGFR can also be transported from the cell surface to several intracellular compartments, such as the Golgi apparatus, the endoplasmic reticulum, and the mitochondria, in addition to the nucleus. In this review, we will summarize the functions of nuclear EGFR family and the potential pathways by which EGFR is trafficked from the cell surface to a variety of cellular organelles. A better understanding of the molecular mechanism of EGFR trafficking will shed light on both the receptor biology and potential therapeutic targets of anti-EGFR therapies for clinical application. PMID:22520625

  3. Protein function prediction involved on radio-resistant bacteria

    International Nuclear Information System (INIS)

    Mezhoud, Karim; Mankai, Houda; Sghaier, Haitham; Barkallah, Insaf

    2009-01-01

    Previously, we identified 58 proteins under positive selection in ionizing-radiation-resistant bacteria (IRRB) but absent in all ionizing-radiation-sensitive bacteria (IRSB). There are good reasons to believe that these 58 proteins, together with their interactions with other proteins (interactomes), are part of the answer to the question of how IRRB resist radiation, because knowledge of the interactomes of positively selected orphan proteins in IRRB might allow us to define cellular pathways important to ionizing-radiation resistance. Using the Database of Interacting Proteins and the PSIbase, we predicted interactions of orthologs of the 58 proteins under positive selection in IRRB but absent in all IRSB. We integrated experimental data sets with molecular interaction networks and protein structure predictions from databases. Among these, 18 proteins with their interactomes were identified in Deinococcus radiodurans R1. DNA checkpoint and repair, kinase pathways, and energetic and nucleotide metabolism were the important biological processes found. We predicted the interactomes of 58 proteins under positive selection in IRRB. It is hoped that our data will provide new clues as to the cellular pathways that are important for ionizing-radiation resistance. We have identified new proteins involved in DNA management that were not previously mentioned. This is an important addition to the proteins already studied. Further work remains to deepen our study of these new proteins.

  4. Sub-cellular force microscopy in single normal and cancer cells

    Energy Technology Data Exchange (ETDEWEB)

    Babahosseini, H. [VT MEMS Laboratory, The Bradley Department of Electrical and Computer Engineering, Blacksburg, VA 24061 (United States); Carmichael, B. [Nonlinear Intelligent Structures Laboratory, Department of Mechanical Engineering, University of Alabama, Tuscaloosa, AL 35487-0276 (United States); Strobl, J.S. [VT MEMS Laboratory, The Bradley Department of Electrical and Computer Engineering, Blacksburg, VA 24061 (United States); Mahmoodi, S.N., E-mail: nmahmoodi@eng.ua.edu [Nonlinear Intelligent Structures Laboratory, Department of Mechanical Engineering, University of Alabama, Tuscaloosa, AL 35487-0276 (United States); Agah, M., E-mail: agah@vt.edu [VT MEMS Laboratory, The Bradley Department of Electrical and Computer Engineering, Blacksburg, VA 24061 (United States)

    2015-08-07

    This work investigates the biomechanical properties of sub-cellular structures of breast cells using atomic force microscopy (AFM). The cells are modeled as a triple-layered structure where the Generalized Maxwell model is applied to experimental data from AFM stress-relaxation tests to extract the elastic modulus, the apparent viscosity, and the relaxation time of sub-cellular structures. The triple-layered modeling results allow for determination and comparison of the biomechanical properties of the three major sub-cellular structures between normal and cancerous cells: the up plasma membrane/actin cortex, the mid cytoplasm/nucleus, and the low nuclear/integrin sub-domains. The results reveal that the sub-domains become stiffer and significantly more viscous with depth, regardless of cell type. In addition, there is a decreasing trend in the average elastic modulus and apparent viscosity of all the corresponding sub-cellular structures from normal to cancerous cells, which becomes most remarkable in the deeper sub-domain. The modeling presented in this work constitutes a unique AFM-based experimental framework to study the biomechanics of sub-cellular structures. - Highlights: • The cells are modeled as a triple-layered structure using Generalized Maxwell model. • The sub-domains include membrane/cortex, cytoplasm/nucleus, and nuclear/integrin. • Biomechanics of corresponding sub-domains are compared among normal and cancer cells. • Viscoelasticity of sub-domains shows a decreasing trend from normal to cancer cells. • The decreasing trend becomes most significant in the deeper sub-domain.
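
    The Generalized Maxwell model named in this abstract describes stress relaxation as one equilibrium spring in parallel with several spring-dashpot arms. As a minimal sketch, assuming the textbook form E(t) = E_inf + Σ E_i·exp(−t/τ_i) with dashpot viscosity η_i = E_i·τ_i (the abstract does not give the exact parameterization or how "apparent viscosity" is defined), the relaxation modulus and the arm viscosities can be computed as below. The function names and all parameter values are hypothetical and are not taken from the study.

```python
import math


def relaxation_modulus(t, e_inf, arms):
    """Generalized Maxwell relaxation modulus at time t.

    e_inf is the long-time (equilibrium) modulus; arms is a list of
    (e_i, tau_i) pairs, one spring-dashpot Maxwell arm each.
    """
    return e_inf + sum(e_i * math.exp(-t / tau_i) for e_i, tau_i in arms)


def arm_viscosities(arms):
    """Dashpot viscosity of each Maxwell arm: eta_i = e_i * tau_i."""
    return [e_i * tau_i for e_i, tau_i in arms]


# Hypothetical two-arm model (moduli in kPa, relaxation times in s).
example_arms = [(2.0, 0.5), (1.0, 5.0)]
instantaneous = relaxation_modulus(0.0, 1.0, example_arms)
viscosities = arm_viscosities(example_arms)
```

    Fitting these parameters to a measured relaxation curve, separately for each indentation depth range, is one way such a layered analysis could be organized.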

  5. Sub-cellular force microscopy in single normal and cancer cells

    International Nuclear Information System (INIS)

    Babahosseini, H.; Carmichael, B.; Strobl, J.S.; Mahmoodi, S.N.; Agah, M.

    2015-01-01

    This work investigates the biomechanical properties of sub-cellular structures of breast cells using atomic force microscopy (AFM). The cells are modeled as a triple-layered structure where the Generalized Maxwell model is applied to experimental data from AFM stress-relaxation tests to extract the elastic modulus, the apparent viscosity, and the relaxation time of sub-cellular structures. The triple-layered modeling results allow for determination and comparison of the biomechanical properties of the three major sub-cellular structures between normal and cancerous cells: the up plasma membrane/actin cortex, the mid cytoplasm/nucleus, and the low nuclear/integrin sub-domains. The results reveal that the sub-domains become stiffer and significantly more viscous with depth, regardless of cell type. In addition, there is a decreasing trend in the average elastic modulus and apparent viscosity of all the corresponding sub-cellular structures from normal to cancerous cells, which becomes most remarkable in the deeper sub-domain. The modeling presented in this work constitutes a unique AFM-based experimental framework to study the biomechanics of sub-cellular structures. - Highlights: • The cells are modeled as a triple-layered structure using Generalized Maxwell model. • The sub-domains include membrane/cortex, cytoplasm/nucleus, and nuclear/integrin. • Biomechanics of corresponding sub-domains are compared among normal and cancer cells. • Viscoelasticity of sub-domains shows a decreasing trend from normal to cancer cells. • The decreasing trend becomes most significant in the deeper sub-domain.

  6. Endoplasmic Reticulum Export, Subcellular Distribution, and Fibril Formation by Pmel17 Require an Intact N-terminal Domain Junction

    Science.gov (United States)

    Leonhardt, Ralf M.; Vigneron, Nathalie; Rahner, Christoph; Van den Eynde, Benoît J.; Cresswell, Peter

    2010-01-01

    Pmel17 is a melanocyte/melanoma-specific protein that subcellularly localizes to melanosomes, where it forms a fibrillar matrix that serves for the sequestration of potentially toxic reaction intermediates of melanin synthesis and deposition of the pigment. As a key factor in melanosomal biogenesis, understanding intracellular trafficking and processing of Pmel17 is of central importance to comprehend how these organelles are formed, how they mature, and how they function in the cell. Using a series of deletion and missense mutants of Pmel17, we are able to show that the integrity of the junction between the N-terminal region and the polycystic kidney disease-like domain is highly crucial for endoplasmic reticulum export, subcellular targeting, and fibril formation by Pmel17 and thus for establishing functional melanosomes. PMID:20231267

  7. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications.

    Science.gov (United States)

    Kobayashi, Hiroki; Harada, Hiroko; Nakamura, Masaomi; Futamura, Yushi; Ito, Akihiro; Yoshida, Minoru; Iemura, Shun-Ichiro; Shin-Ya, Kazuo; Doi, Takayuki; Takahashi, Takashi; Natsume, Tohru; Imoto, Masaya; Sakakibara, Yasubumi

    2012-04-05

    Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  8. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications

    Directory of Open Access Journals (Sweden)

    Kobayashi Hiroki

    2012-04-01

    Background: Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. Results: We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. Conclusions: This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  9. Efficient prediction of human protein-protein interactions at a global scale.

    Science.gov (United States)

    Schoenrock, Andrew; Samanfar, Bahram; Pitre, Sylvain; Hooshyar, Mohsen; Jin, Ke; Phillips, Charles A; Wang, Hui; Phanse, Sadhna; Omidi, Katayoun; Gui, Yuan; Alamgir, Md; Wong, Alex; Barrenäs, Fredrik; Babu, Mohan; Benson, Mikael; Langston, Michael A; Green, James R; Dehne, Frank; Golshani, Ashkan

    2014-12-10

    Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments. The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.

  10. Interaction of HSP20 with a viral RdRp changes its sub-cellular localization and distribution pattern in plants.

    Science.gov (United States)

    Li, Jing; Xiang, Cong-Ying; Yang, Jian; Chen, Jian-Ping; Zhang, Heng-Mu

    2015-09-11

    Small heat shock proteins (sHSPs) perform a fundamental role in protecting cells against a wide array of stresses but their biological function during viral infection remains unknown. Rice stripe virus (RSV) causes a severe disease of rice in Eastern Asia. OsHSP20 and its homologue (NbHSP20) were used as baits in yeast two-hybrid (YTH) assays to screen an RSV cDNA library and were found to interact with the viral RNA-dependent RNA polymerase (RdRp) of RSV. Interactions were confirmed by pull-down and BiFC assays. Further analysis showed that the N-terminus (residues 1-296) of the RdRp was crucial for the interaction between the HSP20s and viral RdRp and responsible for the alteration of the sub-cellular localization and distribution pattern of HSP20s in protoplasts of rice and epidermal cells of Nicotiana benthamiana. This is the first report that a plant virus or a viral protein alters the expression pattern or sub-cellular distribution of sHSPs.

  11. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    Science.gov (United States)

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10 fold-cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45. In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in
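
    The five performance measures reported in this abstract (sensitivity, specificity, PPV, NPV and MCC) all derive from the four cells of a binary confusion matrix. The Python sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Ratios with a zero denominator are returned as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # recall on binding nucleotides
        "specificity": ratio(tn, tn + fp),   # recall on non-binding nucleotides
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for illustration only.
example_metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

    MCC is the most informative single number here when binding and non-binding nucleotides are imbalanced, because it uses all four cells of the matrix.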

  12. Subcellular analysis by laser ablation electrospray ionization mass spectrometry

    Science.gov (United States)

    Vertes, Akos; Stolee, Jessica A; Shrestha, Bindesh

    2014-12-02

    In various embodiments, a method of laser ablation electrospray ionization mass spectrometry (LAESI-MS) may generally comprise micro-dissecting a cell comprising at least one of a cell wall and a cell membrane to expose at least one subcellular component therein, ablating the at least one subcellular component by an infrared laser pulse to form an ablation plume, intercepting the ablation plume by an electrospray plume to form ions, and detecting the ions by mass spectrometry.

  13. Single-cell analysis of pyroptosis dynamics reveals conserved GSDMD-mediated subcellular events that precede plasma membrane rupture.

    Science.gov (United States)

    de Vasconcelos, Nathalia M; Van Opdenbosch, Nina; Van Gorp, Hanne; Parthoens, Eef; Lamkanfi, Mohamed

    2018-04-17

    Pyroptosis is rapidly emerging as a mechanism of anti-microbial host defense, and of extracellular release of the inflammasome-dependent cytokines interleukin (IL)-1β and IL-18, which contributes to autoinflammatory pathology. Caspases 1, 4, 5 and 11 trigger this regulated form of necrosis by cleaving the pyroptosis effector gasdermin D (GSDMD), causing its pore-forming amino-terminal domain to oligomerize and perforate the plasma membrane. However, the subcellular events that precede pyroptotic cell lysis are ill defined. In this study, we triggered primary macrophages to undergo pyroptosis from three inflammasome types and recorded their dynamics and morphology using high-resolution live-cell spinning disk confocal laser microscopy. Based on quantitative analysis of single-cell subcellular events, we propose a model of pyroptotic cell disintegration that is initiated by opening of GSDMD-dependent ion channels or pores that are more restrictive than recently proposed GSDMD pores, followed by osmotic cell swelling, commitment of mitochondria and other membrane-bound organelles prior to sudden rupture of the plasma membrane and full permeability to intracellular proteins. This study provides a dynamic framework for understanding cellular changes that occur during pyroptosis, and charts a chronological sequence of GSDMD-mediated subcellular events that define pyroptotic cell death at the single-cell level.

  14. Subcellular partitioning of cadmium in the freshwater bivalve, Pyganodon grandis, after separate short-term exposures to waterborne or diet-borne metal

    Energy Technology Data Exchange (ETDEWEB)

    Cooper, Sophie; Hare, Landis [INRS-Eau, Terre et Environnement, Universite du Quebec, 490 rue de la Couronne, Quebec, QC, G1K 9A9 (Canada); Campbell, Peter G.C., E-mail: peter.campbell@ete.inrs.ca [INRS-Eau, Terre et Environnement, Universite du Quebec, 490 rue de la Couronne, Quebec, QC, G1K 9A9 (Canada)

    2010-11-15

    The dynamics of cadmium uptake and subcellular partitioning were studied in laboratory experiments conducted on Pyganodon grandis, a freshwater unionid bivalve that shows promise as a biomonitor for metal pollution. Bivalves were collected from an uncontaminated lake, allowed to acclimate to laboratory conditions (≥25 days), and then either exposed to a low, environmentally relevant, concentration of dissolved Cd (5 nM; 6, 12 and 24 h), or fed Cd-contaminated algae (≈70 nmol Cd g⁻¹ dry weight; 4 × 4 h). In this latter case, the bivalves were allowed to depurate for up to 8 days after the end of the feeding phase. As anticipated, the gills were the main target organ during the aqueous Cd exposure whereas the intestine was the initial site of Cd accumulation during the dietary exposure; during the subsequent depuration period, the dietary Cd accumulated in both the digestive gland and in the gills. For the gills, the distribution of Cd among the subcellular fractions (i.e., granules > heat-denatured proteins (HDP) ≈ heat-stable proteins (HSP) > mitochondria ≈ lysosomes + microsomes) was insensitive to the exposure route; both waterborne and diet-borne Cd ended up largely bound to the granule fraction. The subcellular distribution of Cd in the digestive gland differed markedly from that in the gills (HDP > HSP ≈ granules ≈ mitochondria > lysosomes + microsomes), but as in the case of the gills, this distribution was relatively insensitive to the exposure route. For both the gills and the digestive gland, the subcellular distributions of Cd differed from those observed in native bivalves that are chronically exposed to Cd in the field - in the short-term experimental exposures of P. grandis, metal detoxification was less effective than in chronically exposed native bivalves.

  15. Improving protein function prediction methods with integrated literature data

    Directory of Open Access Journals (Sweden)

    Gabow Aaron P

    2008-04-01

    Background: Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. Results: We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably-sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes which global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10% which yield the best trade-off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder

  16. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
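
    The F-measures quoted in this abstract can be reproduced from the stated precision and recall, assuming the authors mean the standard balanced F-measure (the harmonic mean of the two). The function name below is illustrative only and is not taken from the paper; this is a minimal sketch rather than the authors' evaluation code.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, written as fractions.
sequence_only = f_measure(0.84, 0.26)
with_structure = f_measure(0.76, 0.38)

print(f"sequence only:  {sequence_only:.2f}")   # about 0.40
print(f"with structure: {with_structure:.2f}")  # about 0.51
```

    Under that assumption, adding structural information trades some precision for recall, and the harmonic mean rewards the more balanced pair.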

  17. Kandelia obovata (S., L.) Yong tolerance mechanisms to Cadmium: Subcellular distribution, chemical forms and thiol pools

    International Nuclear Information System (INIS)

    Weng Bosen; Xie Xiangyu; Weiss, Dominik J.; Liu Jingchun; Lu Haoliang; Yan Chongling

    2012-01-01

    Highlights: ► Cadmium tolerance mechanisms of Kandelia obovata were investigated systematically. ► Thiol pool can play roles in cadmium detoxification mechanisms. ► Increasing cadmium treatment strength caused proportional increase of cadmium uptake. ► More than half of cadmium was localized in cell walls, and lowest in membranes. ► Sodium chloride and acetic acid extractable fractions were dominant. - Abstract: In order to explore the detoxification mechanisms adopted by mangrove under cadmium (Cd) stress, we investigated the subcellular distribution and chemical forms of Cd, in addition to the change of the thiol pools in Kandelia obovata (S., L.) Yong, which was cultivated in sandy culture medium treated with sequential Cd solution. We found that Cd addition caused a proportional increase of Cd in the organs of K. obovata. The investigation of subcellular distribution verified that most of the Cd was localized in the cell wall, and the lowest amount was in the membrane. Results showed that sodium chloride and acetic acid extractable Cd fractions were dominant. The contents of non-protein thiol compounds, glutathione and phytochelatins in K. obovata were enhanced by the increasing strength of Cd treatment. Therefore, K. obovata can be defined as a Cd-tolerant plant whose tolerance is based on cell wall compartmentalization, as well as on combination with proteins and organic acids.

  18. Prediction of protein loop geometries in solution

    NARCIS (Netherlands)

    Rapp, Chaya S.; Strauss, Temima; Nederveen, Aart; Fuentes, Gloria

    2007-01-01

    The ability to determine the structure of a protein in solution is a critical tool for structural biology, as proteins in their native state are found in aqueous environments. Using a physical chemistry based prediction protocol, we demonstrate the ability to reproduce protein loop geometries in

  19. A workflow for mathematical modeling of subcellular metabolic pathways in leaf metabolism of Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Thomas Nägele

    2013-12-01

    During the last decade genome sequencing has experienced a rapid technological development resulting in numerous sequencing projects and applications in life science. In plant molecular biology, the availability of sequence data on whole genomes has enabled the reconstruction of metabolic networks. Enzymatic reactions are predicted by the sequence information. Pathways arise due to the participation of chemical compounds as substrates and products in these reactions. Although several of these comprehensive networks have been reconstructed for the genetic model plant Arabidopsis thaliana, the integration of experimental data is still challenging. In particular, the analysis of the subcellular organization of plant cells limits the understanding of regulatory instances in these metabolic networks in vivo. In this study, we develop an approach for the functional integration of experimental high-throughput data into such large-scale networks. We present a subcellular metabolic network model comprising 524 metabolic intermediates and 548 metabolic interactions derived from a total of 2769 reactions. We demonstrate how to link the metabolite covariance matrix of different Arabidopsis thaliana accessions with the subcellular metabolic network model for the inverse calculation of the biochemical Jacobian, finally resulting in the calculation of a matrix which satisfies a Lyapunov equation involving a covariance matrix. In this way, differential strategies of metabolite compartmentation and involved reactions were identified in the accessions when exposed to low temperature.
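
    For orientation, a continuous-time Lyapunov equation that ties a Jacobian to a covariance matrix is commonly written in the form below. The abstract does not state the equation, so the symbols are assumptions chosen for illustration: J stands for the biochemical Jacobian, C for the metabolite covariance matrix, and D for a matrix describing the magnitude of fluctuations.

```latex
J\,C + C\,J^{\mathsf{T}} = -2D
```

    Read this way, the inverse problem mentioned in the abstract amounts to finding a J that is consistent with a measured C, given some assumption about D.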

  20. An overview of the prediction of protein DNA-binding sites.

    Science.gov (United States)

    Si, Jingna; Zhao, Rui; Wu, Rongling

    2015-03-06

    Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  1. Mapping monomeric threading to protein-protein structure prediction.

    Science.gov (United States)

    Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

    2013-03-25

    The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.

  2. Subcellular partitioning profiles and metallothionein levels in indigenous clams Moerella iridescens from a metal-impacted coastal bay

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Zaosheng, E-mail: zswang@iue.ac.cn [Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, 1799 Jimei Boulevard, Xiamen 361021 (China); State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012 (China); Feng, Chenglian; Ye, Chun [State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012 (China); Wang, Youshao [State Key Laboratory of Tropical Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301 (China); Yan, Changzhou, E-mail: czyan@iue.ac.cn [Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, 1799 Jimei Boulevard, Xiamen 361021 (China); Li, Rui; Yan, Yijun; Chi, Qiaoqiao [Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, 1799 Jimei Boulevard, Xiamen 361021 (China)

    2016-07-15

    Highlights: • Subcellular partitioning profiles of metals were investigated in a biomonitor organism. • Cu, Zn and Cd levels in main fraction of HSP increase along accumulation gradients. • Despite MTs as the major binding pool, detoxification of Cd and Pb was incomplete. • Induced MTs were sequentially correlated with Cu, Zn and Cd levels in HSP fraction. • Intracellular metal fates highlighted the metabolic availability within organism. - Abstract: In this study, the effect of environmental metal exposure on the accumulation and subcellular distribution of metals in the digestive gland of clams with special emphasis on metallothioneins (MTs) was investigated. Specimens of indigenous Moerella iridescens were collected from different natural habitats in Maluan Bay (China), characterized by varying levels of metal contamination. The digestive glands were excised and homogenized, and six subcellular fractions were separated by differential centrifugation procedures and analyzed for their Cu, Zn, Cd and Pb contents. MTs were quantified independently by spectrophotometric measurements of thiols. Site-specific differences were observed in total metal concentrations in the tissues, correlating well with variable environmental metal concentrations and reflecting the gradient trends in metal contamination. Concentrations of the non-essential Cd and Pb were more responsive to environmental exposure gradients than were tissue concentrations of the essential metals, Cu and Zn. Subcellular partitioning profiles for Cu, Zn and Cd were relatively similar, with the heat-stable protein (HSP) fraction as the dominant metal-binding compartment, whereas for Pb this fraction was much less important. The variations in proportions and concentrations of metals in this fraction along with the metal bioaccumulation gradients suggested that the induced MTs play an important role in metal homeostasis and detoxification for M. iridescens in the metal-contaminated bay. Nevertheless

  3. C-reactive protein, fibrinogen, and cardiovascular disease prediction

    DEFF Research Database (Denmark)

    Kaptoge, Stephen; Di Angelantonio, Emanuele; Pennells, Lisa

    2012-01-01

    There is debate about the value of assessing levels of C-reactive protein (CRP) and other biomarkers of inflammation for the prediction of first cardiovascular events....

  4. From nonspecific DNA-protein encounter complexes to the prediction of DNA-protein interactions.

    Directory of Open Access Journals (Sweden)

    Mu Gao

    2009-03-01

    DNA-protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA-protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA-protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA-protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA-protein interaction modes exhibit some similarity to specific DNA-protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Cα deviation from their native structures is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA-protein interaction mode and nonspecific interaction modes may reflect an important sampling step in search of its specific DNA targets by a DNA-binding protein.

  5. Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    Science.gov (United States)

    Zhou, Hufeng; Gao, Shangzhi; Nguyen, Nam Ninh; Fan, Mengyuan; Jin, Jingjing; Liu, Bing; Zhao, Liang; Xiong, Geng; Tan, Min; Li, Shijun; Wong, Limsoon

    2014-04-08

    H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs. We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequence, tend to have more domains, tend to be more hydrophilic, etc. And the protein domains from both

  6. PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs

    Directory of Open Access Journals (Sweden)

    Greenblatt Jack

    2006-07-01

    Background: Identification of protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions are all time and labour intensive. Consequently there is a growing need for the development of computational tools that are capable of effectively identifying such interactions. Results: Here we explain the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target pair of the yeast Saccharomyces cerevisiae proteins from their primary structure and without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction with 89% specificity and an overall accuracy of 75%. This rate of success is comparable to those associated with the most commonly used biochemical techniques. Using PIPE, we identified a novel interaction between YGL227W (vid30) and YMR135C (gid8) yeast proteins. This led us to the identification of a novel yeast complex that here we term vid30 complex (vid30c). The observed interaction was confirmed by tandem affinity purification (TAP tag), verifying the ability of PIPE to predict novel protein-protein interactions. We then used PIPE analysis to investigate the internal architecture of vid30c. It appeared from PIPE analysis that vid30c may consist of a core and a secondary component. Generation of yeast gene deletion strains combined with TAP tagging analysis indicated that the deletion of a member of the core component interfered with the formation of vid30c, whereas deletion of a member of the secondary component had little effect (if any) on the formation of vid30c. Also, PIPE can be used to analyse yeast proteins for which TAP tagging fails, thereby allowing us to predict protein interactions that are not
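
    The sensitivity, specificity, and overall accuracy quoted for PIPE are mutually consistent if the test set held equal numbers of interacting and non-interacting pairs. That balance is an assumption made here for illustration; the abstract does not give the set sizes, and the function name and counts below are hypothetical.

```python
def overall_accuracy(sensitivity: float, specificity: float,
                     n_pos: int, n_neg: int) -> float:
    """Fraction of all pairs classified correctly.

    sensitivity: share of truly interacting pairs that are detected.
    specificity: share of non-interacting pairs correctly rejected.
    n_pos, n_neg: hypothetical counts of interacting / non-interacting pairs.
    """
    true_pos = sensitivity * n_pos
    true_neg = specificity * n_neg
    return (true_pos + true_neg) / (n_pos + n_neg)


# Assumed balanced test set: 100 interacting and 100 non-interacting pairs.
acc = overall_accuracy(0.61, 0.89, n_pos=100, n_neg=100)
print(f"overall accuracy: {acc:.2f}")
```

    With an unbalanced set the same sensitivity and specificity would give a different overall accuracy, which is why the class ratio matters when comparing such figures across tools.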

  7. Quantifying the Sub-Cellular Distributions of Gold Nanospheres Uptaken by Cells through Stepwise, Site-Selective Etching.

    Science.gov (United States)

    Xia, Younan; Huo, Da

    2018-04-10

    A quantitative understanding of the sub-cellular distributions of nanoparticles uptaken by cells is important to the development of nanomedicine. With Au nanospheres as a model system, here we demonstrate, for the first time, how to quantify the numbers of nanoparticles bound to plasma membrane, accumulated in cytosol, and entrapped in lysosomes, respectively, through stepwise, site-selective etching. Our results indicate that the chance for nanoparticles to escape from lysosomes is insensitive to the presence of targeting ligand although ligand-receptor binding has been documented as a critical factor in triggering internalization. Furthermore, the presence of serum proteins is shown to facilitate the binding of nanoparticles to plasma membrane lacking the specific receptor. Collectively, these findings confirm the potential of stepwise etching in quantitatively analyzing the sub-cellular distributions of nanoparticles uptaken by cells in an effort to optimize the therapeutic effect. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Subcellular site and nature of intracellular cadmium in plants

    International Nuclear Information System (INIS)

    Wagner, G.J.

    1979-01-01

    The mechanisms underlying heavy metal accumulation, toxicity and tolerance in higher plants are poorly understood. Since subcellular processes are undoubtedly involved in all these phenomena, it is of interest to study the extent of, subcellular site of and nature of intracellularly accumulated cadmium in higher plants. Whole plants supplied ¹⁰⁹CdCl₂ or ¹¹²CdSO₄ accumulated Cd into roots and aerial tissues. Preparation of protoplasts from aerial tissue followed by subcellular fractionation of the protoplasts to obtain intact vacuoles, chloroplasts and cytosol revealed the presence of Cd in the cytosol but not in vacuoles or chloroplasts. Particulate materials containing other cell components were also labeled. Of the ¹⁰⁹Cd supplied to plants, 2 to 10% was recovered in both cytosol preparations and in particulate materials. Cytosol contained proteinaceous Cd complexes, free metal and low molecular weight Cd complexes. Labeling of protoplasts gave similar results. No evidence was obtained for the production of volatile Cd complexes in tobacco.

  9. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP), encoded by an early virus gene “wsv001” in WSSV, interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  10. Predicting co-complexed protein pairs using genomic and proteomic data integration

    Directory of Open Access Journals (Sweden)

    King Oliver D

    2004-04-01

    Background: Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (APMS) have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative of physical interaction. Therefore it is desirable to integrate multiple datasets and utilize their different predictive value for more accurate prediction of co-complexed relationship. Results: Using a supervised machine learning approach, the probabilistic decision tree, we integrated high-throughput protein interaction datasets and other gene- and protein-pair characteristics to predict co-complexed pairs (CCP) of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue), a significant fraction was found to physically interact according to a separate database (YPD, Yeast Proteome Database), and the remaining predictions may potentially represent unknown CCPs. Conclusions: We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP) pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.

  11. Echinococcus granulosus fatty acid binding proteins subcellular localization.

    Science.gov (United States)

    Alvite, Gabriela; Esteves, Adriana

    2016-05-01

    Two fatty acid binding proteins, EgFABP1 and EgFABP2, were isolated from the parasitic platyhelminth Echinococcus granulosus. These proteins bind fatty acids and have particular relevance in flatworms since de novo fatty acids synthesis is absent. Therefore platyhelminthes depend on the capture and intracellular distribution of host's lipids and fatty acid binding proteins could participate in lipid distribution. To elucidate EgFABP's roles, we investigated their intracellular distribution in the larval stage by a proteomic approach. Our results demonstrated the presence of EgFABP1 isoforms in cytosolic, nuclear, mitochondrial and microsomal fractions, suggesting that these molecules could be involved in several cellular processes. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

    DEFF Research Database (Denmark)

    Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

    2012-01-01

    Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting...... phylogenetic profiles often require very different RT sets to support high prediction accuracy....
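
    The dependence on reference taxa can be illustrated with a toy example. The sketch below scores two proteins by the Jaccard similarity of their presence/absence profiles; Jaccard similarity is a stand-in chosen for illustration and is not claimed to be the measure used in the paper, and the protein names, profiles, and taxa subset are all hypothetical. It does not implement the Tree-Based Search method mentioned in the abstract.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def restrict(profile, taxa_idx):
    """Keep only the columns for the chosen reference taxa."""
    return [profile[i] for i in taxa_idx]


# Hypothetical profiles: one entry per reference taxon,
# 1 = a homolog is present in that taxon, 0 = absent.
profiles = {
    "protA": [1, 1, 0, 1, 0, 0],
    "protB": [1, 1, 0, 1, 0, 1],
    "protC": [0, 0, 1, 0, 1, 0],
}

# Score with all six taxa, then with the last taxon dropped.
all_taxa = jaccard(profiles["protA"], profiles["protB"])
subset = [0, 1, 2, 3, 4]
fewer_taxa = jaccard(restrict(profiles["protA"], subset),
                     restrict(profiles["protB"], subset))
unrelated = jaccard(profiles["protA"], profiles["protC"])

print(all_taxa, fewer_taxa, unrelated)
```

    In this toy data, dropping a single taxon changes the protA/protB score, which is the kind of sensitivity to reference-taxa choice that motivates automating the selection.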

  13. Neptunium 237 behaviour in subcellular fractions of rat kidneys

    International Nuclear Information System (INIS)

    Kreslov, V.V.; Maksutova, A.Ya.; Mushkacheva, G.S.

    1978-01-01

    The subcellular distribution of intravenously injected (1 and 0.5 μCi/rat) neptunium nitrate (5- and 6-valent) in the kidneys of male and female rats has been investigated. It has been shown that the radionuclide was unevenly distributed within the cell. As early as 24 hours after administration, about 50 per cent of the neptunium was concentrated in the mitochondrial fraction. Data are presented on variations in neptunium behaviour within subcellular fractions of rat kidneys depending on the sex of the animals and on the valency and dose of the isotope.

  14. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

    Science.gov (United States)

    Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

    2014-01-21

    Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the as-yet-unknown hot-spot residues for many proteins for which the structures of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  15. An Overview of the Prediction of Protein DNA-Binding Sites

    Directory of Open Access Journals (Sweden)

    Jingna Si

    2015-03-01

    Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  16. Visualizing Escherichia coli sub-cellular structure using sparse deconvolution Spatial Light Interference Tomography.

    Directory of Open Access Journals (Sweden)

    Mustafa Mir

    Studying the 3D sub-cellular structure of living cells is essential to our understanding of biological function. However, tomographic imaging of live cells is challenging mainly because they are transparent, i.e., weakly scattering structures. Therefore, this type of imaging has been implemented largely using fluorescence techniques. While confocal fluorescence imaging is a common approach to achieve sectioning, it requires fluorescence probes that are often harmful to the living specimen. On the other hand, by using the intrinsic contrast of the structures it is possible to study living cells in a non-invasive manner. One method that provides high-resolution quantitative information about nanoscale structures is a broadband interferometric technique known as Spatial Light Interference Microscopy (SLIM). In addition to rendering quantitative phase information, when combined with a high numerical aperture objective, SLIM also provides excellent depth sectioning capabilities. However, as in all linear optical systems, SLIM's resolution is limited by diffraction. Here we present a novel 3D field deconvolution algorithm that exploits the sparsity of phase images and renders images with resolution beyond the diffraction limit. We employ this label-free method, called deconvolution Spatial Light Interference Tomography (dSLIT), to visualize coiled sub-cellular structures in E. coli cells which are most likely the cytoskeletal MreB protein and the division site regulating MinCDE proteins. Previously these structures have only been observed using specialized strains and plasmids and fluorescence techniques. Our results indicate that dSLIT can be employed to study such structures in a practical and non-invasive manner.

  17. Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

    Directory of Open Access Journals (Sweden)

    Vandepoele Klaas

    2009-06-01

    Background: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.

  18. Subcellular localization of YKL-40 in normal and malignant epithelial cells of the breast

    DEFF Research Database (Denmark)

    Roslind, A.; Balslev, E.; Kruse, H.

    2008-01-01

    . YKL-40 protein expression was redistributed in carcinoma versus normal glandular tissue of the breast. A reduced expression of YKL-40 in relation to intermediate filaments and desmosomes was found in tumor cells. Changes in YKL-40 expression suggest that the function of YKL-40 in cells of epithelial......YKL-40 is a new prognostic biomarker in cancer. The biological function is only poorly understood. This study aimed at determining the subcellular localization of YKL-40, using immunogold labeling, in normal epithelial cells and in malignant tumor cells of the breast by immunoelectron microscopy...

  19. Boosting compound-protein interaction prediction by deep learning.

    Science.gov (United States)

    Tian, Kai; Shao, Mingyu; Wang, Yang; Guan, Jihong; Zhou, Shuigeng

    2016-11-01

    The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, experimentally identifying compound-protein interactions (CPIs) is generally expensive and time-consuming, so computational approaches have been introduced. Among these, machine-learning based methods have achieved considerable success. However, due to the nonlinear and imbalanced nature of biological data, many machine learning approaches have their own limitations. Recently, deep learning techniques have shown advantages over many state-of-the-art machine learning methods in some applications. In this study, we aim at improving the performance of CPI prediction based on deep learning, and propose a method called DL-CPI (the abbreviation of Deep Learning for Compound-Protein Interactions prediction), which employs a deep neural network (DNN) to effectively learn the representations of compound-protein pairs. Extensive experiments show that DL-CPI can learn useful features of compound-protein pairs by a layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Comparing human-Salmonella with plant-Salmonella protein-protein interaction predictions

    Directory of Open Access Journals (Sweden)

    Sylvia Schleker

    2015-01-01

    Salmonellosis is the most frequent food-borne disease world-wide and can be transmitted to humans by a variety of routes, especially via animal and plant products. Salmonella bacteria are believed to use not only animal and human but also plant hosts despite their evolutionary distance. This raises the question whether Salmonella employs similar mechanisms in infection of these diverse hosts. Given that most of our understanding comes from its interaction with human hosts, we investigate here to what degree knowledge of Salmonella-human interactions can be transferred to the Salmonella-plant system. Reviewed are recent publications on analysis and prediction of Salmonella-host interactomes. Putative protein-protein interactions (PPIs) between Salmonella and its human and Arabidopsis hosts were retrieved utilizing purely interolog-based approaches in which predictions were inferred based on available sequence and domain information of known PPIs, and machine learning approaches that integrate a larger set of useful information from different sources. Transfer learning is an especially suitable machine learning technique to predict plant host targets from the knowledge of human host targets. A comparison of the prediction results with transcriptomic data shows a clear overlap between the host proteins predicted to be targeted by PPIs and their gene ontology enrichment in both host species and regulation of gene expression. In particular, the cellular processes Salmonella interferes with in plants and humans are catabolic processes. The details of how these processes are targeted, however, are quite different between the two organisms, as expected based on their evolutionary and habitat differences. Possible implications of this observation on evolution of host-pathogen communication are discussed.

  1. Prediction of protein hydration sites from sequence by modular neural networks

    DEFF Research Database (Denmark)

    Ehrlich, L.; Reczko, M.; Bohr, Henrik

    1998-01-01

    The hydration properties of a protein are important determinants of its structure and function. Here, modular neural networks are employed to predict ordered hydration sites using protein sequence information. First, secondary structure and solvent accessibility are predicted from sequence with two...... separate neural networks. These predictions are used as input together with protein sequences for networks predicting hydration of residues, backbone atoms and sidechains. These networks are trained with protein crystal structures. The prediction of hydration is improved by adding information on secondary...... structure and solvent accessibility and, using actual values of these properties, residue hydration can be predicted to 77% accuracy with a Matthews coefficient of 0.43. However, predicted property data with an accuracy of 60-70% result in less than half the improvement in predictive performance observed...
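
    This record reports performance as both an accuracy and a Matthews coefficient. The sketch below shows how the two measures are computed from a binary confusion matrix. The counts are hypothetical and are not taken from the paper, so the resulting values do not reproduce the figures quoted above.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for hydrated (positive) vs. non-hydrated residues.
tp, tn, fp, fn = 30, 60, 5, 5
acc = accuracy(tp, tn, fp, fn)
mcc = matthews(tp, tn, fp, fn)
```

    On imbalanced data the accuracy can stay high while the Matthews coefficient remains modest, which is one reason the two are commonly reported together.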

  2. Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions.

    Directory of Open Access Journals (Sweden)

    Peiying Ruan

    Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many methods for predicting protein complexes from protein-protein interactions have been developed, such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt only with complexes of size greater than three because they are often based on some measure of subgraph density. However, heterodimeric protein complexes, which consist of two distinct proteins, account for a large fraction of known complexes according to several comprehensive databases. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, a domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.

  3. Exploration of the dynamic properties of protein complexes predicted from spatially constrained protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Eric A Yen

    2014-05-01

    Protein complexes are not static, but rather highly dynamic with subunits that undergo 1-dimensional diffusion with respect to each other. Interactions within protein complexes are modulated through regulatory inputs that alter interactions and introduce new components and deplete existing components through exchange. While it is clear that the structure and function of any given protein complex is coupled to its dynamical properties, it remains a challenge to predict the possible conformations that complexes can adopt. Protein-fragment Complementation Assays detect physical interactions between protein pairs constrained to ≤8 nm from each other in living cells. This method has been used to build networks composed of 1000s of pair-wise interactions. Significantly, these networks contain a wealth of dynamic information, as the assay is fully reversible and the proteins are expressed in their natural context. In this study, we describe a method that extracts this valuable information in the form of predicted conformations, allowing the user to explore the conformational landscape, to search for structures that correlate with an activity state, and to estimate the abundance of conformations in the living cell. The generator is based on a Markov Chain Monte Carlo simulation that uses the interaction dataset as input and is constrained by the physical resolution of the assay. We applied this method to an 18-member protein complex composed of the seven core proteins of the budding yeast Arp2/3 complex and 11 associated regulators and effector proteins. We generated 20,480 output structures and identified conformational states using principal component analysis. We interrogated the conformation landscape and found evidence of symmetry breaking, a mixture of likely active and inactive conformational states and dynamic exchange of the core protein Arc15 between core and regulatory components. Our method provides a novel tool for prediction and

  4. Zn subcellular distribution in liver of goldfish (Carassius auratus) with exposure to zinc oxide nanoparticles and mechanism of hepatic detoxification.

    Directory of Open Access Journals (Sweden)

    Wenhong Fan

    Zinc Oxide Nanoparticles (ZnO NPs) have attracted increasing concerns because of their widespread use and toxic potential. In this study, Zn accumulations in different tissues (gills, liver, muscle, and gut) of goldfish (Carassius auratus) after exposure to ZnO NPs were studied in comparison with bulk ZnO and Zn(2+). The technique of subcellular partitioning was used for the first time on the liver of goldfish to study the hepatic accumulation of ZnO NPs. The results showed that at a sublethal Zn concentration (2 mg/L), bioaccumulation in goldfish was tissue-specific and dependent on the exposure materials. Compared with Zn(2+), the particles of bulk ZnO and the ZnO NPs appeared to aggregate in the environmentally contacted tissues (gills and gut), rather than transport to the internal tissues (liver and muscle). The subcellular distributions in liver differed for the three exposure treatments. After ZnO NPs exposure, the Zn percentage in metal-rich granules (MRG) increased significantly, and after Zn(2+) exposure, it increased significantly in the organelles. Metallothionein-like proteins (MTLP) were the main target for Zn(2+), while MRG played the dominant role for ZnO NPs. The different subcellular distributions revealed that the metal detoxification mechanisms of liver for ZnO NPs, bulk ZnO, and Zn(2+) were different. Overall, subcellular partitioning provided an interesting start to a better understanding of the toxicity of nano- and conventional materials.

  5. Improving N-terminal protein annotation of Plasmodium species based on signal peptide prediction of orthologous proteins

    Directory of Open Access Journals (Sweden)

    Neto Armando

    2012-11-01

    Background: Signal peptide is one of the most important motifs involved in protein trafficking and it ultimately influences protein function. Considering the expected functional conservation among orthologs it was hypothesized that divergence in signal peptides within orthologous groups is mainly due to N-terminal protein sequence misannotation. Thus, discrepancies in signal peptide prediction of orthologous proteins were used to identify misannotated proteins in five Plasmodium species. Methods: Signal peptide (SignalP) and orthology (OrthoMCL) were combined in an innovative strategy to identify orthologous groups showing discrepancies in signal peptide prediction among their protein members (Mixed groups). In a comparative analysis, multiple alignments for each of these groups and gene models were visually inspected in search of misannotated proteins and, whenever possible, alternative gene models were proposed. Thresholds for signal peptide prediction parameters were also modified to reduce their impact as a possible source of discrepancy among orthologs. Validation of new gene models was based on RT-PCR (few examples) or on experimental evidence already published (ApiLoc). Results: The rate of misannotated proteins was significantly higher in Mixed groups than in Positive or Negative groups, corroborating the proposed hypothesis. A total of 478 proteins were reannotated and change of signal peptide prediction from negative to positive was the most common. Reannotations triggered the conversion of almost 50% of all Mixed groups, which were further reduced by optimization of signal peptide prediction parameters. Conclusions: The methodological novelty proposed here combining orthology and signal peptide prediction proved to be an effective strategy for the identification of proteins with wrongly annotated N-terminal sequences, and it might have an important impact in the available data for genome-wide searching of potential vaccine and drug
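
    The grouping step described in this record can be sketched as follows. The sketch assumes that ortholog group membership and a boolean signal-peptide call per protein are already available (in the study these come from OrthoMCL and SignalP); the identifiers, the toy data and both helper functions are hypothetical, and the majority rule used to pick reannotation candidates is an illustrative assumption rather than the authors' procedure, which relied on visual inspection of alignments.

```python
from collections import defaultdict

# Hypothetical inputs: ortholog group membership and a boolean
# signal-peptide call per protein (e.g. parsed from predictor output).
group_of = {
    "p1": "OG1", "p2": "OG1", "p3": "OG1",
    "p4": "OG2", "p5": "OG2",
    "p6": "OG3", "p7": "OG3",
}
has_signal_peptide = {
    "p1": True, "p2": True, "p3": False,
    "p4": True, "p5": True,
    "p6": False, "p7": False,
}


def classify_groups(group_of, calls):
    """Label each ortholog group Positive, Negative, or Mixed."""
    members = defaultdict(list)
    for protein, group in group_of.items():
        members[group].append(protein)
    labels = {}
    for group, proteins in members.items():
        positives = [p for p in proteins if calls[p]]
        if len(positives) == len(proteins):
            labels[group] = "Positive"
        elif not positives:
            labels[group] = "Negative"
        else:
            labels[group] = "Mixed"
    return labels


def minority_members(group_of, calls, group):
    """Proteins whose call disagrees with the group majority (ties count as positive)."""
    proteins = [p for p, g in group_of.items() if g == group]
    n_pos = sum(calls[p] for p in proteins)
    majority = n_pos * 2 >= len(proteins)
    return sorted(p for p in proteins if calls[p] != majority)


labels = classify_groups(group_of, has_signal_peptide)
candidates = minority_members(group_of, has_signal_peptide, "OG1")
```

    Members of a Mixed group that disagree with the rest are the natural starting point for checking the N-terminal gene model.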

  6. Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

    Science.gov (United States)

    Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

    2017-01-01

    Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely improve interface prediction also for complexes with large and good-quality MSAs.

  7. SitesIdentify: a protein functional site prediction tool

    Directory of Open Access Journals (Sweden)

    Doig Andrew J

    2009-11-01

    Background: The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function is useful in order to gain information about its potential role. There are many available approaches to predict functional sites, but many are not made available via a publicly-accessible application. Results: Here we present a functional site prediction tool (SitesIdentify), based on combining sequence conservation information with geometry-based cleft identification, that is freely available via a web-server. We have shown that SitesIdentify compares favourably to other functional site prediction tools in a comparison of seven methods on a non-redundant set of 237 enzymes with annotated active sites. Conclusion: SitesIdentify is able to produce comparable accuracy in predicting functional sites to its closest available counterpart, but in addition achieves improved accuracy for proteins with few characterised homologues. SitesIdentify is available via a webserver at http://www.manchester.ac.uk/bioinformatics/sitesidentify/

  8. Prediction of methyl-side Chain Dynamics in Proteins

    International Nuclear Information System (INIS)

    Ming Dengming; Brueschweiler, Rafael

    2004-01-01

    A simple analytical model is presented for the prediction of methyl-side chain dynamics in comparison with S² order parameters obtained by NMR relaxation spectroscopy. The model, which is an extension of the local contact model for backbone order parameter prediction, uses a static 3D protein structure as input. It expresses the methyl-group S² order parameters as a function of local contacts of the methyl carbon with respect to the neighboring atoms in combination with the number of consecutive mobile dihedral angles between the methyl group and the protein backbone. For six out of seven proteins the prediction results are good when compared with experimentally determined methyl-group S² values, with an average correlation coefficient r̄ = 0.65 ± 0.14. For the unusually rigid cytochrome c2 no significant correlation between prediction and experiment is found. The presented model provides independent support for the reliability of current side-chain relaxation methods along with their interpretation by the model-free formalism.

  9. Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.

    Science.gov (United States)

    Jelínek, Jan; Škoda, Petr; Hoksza, David

    2017-12-06

    Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
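
    The knowledge-base lookup described in this record can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the INSPiRE implementation: the residue graphs and interface annotations are made up, the neighborhood is limited to direct neighbors, and a plain string key stands in for the bit-string encoding mentioned in the abstract.

```python
from collections import Counter, defaultdict

# Hypothetical residue graphs: residue id -> (amino acid, neighbor ids).
known_protein = {
    1: ("L", [2, 3]),
    2: ("K", [1, 3]),
    3: ("D", [1, 2, 4]),
    4: ("G", [3]),
}
known_interface = {1, 2}  # residues annotated as interface (illustrative)


def neighborhood_key(graph, node):
    """Encode a residue and its direct neighbors as an order-independent key."""
    center, neighbors = graph[node]
    return center + ":" + "".join(sorted(graph[n][0] for n in neighbors))


def build_knowledge_base(graph, interface):
    """Count how often each neighborhood occurs as interface / non-interface."""
    kb = defaultdict(Counter)
    for node in graph:
        label = "interface" if node in interface else "non-interface"
        kb[neighborhood_key(graph, node)][label] += 1
    return kb


def predict(graph, kb, default="non-interface"):
    """Label each residue by the majority label of its neighborhood in kb."""
    out = {}
    for node in graph:
        counts = kb.get(neighborhood_key(graph, node))
        out[node] = counts.most_common(1)[0][0] if counts else default
    return out


kb = build_knowledge_base(known_protein, known_interface)
query = {10: ("L", [11, 12]), 11: ("D", [10]), 12: ("K", [10])}
predicted = predict(query, kb)
```

    Residue 10 of the query has the same neighborhood as an annotated interface residue and is labeled accordingly, while neighborhoods absent from the knowledge base fall back to the default label.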

  10. Subcellular localization of an intracellular serine protease of 68 kDa in Leishmania (Leishmania) amazonensis promastigotes

    Directory of Open Access Journals (Sweden)

    José Andrés Morgado-Díaz

    2005-07-01

    Here we report the subcellular localization of an intracellular serine protease of 68 kDa in axenic promastigotes of Leishmania (Leishmania) amazonensis, using subcellular fractionation, enzymatic assays, immunoblotting, and immunocytochemistry. All fractions were evaluated by transmission electron microscopy and the serine protease activity was measured during the cell fractionation procedure using α-N-p-tosyl-L-arginine methyl ester (L-TAME) as substrate, and phenylmethylsulphonyl fluoride (PMSF) and L-1-tosylamino-2-phenylethylchloromethylketone (TPCK) as specific inhibitors. The enzymatic activity was detected mainly in a membranous vesicular fraction (6.5-fold enrichment relative to the whole homogenate), but also in a crude plasma membrane fraction (2.0-fold). Analysis by SDS-PAGE gelatin under reducing conditions demonstrated that the major proteolytic activity was found in a 68 kDa protein in all fractions studied. A protein with identical molecular weight was also recognized in immunoblots by a polyclonal antibody against serine protease (anti-SP), with higher immunoreactivity in the vesicular fraction. Electron microscopic immunolocalization using the same polyclonal antibody showed the enzyme present at the cell surface, as well as in cytoplasmic membranous compartments of the parasite. Our findings indicate that the internal location of this serine protease in L. amazonensis is mainly restricted to the membranes of intracellular compartments resembling endocytic/exocytic elements.

  11. Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

    Directory of Open Access Journals (Sweden)

    Wang Yong

    2011-10-01

    Background: With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Predicting protein-protein interactions from the sequences is therefore in great demand, because identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for barely characterized organisms. Although a few methods have been proposed, the converse problem, whether the features used extract sufficient and unbiased information from protein sequences, is almost untouched. Results: In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods sufficiently exploit the information of protein sequences and reduce artificial bias and computational cost. They thus significantly outperform the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation and reach ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings here hold important implications for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology. Conclusions: By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions.

  12. Protein structure prediction using bee colony optimization metaheuristic

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Paluszewski, Martin; Winter, Pawel

    2010-01-01

    of the proteins structure, an energy potential and some optimization algorithm that finds the structure with minimal energy. Bee Colony Optimization (BCO) is a relatively new approach to solving optimization problems based on the foraging behaviour of bees. Several variants of BCO have been suggested......Predicting the native structure of proteins is one of the most challenging problems in molecular biology. The goal is to determine the three-dimensional structure from the one-dimensional amino acid sequence. De novo prediction algorithms seek to do this by developing a representation...... our BCO method to generate good solutions to the protein structure prediction problem. The results show that BCO generally finds better solutions than simulated annealing which so far has been the metaheuristic of choice for this problem....

  13. DomPep--a general method for predicting modular domain-mediated protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Lei Li

    Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domains. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter Ligand-Binding Similarity which, in turn, is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, two parameters that are used in constructing prediction models. Moreover, DomPep can be used to predict PPIs for both domains with experimental binding data and those without. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to existing methods. To evaluate DomPep as a discovery tool, we deployed DomPep to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.

  14. Preclinical models used for immunogenicity prediction of therapeutic proteins.

    Science.gov (United States)

    Brinks, Vera; Weinbuch, Daniel; Baker, Matthew; Dean, Yann; Stas, Philippe; Kostense, Stefan; Rup, Bonita; Jiskoot, Wim

    2013-07-01

    All therapeutic proteins are potentially immunogenic. Antibodies formed against these drugs can decrease efficacy, leading to drastically increased therapeutic costs and in rare cases to serious and sometimes life threatening side-effects. Many efforts are therefore undertaken to develop therapeutic proteins with minimal immunogenicity. For this, immunogenicity prediction of candidate drugs during early drug development is essential. Several in silico, in vitro and in vivo models are used to predict immunogenicity of drug leads, to modify potentially immunogenic properties and to continue development of drug candidates with expected low immunogenicity. Despite the extensive use of these predictive models, their actual predictive value varies. Important reasons for this uncertainty are the limited/insufficient knowledge on the immune mechanisms underlying immunogenicity of therapeutic proteins, the fact that different predictive models explore different components of the immune system and the lack of an integrated clinical validation. In this review, we discuss the predictive models in use, summarize aspects of immunogenicity that these models predict and explore the merits and the limitations of each of the models.

  15. Predicting turns in proteins with a unified model.

    Directory of Open Access Journals (Sweden)

    Qi Song

    Full Text Available MOTIVATION: Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. RESULTS: In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are profile of sequence, profile of secondary structures and profile of shape strings, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded most state-of-the-art predictors of a certain type of turn. Newly determined sequences, the EVA and CASP9 datasets were used as independent tests and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.

  16. Monoterpene biosynthesis potential of plant subcellular compartments

    NARCIS (Netherlands)

    Dong, L.; Jongedijk, E.J.; Bouwmeester, H.J.; Krol, van der A.R.

    2016-01-01

    Subcellular monoterpene biosynthesis capacity based on local geranyl diphosphate (GDP) availability or locally boosted GDP production was determined for plastids, cytosol and mitochondria. A geraniol synthase (GES) was targeted to plastids, cytosol, or mitochondria. Transient expression in Nicotiana

  17. Prediction and Dissection of Protein-RNA Interactions by Molecular Descriptors.

    Science.gov (United States)

    Liu, Zhi-Ping; Chen, Luonan

    2016-01-01

    Protein-RNA interactions play crucial roles in numerous biological processes. However, detecting the interactions and binding sites between protein and RNA by traditional experiments is still time-consuming and labor-intensive. Thus, it is important to develop bioinformatics methods for predicting protein-RNA interactions and binding sites. Accurate prediction of protein-RNA interactions and recognition will greatly help to decipher the interaction mechanisms between protein and RNA, as well as to improve RNA-related protein engineering and drug design. In this work, we summarize the current bioinformatics strategies for predicting protein-RNA interactions and dissecting protein-RNA interaction mechanisms from local structure binding motifs. In particular, we focus on the feature-based machine learning methods, in which the molecular descriptors of protein and RNA are extracted and integrated as feature vectors representing the interaction events and recognition residues. In addition, the available methods are classified and compared comprehensively. The molecular descriptors are expected to elucidate the binding mechanisms of protein-RNA interaction and reveal the functional implications from a structural complementarity perspective.

  18. Protein tyrosine phosphatases: regulatory mechanisms.

    NARCIS (Netherlands)

    den Hertog, J.; Ostman, A.; Bohmer, F.D.

    2008-01-01

    Protein-tyrosine phosphatases are tightly controlled by various mechanisms, ranging from differential expression in specific cell types to restricted subcellular localization, limited proteolysis, post-translational modifications affecting intrinsic catalytic activity, ligand binding and

  19. Biomechanics of subcellular structures by non-invasive Brillouin microscopy

    Science.gov (United States)

    Antonacci, Giuseppe; Braakman, Sietse

    2016-11-01

    Cellular biomechanics play a pivotal role in the pathophysiology of several diseases. Unfortunately, current methods to measure biomechanical properties are invasive and mostly limited to the surface of a cell. As a result, the mechanical behaviour of subcellular structures and organelles remains poorly characterised. Here, we show three-dimensional biomechanical images of single cells obtained with non-invasive, non-destructive Brillouin microscopy with an unprecedented spatial resolution. Our results quantify the longitudinal elastic modulus of subcellular structures. In particular, we found the nucleoli to be stiffer than both the nuclear envelope (p biomechanics and its role in pathophysiology.

  20. Scoring protein relationships in functional interaction networks predicted from sequence data.

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    Full Text Available UNLABELLED: The abundance of diverse biological data from various sources constitutes a rich source of knowledge, which has the power to advance our understanding of organisms. This requires computational methods in order to integrate and exploit these data effectively and elucidate local and genome wide functional connections between protein pairs, thus enabling functional inferences for uncharacterized proteins. These biological data are primarily in the form of sequences, which determine functions, although functional properties of a protein can often be predicted from just the domains it contains. Thus, protein sequences and domains can be used to predict protein pair-wise functional relationships, and thus contribute to the function prediction process of uncharacterized proteins in order to ensure that knowledge is gained from sequencing efforts. In this work, we introduce information-theoretic based approaches to score protein-protein functional interaction pairs predicted from protein sequence similarity and conserved protein signature matches. The proposed schemes are effective for data-driven scoring of connections between protein pairs. We applied these schemes to the Mycobacterium tuberculosis proteome to produce a homology-based functional network of the organism with a high confidence and coverage. We use the network for predicting functions of uncharacterised proteins. AVAILABILITY: Protein pair-wise functional relationship scores for Mycobacterium tuberculosis strain CDC1551 sequence data and python scripts to compute these scores are available at http://web.cbio.uct.ac.za/~gmazandu/scoringschemes.

  1. Localization of aggregating proteins in bacteria depends on the rate of addition

    Directory of Open Access Journals (Sweden)

    Karl Scheu

    2014-08-01

    Full Text Available Many proteins are observed to localize to specific subcellular regions within bacteria. Recent experiments have shown that proteins that have self-interactions that lead them to aggregate tend to localize to the poles. Theoretical modeling of the localization of aggregating protein within bacterial cell geometries shows that aggregates can spontaneously localize to the pole due to nucleoid occlusion. The resulting polar localization, whether it be to a single pole or to both was shown to depend on the rate of protein addition. Motivated by these predictions we selected a set of genes from E. coli, whose protein products have been reported to localize when tagged with GFP, and explored the dynamics of their localization. We induced protein expression from each gene at different rates and found that in all cases unipolar patterning is favored at low rates of expression whereas bipolar is favored at higher rates of expression. Our findings are consistent with the predictions of the model, suggesting that localization may be due to aggregation plus nucleoid occlusion. When we expressed GFP by itself under the same conditions, no localization was observed. These experiments highlight the potential importance of protein aggregation, nucleoid occlusion and rate of protein expression in driving polar localization of functional proteins in bacteria.

  2. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotations methods. Detailed data analysis shows that the major advantage of the MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
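
    The F-measure reported in the record above is the harmonic mean of precision and recall. A minimal Python sketch for a single protein follows; the function name and the placeholder term labels are illustrative assumptions, and the actual benchmark protocol (thresholding, averaging over proteins) may differ.

```python
def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for two sets of annotation terms.

    Illustrative helper (not from the MetaGO pipeline): `predicted` and
    `actual` are any iterables of term identifiers for one protein.
    """
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_pos / len(predicted)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Placeholder labels, not real Gene Ontology identifiers.
    print(f_measure({"term_a", "term_b"}, {"term_b", "term_c"}))
```

    With one shared term out of two predicted and two actual terms, precision and recall are both 0.5, so the F-measure is 0.5.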

  3. Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    Science.gov (United States)

    Zhou, Hufeng; Rezaei, Javad; Hugo, Willy; Gao, Shangzhi; Jin, Jingjing; Fan, Mengyuan; Yong, Chern-Han; Wozniak, Michal; Wong, Limsoon

    2013-01-01

    H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are very important information to illuminate the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some

  4. Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition.

    Science.gov (United States)

    Hayat, Maqsood; Khan, Asifullah

    2011-02-21

    Membrane proteins are a vital type of protein that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides valuable information for predicting novel examples of the membrane protein types. However, classification of membrane protein types can be both time consuming and susceptible to errors due to the inherent similarity of membrane protein types. In this paper, a neural network-based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence, which include seven feature sets: amino acid composition, sequence length, 2-gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM for the jackknife test. In the independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers exhibit improved performance using other performance measures such as sensitivity, specificity, Matthews correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported so far. This performance improvement may largely be credited to the learning capabilities of neural networks and the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor. Copyright © 2010 Elsevier Ltd. All rights reserved.
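
    Amino acid composition, the first of the seven feature sets named in the record above, can be sketched in Python as a 20-dimensional frequency vector. This is an illustrative sketch only: the function name, the alphabet ordering, and the choice to skip non-standard residues are assumptions, not details taken from the Mem-Predictor implementation.

```python
# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Characters outside the 20-letter alphabet (e.g. 'X') are ignored,
    which is an assumption made for this sketch.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    features = amino_acid_composition("AACD")
    print(dict(zip(AMINO_ACIDS, features)))
```

    A vector of this kind would be concatenated with the other feature sets before dimensionality reduction and classification.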

  5. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces.

    Science.gov (United States)

    Xia, Zheng; Wu, Ling-Yun; Zhou, Xiaobo; Wong, Stephen T C

    2010-09-13

    Predicting drug-protein interactions from heterogeneous biological data sources is a key step for in silico drug discovery. The difficulty of this prediction task lies in the rarity of known drug-protein interactions and myriad unknown interactions to be predicted. To meet this challenge, a manifold regularization semi-supervised learning method is presented to tackle this issue by using labeled and unlabeled information which often generates better results than using the labeled data alone. Furthermore, our semi-supervised learning method integrates known drug-protein interaction network information as well as chemical structure and genomic sequence data. Using the proposed method, we predicted certain drug-protein interactions on the enzyme, ion channel, GPCRs, and nuclear receptor data sets. Some of them are confirmed by the latest publicly available drug targets databases such as KEGG. We report encouraging results of using our method for drug-protein interaction network reconstruction which may shed light on the molecular interaction inference and new uses of marketed drugs.

  6. Structural features that predict real-value fluctuations of globular proteins.

    Science.gov (United States)

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2012-05-01

    It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics (MD) trajectories of nonhomologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real value of residue fluctuations using support vector regression (SVR). It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in MD trajectories. Moreover, SVR that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson's correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed in predictions by the Gaussian network model (GNM). An advantage of the developed method over the GNMs is that the former predicts the real value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. Copyright © 2012 Wiley Periodicals, Inc.
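
    The two evaluation measures quoted in the record above, Pearson's correlation coefficient and root mean square error, can be computed as in the following Python sketch. The function names and the toy numbers are illustrative assumptions; they are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, reference):
    """Root mean square error between predicted and reference values."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Toy per-residue fluctuations (made-up values, in angstroms).
    reference = [0.8, 1.1, 2.3, 1.7]
    predicted = [0.9, 1.0, 2.0, 1.9]
    print(pearson_r(predicted, reference), rmse(predicted, reference))
```

    The correlation measures whether the predicted profile has the right shape, while the RMSE measures how far the predicted real values are from the reference, which is why a method that predicts real values is judged on both.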

  7. Tissue and subcellular localizations of 3H-cyclosporine A in mice

    International Nuclear Information System (INIS)

    Baeckman, L.; Brandt, I.; Appelkvist, E.-L.; Dallner, G.

    1988-01-01

    The tissue and subcellular localizations of 3H-cyclosporine A after administration to mice were determined with whole-body autoradiography and scintillation counting of lipid extracts of tissues and subcellular fractions. The radioactivity was widely distributed in the body and the pattern of distribution after oral or parenteral administration was the same, except that tissue levels were generally lower after oral administration. Pretreatment of the animals with a diet containing cyclosporine A for 30 days before the injection of radioactive cyclosporine A did not change the pattern of distribution substantially. No significant radioactivity was found in the central nervous system, except for the choroidal plexus and the area postrema region of the brain. In pregnant mice no passage of radioactivity from the placentas to fetuses was observed after a single injection. 3H-cyclosporine A and/or its metabolites showed a high affinity for the lympho-myeloid tissues, with a marked long-term retention in bone marrow and lymph nodes. There was massive excretion in the intestinal tract after parenteral administration, and the liver, bile, pancreas and salivary glands contained high levels of radioactivity. In the kidney radioactivity was confined to the outer zone of the outer kidney medulla. In liver homogenates no quantitatively significant binding of 3H-cyclosporine A and/or its metabolites to cellular molecules such as proteins, DNA, phospho- or neutral lipids was found. After lipid extraction with organic solvents, almost all radioactivity was recovered in the organic phase. (author)

  8. Protein Sub-Nuclear Localization Prediction Using SVM and Pfam Domain Information

    Science.gov (United States)

    Kumar, Ravindra; Jain, Sohni; Kumari, Bandana; Kumar, Manish

    2014-01-01

    The nucleus is the largest and a highly organized organelle of eukaryotic cells. Within the nucleus exist a number of pseudo-compartments, which are not separated by any membrane, yet each of them contains only a specific set of proteins. Understanding protein sub-nuclear localization can hence be an important step towards understanding biological functions of the nucleus. Here we describe a method, SubNucPred, developed by us for predicting the sub-nuclear localization of proteins. This method predicts protein localization for 10 different sub-nuclear locations sequentially by combining the presence or absence of unique Pfam domains with an amino acid composition-based SVM model. The prediction accuracy during leave-one-out cross-validation for centromeric proteins was 85.05%, for chromosomal proteins 76.85%, for nuclear speckle proteins 81.27%, for nucleolar proteins 81.79%, for nuclear envelope proteins 79.37%, for nuclear matrix proteins 77.78%, for nucleoplasm proteins 76.98%, for nuclear pore complex proteins 88.89%, for PML body proteins 75.40% and for telomeric proteins 83.33%. Comparison with other reported methods showed that SubNucPred performs better than existing methods. A web-server for predicting protein sub-nuclear localization named SubNucPred has been established at http://14.139.227.92/mkumar/subnucpred/. A standalone version of SubNucPred can also be downloaded from the web-server. PMID:24897370

  9. Protein secondary structure: category assignment and predictability

    DEFF Research Database (Denmark)

    Andersen, Claus A.; Bohr, Henrik; Brunak, Søren

    2001-01-01

    In the last decade, the prediction of protein secondary structure has been optimized using essentially one and the same assignment scheme known as DSSP. We present here a different scheme, which is more predictable. This scheme predicts directly the hydrogen bonds, which stabilize the secondary......-forward neural network with one hidden layer on a data set identical to the one used in earlier work....

  10. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

    Science.gov (United States)

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.

  11. Protein-Based Urine Test Predicts Kidney Transplant Outcomes

    Science.gov (United States)

    ... News Releases News Release Thursday, August 22, 2013 Protein-based urine test predicts kidney transplant outcomes NIH- ... supporting development of noninvasive tests. Levels of a protein in the urine of kidney transplant recipients can ...

  12. Prediction of Protein Thermostability by an Efficient Neural Network Approach

    Directory of Open Access Journals (Sweden)

    Jalal Rezaeenour

    2016-10-01

    Full Text Available Introduction: Manipulation of protein stability is important for understanding the principles that govern protein thermostability, both in basic research and industrial applications. Various data mining techniques exist for prediction of thermostable proteins. Furthermore, ANN methods have attracted significant attention for prediction of thermostability, because they constitute an appropriate approach to mapping the non-linear input-output relationships and massive parallel computing. Method: An Extreme Learning Machine (ELM) was applied to estimate thermal behavior of 1289 proteins. In the proposed algorithm, the parameters of ELM were optimized using a Genetic Algorithm (GA), which tuned a set of input variables, hidden layer biases, and input weights to enhance the prediction performance. The method was executed on a set of amino acids, yielding a total of 613 protein features. A number of feature selection algorithms were used to build subsets of the features. A total of 1289 protein samples and 613 protein features were calculated from the UniProt database to understand features contributing to the enzymes’ thermostability and find out the main features that influence this valuable characteristic. Results: At the primary structure level, Gln, Glu and polar were the features that mostly contributed to protein thermostability. At the secondary structure level, Helix_S, Coil, and charged_Coil were the most important features affecting protein thermostability. These results suggest that the thermostability of proteins is mainly associated with primary structural features of the protein. According to the results, the influence of primary structure on the thermostability of a protein was more important than that of the secondary structure. It is shown that prediction accuracy of ELM (mean square error) can improve dramatically using GA with error rates RMSE=0.004 and MAPE=0.1003. Conclusion: The proposed approach for forecasting problem

  13. SMYD3 interacts with HTLV-1 Tax and regulates subcellular localization of Tax.

    Science.gov (United States)

    Yamamoto, Keiyu; Ishida, Takaomi; Nakano, Kazumi; Yamagishi, Makoto; Yamochi, Tadanori; Tanaka, Yuetsu; Furukawa, Yoichi; Nakamura, Yusuke; Watanabe, Toshiki

    2011-01-01

    HTLV-1 Tax deregulates signal transduction pathways, transcription of genes, and cell cycle regulation of host cells, which is mainly mediated by its protein-protein interactions with host cellular factors. We previously reported an interaction of Tax with a histone methyltransferase (HMTase), SUV39H1. As the interaction was mediated by the SUV39H1 SET domain that is shared among HMTases, we examined the possibility of Tax interaction with another HMTase, SMYD3, which methylates histone H3 lysine 4 and activates transcription of genes, and studied the functional effects. Expression of endogenous SMYD3 in T cell lines and primary T cells was confirmed by immunoblotting analysis. Co-immunoprecipitation assays and an in vitro pull-down assay indicated interaction between Tax and SMYD3. The interaction was largely dependent on the C-terminal 180 amino acids of SMYD3, whereas the interacting domain of Tax was not clearly defined, although the N-terminal 108 amino acids were dispensable for the interaction. In the cotransfected cells, colocalization of Tax and SMYD3 was indicated in the cytoplasm or nuclei. Studies using mutants of Tax and SMYD3 suggested that SMYD3 dominates the subcellular localization of Tax. Reporter gene assays showed that nuclear factor-κB activation promoted by cytoplasmic Tax was enhanced by the presence of SMYD3, and attenuated by shRNA-mediated knockdown of SMYD3, suggesting an increased level of Tax localization in the cytoplasm by SMYD3. Our study revealed for the first time Tax-SMYD3 direct interaction, as well as apparent tethering of Tax by SMYD3, influencing the subcellular localization of Tax. Results suggested that SMYD3-mediated nucleocytoplasmic shuttling of Tax provides one base for the pleiotropic effects of Tax, which are mediated by the interaction of cellular proteins localized in the cytoplasm or nucleus. © 2010 Japanese Cancer Association.

  14. Protein complex prediction via dense subgraphs and false positive analysis.

    Directory of Open Access Journals (Sweden)

    Cecilia Hernandez

    Full Text Available Many proteins work together with others in groups called complexes in order to achieve a specific function. Discovering protein complexes is important for understanding biological processes and predicting protein functions in living organisms. Large-scale and high-throughput techniques have made it possible to compile protein-protein interaction networks (PPI networks), which have been used in several computational approaches for detecting protein complexes. Those predictions might guide future biological experimental research. Some approaches are topology-based, where highly connected proteins are predicted to be complexes; some propose different clustering algorithms using partitioning, overlaps among clusters for networks modeled with unweighted or weighted graphs; and others use density of clusters and information based on protein functionality. However, some schemes still require much processing time or the quality of their results can be improved. Furthermore, most of the results obtained with computational tools are not accompanied by an analysis of false positives. We propose an effective and efficient mining algorithm for discovering highly connected subgraphs, which is our base for defining protein complexes. Our representation is based on transforming the PPI network into a directed acyclic graph that reduces the number of represented edges and the search space for discovering subgraphs. Our approach considers weighted and unweighted PPI networks. We compare our best alternative using PPI networks from Saccharomyces cerevisiae (yeast) and Homo sapiens (human) with state-of-the-art approaches in terms of clustering, biological metrics and execution times, as well as three gold standards for yeast and two for human. Furthermore, we analyze false positive predicted complexes searching the PDBe (Protein Data Bank in Europe) database in order to identify matching protein complexes that have been purified and structurally characterized. Our analysis shows

  15. Hidden markov model for the prediction of transmembrane proteins using MATLAB.

    Science.gov (United States)

    Chaturvedi, Navaneet; Shanker, Sudhanshu; Singh, Vinay Kumar; Sinha, Dhiraj; Pandey, Paras Nath

    2011-01-01

    Since membrane proteins play a key role in drug targeting, transmembrane protein prediction is an active and challenging area of the biological sciences. Location-based prediction of transmembrane proteins is significant for the functional annotation of protein sequences. Hidden Markov model based methods have been widely applied to transmembrane topology prediction. Here we present a revised and more easily understood model than an existing one for transmembrane protein prediction. MATLAB scripts were written and compiled to estimate the model parameters, and the model was applied to amino acid sequences to identify transmembrane regions and their adjacent locations. The estimated model of transmembrane topology was based on the TMHMM model architecture. Only 7 super states are defined in the given dataset, which were converted to 96 states on the basis of their length in sequence. The prediction accuracy of the model was observed to be about 74%, which is good enough in the area of transmembrane topology prediction. We therefore conclude that the hidden Markov model plays a crucial role in transmembrane helix prediction on the MATLAB platform and could also be useful for drug discovery strategies. The database is available for free at bioinfonavneet@gmail.com, vinaysingh@bhu.ac.in.
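
    The decoding step of an HMM-based topology predictor can be illustrated with a minimal Viterbi sketch. This is a toy two-state model written in Python rather than MATLAB, not the 7 super state / 96 state TMHMM-style architecture described in the abstract. The state names, the hydrophobic residue set, and every probability below are assumptions chosen only for illustration.

```python
import math

# Toy two-state HMM: "M" = membrane-spanning, "O" = outside the membrane.
# All probabilities are illustrative assumptions, not values from the paper.
STATES = ("M", "O")
HYDROPHOBIC = set("AILMFVWC")

START = {"M": 0.2, "O": 0.8}
TRANS = {"M": {"M": 0.9, "O": 0.1}, "O": {"M": 0.1, "O": 0.9}}
# Emissions are collapsed to two symbols: hydrophobic ("h") or polar ("p").
EMIT = {"M": {"h": 0.8, "p": 0.2}, "O": {"h": 0.3, "p": 0.7}}


def viterbi(sequence):
    """Return the most probable state path for an amino acid sequence."""
    symbols = ["h" if aa in HYDROPHOBIC else "p" for aa in sequence]
    if not symbols:
        return ""
    # scores[s] holds the best log-probability of any path ending in state s.
    scores = {s: math.log(START[s]) + math.log(EMIT[s][symbols[0]]) for s in STATES}
    backpointers = []
    for sym in symbols[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev] + math.log(TRANS[best_prev][s]) + math.log(EMIT[s][sym])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))
```

    Working in log space avoids numerical underflow on long sequences. A realistic model would use per-residue emission probabilities estimated from annotated training data instead of the two-symbol alphabet assumed here.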

  16. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction.

    Science.gov (United States)

    Daberdaku, Sebastian; Ferrari, Carlo

    2018-02-06

    The correct determination of protein-protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein-Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. 
The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction

  17. Constraint Logic Programming approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Fogolari Federico

    2004-11-01

    Full Text Available Abstract Background The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. Conclusions The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.

  18. Constraint Logic Programming approach to protein structure prediction.

    Science.gov (United States)

    Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

    2004-11-30

    The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
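
    The geometric constraints of a face-centered cubic lattice model can be sketched in a few lines of Python. This is only an illustration of the lattice itself, not the authors' Constraint Logic Programming encoding, and it leaves out the secondary structure and disulfide bridge constraints. The helper name is_valid_fold is hypothetical, and the integer coordinate convention for lattice sites is an assumption.

```python
from itertools import product

# The 12 nearest-neighbour steps of the face-centred cubic lattice:
# vectors with two coordinates equal to +/-1 and one equal to 0.
FCC_STEPS = [v for v in product((-1, 0, 1), repeat=3) if sum(abs(c) for c in v) == 2]


def is_valid_fold(points):
    """Check that a chain of lattice points is connected and self-avoiding."""
    if len(set(points)) != len(points):
        return False  # two residues occupy the same lattice site
    for a, b in zip(points, points[1:]):
        step = tuple(q - p for p, q in zip(a, b))
        if step not in FCC_STEPS:
            return False  # consecutive residues are not lattice neighbours
    return True
```

    A constraint solver would state these two conditions declaratively and prune partial chains that already violate them, rather than checking complete conformations one by one as this sketch does.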

  19. Which clustering algorithm is better for predicting protein complexes?

    Directory of Open Access Journals (Sweden)

    Moschopoulos Charalampos N

    2011-12-01

    Full Text Available Abstract Background Protein-Protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks, which they comprise, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy, while it generates the fewest valid clusters of any reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm

  20. The 82-plex plasma protein signature that predicts increasing inflammation

    DEFF Research Database (Denmark)

    Tepel, Martin; Beck, Hans C; Tan, Qihua

    2015-01-01

    The objective of the study was to define the specific plasma protein signature that predicts the increase of the inflammation marker C-reactive protein from index day to next-day using proteome analysis and novel bioinformatics tools. We performed a prospective study of 91 incident kidney....... The prediction model selected and validated 82 plasma proteins which determined increased next-day C-reactive protein (area under receiver-operator-characteristics curve, 0.772; 95% confidence interval, 0.669 to 0.876; P signature (P ....001) was associated with observed increased next-day C-reactive protein. The 82-plex protein signature outperformed routine clinical procedures. The category-free net reclassification index improved with 82-plex plasma protein signature (total net reclassification index, 88.3%). Using the 82-plex plasma protein...

  1. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    Science.gov (United States)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
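
    The Q3 accuracy quoted above is the fraction of residues whose three-state label (helix, strand, or coil) is predicted correctly. A minimal Python sketch follows. The function name and the single-letter H/E/C encoding are assumptions for illustration and do not come from the DeepCNF software.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# One mismatch out of eight residues gives 7/8.
print(q3_accuracy("HHHEECCC", "HHHEEECC"))  # 0.875
```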

  2. Combining modularity, conservation, and interactions of proteins significantly increases precision and coverage of protein function prediction

    Directory of Open Access Journals (Sweden)

    Sers Christine T

    2010-12-01

    Full Text Available Abstract Background While the number of newly sequenced genomes and genes is constantly increasing, elucidation of their function still is a laborious and time-consuming task. This has led to the development of a wide range of methods for predicting protein functions in silico. We report on a new method that predicts function based on a combination of information about protein interactions, orthology, and the conservation of protein networks in different species. Results We show that aggregation of these independent sources of evidence leads to a drastic increase in number and quality of predictions when compared to baselines and other methods reported in the literature. For instance, our method generates more than 12,000 novel protein functions for human with an estimated precision of ~76%, among which are 7,500 new functional annotations for 1,973 human proteins that previously had zero or only one function annotated. We also verified our predictions on a set of genes that play an important role in colorectal cancer (MLH1, PMS2, EPHB4) and could confirm more than 73% of them based on evidence in the literature. Conclusions The combination of different methods into a single, comprehensive prediction method infers thousands of protein functions for every species included in the analysis at varying, yet always high levels of precision and very good coverage.

  3. Prediction of beta-turns in proteins using the first-order Markov models.

    Science.gov (United States)

    Lin, Thy-Hou; Wang, Ging-Ming; Wang, Yen-Tseng

    2002-01-01

    We present a method based on the first-order Markov models for predicting simple beta-turns and loops containing multiple turns in proteins. Sequences of 338 proteins in a database are divided using the published turn criteria into the following three regions, namely, the turn, the boundary, and the nonturn ones. A transition probability matrix is constructed for either the turn or the nonturn region using the weighted transition probabilities computed for dipeptides identified from each region. There are two such matrices constructed for the boundary region since the transition probabilities for dipeptides immediately preceding or following a turn are different. The window used for scanning a protein sequence from amino (N-) to carboxyl (C-) terminal is a hexapeptide since the transition probability computed for a turn tetrapeptide is capped at both the N- and C- termini with a boundary transition probability indexed respectively from the two boundary transition matrices. A sum of the averaged product of the transition probabilities of all the hexapeptides involving each residue is computed. This is then weighted with a probability computed from assuming that all the hexapeptides are from the nonturn region to give the final prediction quantity. Both simple beta-turns and loops containing multiple turns in a protein are then identified by the rising of the prediction quantity computed. The performance of the prediction scheme or the percentage (%) of correct prediction is evaluated through computation of Matthews correlation coefficients for each protein predicted. It is found that the prediction method is capable of giving prediction results with better correlation between the percent of correct prediction and the Matthews correlation coefficients for a group of test proteins as compared with those predicted using some secondary structural prediction methods. 
The prediction accuracy for about 40% of proteins in the database or 50% of proteins in the test set is
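
    The core ingredient of a first-order Markov model, a transition probability matrix built from dipeptide counts, can be sketched as follows. This is a plain, unweighted estimate under stated assumptions. The weighting scheme, the separate boundary matrices, and the hexapeptide averaging described in the abstract are not reproduced, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(segments):
    """Estimate P(next residue | current residue) from dipeptide counts."""
    counts = defaultdict(Counter)
    for segment in segments:
        for current, following in zip(segment, segment[1:]):
            counts[current][following] += 1
    return {
        current: {aa: n / sum(row.values()) for aa, n in row.items()}
        for current, row in counts.items()
    }


def chain_probability(peptide, probs):
    """Product of transition probabilities along a peptide (0.0 if a step is unseen)."""
    result = 1.0
    for current, following in zip(peptide, peptide[1:]):
        result *= probs.get(current, {}).get(following, 0.0)
    return result
```

    In a scheme of this kind, one such matrix would be estimated per region (turn, boundary, nonturn), and a window would be scored by comparing its chain probability under the competing matrices.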

  4. In Silico screening for functional candidates amongst hypothetical proteins

    Directory of Open Access Journals (Sweden)

    Sanderhoff May

    2009-09-01

    Full Text Available Abstract Background The definition of a hypothetical protein is a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of proteomes of human as well as of other eukaryotes. With the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool capable of pinpointing the minority of hypothetical proteins with a high probability of being expressed. Results Here, we present an in silico selection strategy where eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and the presence of characterized protein domains. To validate the selection strategy we applied it to a database of human hypothetical proteins dating to 2006 and compared the proteins predicted to be expressed by our selection strategy with their status in 2008. For the comparison we focused on mitochondrial proteins, since a considerable amount of research focused on this field between 2006 and 2008. Therefore, many proteins, defined as hypothetical in 2006, have later been characterized as mitochondrial. Conclusion Among all human proteins that were hypothetical in 2006, 21% have later been experimentally characterized and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the selected hypothetical proteins from the 2006 dataset, predicted by our strategy to have a mitochondrial role, 53-62% have later been experimentally characterized, and 85% of these have actually been assigned a role in mitochondria by 2008. Therefore our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.

  5. UK114, a YjgF/Yer057p/UK114 family protein highly conserved from bacteria to mammals, is localized in rat liver peroxisomes

    International Nuclear Information System (INIS)

    Antonenkov, Vasily D.; Ohlmeier, Steffen; Sormunen, Raija T.; Hiltunen, J. Kalervo

    2007-01-01

    Mammalian UK114 belongs to a highly conserved family of proteins with unknown functions. Although it is believed that UK114 is a cytosolic or mitochondrial protein there is no detailed study of its intracellular localization. Using analytical subcellular fractionation, electron microscopic colloidal gold technique, and two-dimensional gel electrophoresis of peroxisomal matrix proteins combined with mass spectrometric analysis we show here that a large portion of UK114 is present in rat liver peroxisomes. The peroxisomal UK114 is a soluble matrix protein and it is not inducible by the peroxisomal proliferator clofibrate. The data predict involvement of UK114 in peroxisomal metabolism

  6. Copper and zinc contamination in oysters: subcellular distribution and detoxification.

    Science.gov (United States)

    Wang, Wen-Xiong; Yang, Yubo; Guo, Xiaoyu; He, Mei; Guo, Feng; Ke, Caihuan

    2011-08-01

    Metal pollution levels in estuarine and coastal environments have been widely reported, but few documented reports exist of severe contamination in specific environments. Here, we report on a metal-contaminated estuary in Fujian Province, China, in which blue oysters (Crassostrea hongkongensis) and green oysters (Crassostrea angulata) were discovered to be contaminated with Cu and other metals. Extraordinarily high metal concentrations were found in the oysters collected from the estuary. Comparison with historical data suggests that the estuary has recently been contaminated with Cr, Cu, Ni, and Zn. Metal concentrations in blue oysters were as high as 1.4 and 2.4% of whole-body tissue dry wt for Cu and Zn, respectively. Cellular debris was the main subcellular fraction binding the metals, but metal-rich granules were important for Cr, Ni, and Pb. With increasing Cu accumulation, its partitioning into the cytosolic proteins decreased. In contrast, metallothionein-like proteins increased their importance in binding with Zn as tissue concentrations of Zn increased. In the most severely contaminated oysters, only a negligible fraction of their Cu and Zn was bound with the metal-sensitive fraction, which may explain the survival of oysters in such contaminated environments. Copyright © 2011 SETAC.

  7. Subcellular localization of cadmium in hyperaccumulator Populus ...

    African Journals Online (AJOL)

    In this study, subcellular localization of cadmium in hyperaccumulator grey poplar (Populus × canescens) was investigated by the transmission electron microscopy (TEM) method. Young Populus × canescens were grown and hydroponic experiments were conducted under four Cd2+ concentrations (10, 30, 50, and 70 μM) ...

  8. Nanoparticles-cell association predicted by protein corona fingerprints

    Science.gov (United States)

    Palchetti, S.; Digiacomo, L.; Pozzi, D.; Peruzzi, G.; Micarelli, E.; Mahmoudi, M.; Caracciolo, G.

    2016-06-01

    In a physiological environment (e.g., blood and interstitial fluids) nanoparticles (NPs) will bind proteins shaping a "protein corona" layer. The long-lived protein layer tightly bound to the NP surface is referred to as the hard corona (HC) and encodes information that controls NP bioactivity (e.g. cellular association, cellular signaling pathways, biodistribution, and toxicity). Decrypting this complex code has become a priority to predict the NP biological outcomes. Here, we use a library of 16 lipid NPs of varying size (Ø ~ 100-250 nm) and surface chemistry (unmodified and PEGylated) to investigate the relationships between NP physicochemical properties (nanoparticle size, aggregation state and surface charge), protein corona fingerprints (PCFs), and NP-cell association. We found that none of the NPs' physicochemical properties alone was exclusively able to account for association with the human cervical cancer cell line (HeLa). For the entire library of NPs, a total of 436 distinct serum proteins were detected. We developed a predictive-validation modeling that provides a means of assessing the relative significance of the identified corona proteins. Interestingly, only 8 PCFs, which make up a minor fraction of the HC, were identified as the main promoters of NP association with HeLa cells. Remarkably, the identified PCFs have several receptors with a high level of expression on the plasma membrane of HeLa cells.

  9. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Aboul-Magd Mohammed O

    2009-07-01

    Full Text Available Abstract Background Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input

  10. Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

    Directory of Open Access Journals (Sweden)

    Fabrizio Pucci

    Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, that distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm's is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.

  11. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  12. Exploiting protein flexibility to predict the location of allosteric sites

    Directory of Open Access Journals (Sweden)

    Panjkovich Alejandro

    2012-10-01

    Full Text Available Abstract Background Allostery is one of the most powerful and common ways of regulation of protein activity. However, for most allosteric proteins identified to date the mechanistic details of allosteric modulation are not yet well understood. Uncovering common mechanistic patterns underlying allostery would allow not only a better academic understanding of the phenomena, but it would also streamline the design of novel therapeutic solutions. This relatively unexplored therapeutic potential and the putative advantages of allosteric drugs over classical active-site inhibitors fuel the attention allosteric-drug research is receiving at present. A first step to harness the regulatory potential and versatility of allosteric sites, in the context of drug-discovery and design, would be to detect or predict their presence and location. In this article, we describe a simple computational approach, based on the effect allosteric ligands exert on protein flexibility upon binding, to predict the existence and position of allosteric sites on a given protein structure. Results By querying the literature and a recently available database of allosteric sites, we gathered 213 allosteric proteins with structural information that we further filtered into a non-redundant set of 91 proteins. We performed normal-mode analysis and observed significant changes in protein flexibility upon allosteric-ligand binding in 70% of the cases. These results agree with the current view that allosteric mechanisms are in many cases governed by changes in protein dynamics caused by ligand binding. Furthermore, we implemented an approach that achieves 65% positive predictive value in identifying allosteric sites within the set of predicted cavities of a protein (stricter parameters set, 0.22 sensitivity), by combining the current analysis on dynamics with previous results on structural conservation of allosteric sites. We also analyzed four biological examples in detail, revealing

  13. Predicting nucleic acid binding interfaces from structural models of proteins.

    Science.gov (United States)

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2012-02-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.

  14. Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

    Science.gov (United States)

    Wang, Leilei; Cheng, Jinyong

    2018-03-01

    Protein secondary structure prediction is an important research problem in bioinformatics. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover the construction of the model, the classification of parameters, and so on. The data set is the widely used CB513 protein data set. Accuracy is evaluated by 3-fold cross-validation, from which we obtain the Q3 accuracy. The results show that the autoencoder network improves the prediction accuracy of protein secondary structure.

  15. Interaction between a plasma membrane-localized ankyrin-repeat protein ITN1 and a nuclear protein RTV1

    Energy Technology Data Exchange (ETDEWEB)

    Sakamoto, Hikaru [Department of Bioproduction, Faculty of Bioindustry, Tokyo University of Agriculture, 196 Yasaka, Abashiri-shi, Hokkaido 093-2422 (Japan); Sakata, Keiko; Kusumi, Kensuke [Department of Biology, Faculty of Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581 (Japan); Kojima, Mikiko; Sakakibara, Hitoshi [RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045 (Japan); Iba, Koh, E-mail: koibascb@kyushu-u.org [Department of Biology, Faculty of Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581 (Japan)

    2012-06-29

    Highlights: ► ITN1, a plasma membrane ankyrin protein, interacts with a nuclear DNA-binding protein RTV1. ► The nuclear transport of RTV1 is partially inhibited by interaction with ITN1. ► RTV1 can promote the nuclear localization of ITN1. ► Both overexpression of RTV1 and the lack of ITN1 increase salicylic acid sensitivity in plants. -- Abstract: The increased tolerance to NaCl 1 (ITN1) protein is a plasma membrane (PM)-localized protein involved in responses to NaCl stress in Arabidopsis. The predicted structure of ITN1 is composed of multiple transmembrane regions and an ankyrin-repeat domain that is known to mediate protein-protein interactions. To elucidate the molecular functions of ITN1, we searched for interacting partners using a yeast two-hybrid assay, and a nuclear-localized DNA-binding protein, RTV1, was identified as a candidate. Bimolecular fluorescence complementation analysis revealed that RTV1 interacted with ITN1 at the PM and nuclei in vivo. RTV1 tagged with red fluorescent protein localized to nuclei and ITN1 tagged with green fluorescent protein localized to PM; however, both proteins localized to both nuclei and the PM when co-expressed. These findings suggest that RTV1 and ITN1 regulate the subcellular localization of each other.

  16. PRODIGY : a web server for predicting the binding affinity of protein-protein complexes

    NARCIS (Netherlands)

    Xue, Li; Garcia Lopes Maia Rodrigues, João; Kastritis, Panagiotis L; Bonvin, Alexandre Mjj; Vangone, Anna

    2016-01-01

    Gaining insights into the structural determinants of protein-protein interactions holds the key for a deeper understanding of biological functions, diseases and development of therapeutics. An important aspect of this is the ability to accurately predict the binding strength for a given

  17. Prediction of Protein-Protein Interaction By Metasample-Based Sparse Representation

    Directory of Open Access Journals (Sweden)

    Xiuquan Du

    2015-01-01

    Full Text Available Protein-protein interactions (PPIs) play key roles in many cellular processes such as transcription regulation, cell metabolism, and endocrine function. Understanding these interactions greatly advances knowledge of the pathogenesis and treatment of various diseases. A large amount of data has been generated by experimental techniques; however, most of these data are incomplete or noisy, and current biological experimental techniques are time-consuming and expensive. In this paper, we propose a novel method (metasample-based sparse representation classification, MSRC) for PPI prediction. A group of metasamples is extracted from the original training samples, and the l1-regularized least squares method is then used to express a new testing sample as a linear combination of these metasamples. PPI prediction is achieved by using a discrimination function defined on the representation coefficients. When MSRC is applied to a PPI dataset, it achieves 84.9% sensitivity and 94.55% specificity, which is slightly lower than support vector machine (SVM) and much higher than naive Bayes (NB), neural networks (NN), and k-nearest neighbor (KNN). The results show that MSRC is efficient for PPI prediction.

  18. Optimization of Ruminococcus albus endoglucanase Cel5-CBM6 production in plants by incorporating an ELP tag and targeting to different subcellular compartments

    Energy Technology Data Exchange (ETDEWEB)

    Pereira, E.O.; Menassa, R. [Western Ontario Univ., London, ON (Canada). Dept. of Biology; Agriculture and Agri-Food Canada, London, ON (Canada); Kolotilin, I. [Agriculture and Agri-Food Canada, London, ON (Canada)

    2009-07-01

    The production of biomass-based biofuel such as ethanol depends on the deconstruction of a cellulosic matrix and requires a variety of enzymes that hydrolyze glycosidic bonds to release fermentable sugars. Endoglucanases are one of the most important groups of natural cellulosic hydrolytic enzymes that act on cellulose. In order to decrease ethanol production costs, the cost of producing cellulases must also be reduced. Genetically engineered transgenic plants are among the most economical systems for large scale production of recombinant proteins because of the large amount of enzymes that can be produced with minimal input. Cellulases present different levels of expression in different subcellular compartments. Cel5-CBM6 is a fused protein containing an endocellulase from Ruminococcus albus (Cel5) and a cellulose binding domain (CBD) of Clostridium stercorarium. It accumulates in both the chloroplast and cytoplasm, but severe growth defects occur when it is expressed in the cytoplasm. Therefore, other subcellular compartments such as endoplasmic reticulum (ER) and vacuole must be evaluated and compared to determine the best compartment for production and activity of cellulases. Since elastin-like polypeptide (ELP) has also been shown to increase recombinant protein accumulation in plants, this study evaluated the effects of incorporating an ELP tag and a retrieval signal peptide on the expression levels of Cel5-CBM6.

  19. RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method

    KAUST Repository

    Ganesan, Pugalenthi; Kandaswamy, Krishna Kumar Umar; Chou -, Kuochen; Vivekanandan, Saravanan; Kolatkar, Prasanna R.

    2012-01-01

    Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57% and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
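
    The two-state labelling at several thresholds described in this abstract can be illustrated with a short sketch. It is not the RSARF program: the function names are hypothetical, and the rule that a residue counts as exposed when its relative accessibility is strictly greater than the threshold is an assumption, since the abstract does not state the comparison used.

```python
# Thresholds (percent relative solvent accessibility) listed in the abstract.
THRESHOLDS = (0, 5, 10, 25, 50)


def two_state_labels(rsa_percent, thresholds=THRESHOLDS):
    """Return a buried/exposed label for one residue at each threshold.

    Assumption: a value strictly greater than the threshold means
    exposed ("E"); otherwise the residue is buried ("B").
    """
    return {t: ("E" if rsa_percent > t else "B") for t in thresholds}


def accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("label sequences must have equal length")
    if not predicted:
        raise ValueError("label sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)
```

    A per-threshold accuracy such as those quoted above would then be obtained by calling accuracy once per threshold on the predicted and observed label sequences.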

  20. BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

    Science.gov (United States)

    Kaur, Harpreet; Raghava, G P S

    2002-03-01

    Beta-turns play an important role from a structural and functional point of view. Beta-turns are the most common type of non-repetitive structures in proteins and comprise, on average, 25% of the residues. In the past, numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-TURNS in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows the user to predict different types of beta-TURNS, e.g., type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-TURNS in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/

  1. Subcellular compartmentalization of Cd and Zn in two bivalves. II. Significance of trophically available metal (TAM)

    Science.gov (United States)

    Wallace, W.G.; Luoma, S.N.

    2003-01-01

    This paper examines how the subcellular partitioning of Cd and Zn in the bivalves Macoma balthica and Potamocorbula amurensis may affect the trophic transfer of metal to predators. Results show that the partitioning of metals to organelles, 'enzymes' and metallothioneins (MT) comprise a subcellular compartment containing trophically available metal (TAM; i.e. metal trophically available to predators), and that because this partitioning varies with species, animal size and metal, TAM is similarly influenced. Clams from San Francisco Bay, California, were exposed for 14 d to 3.5 μg l-1 Cd and 20.5 μg l-1 Zn, including 109Cd and 65Zn as radiotracers, and were used in feeding experiments with grass shrimp Palaemon macrodactylus, or used to investigate the subcellular partitioning of metal. Grass shrimp fed Cd-contaminated P. amurensis absorbed ~60% of ingested Cd, which was in accordance with the partitioning of Cd to the bivalve's TAM compartment (i.e. Cd associated with organelles, 'enzymes' and MT); a similar relationship was found in previous studies with grass shrimp fed Cd-contaminated oligochaetes. Thus, TAM may be used as a tool to predict the trophic transfer of at least Cd. Subcellular fractionation revealed that ~34% of both the Cd and Zn accumulated by M. balthica was associated with TAM, while partitioning to TAM in P. amurensis was metal-dependent (~60% for TAM-Cd%, ~73% for TAM-Zn%). The greater TAM-Cd% of P. amurensis than M. balthica is due to preferential binding of Cd to MT and 'enzymes', while enhanced TAM-Zn% of P. amurensis results from a greater binding of Zn to organelles. TAM for most species-metal combinations was size-dependent, decreasing with increased clam size. Based on field data, it is estimated that of the 2 bivalves, P. amurensis poses the greater threat of Cd exposure to predators because of higher tissue concentrations and greater partitioning as TAM; exposure of Zn to predators would be similar between these species.

  2. Critical Features of Fragment Libraries for Protein Structure Prediction.

    Science.gov (United States)

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, the effect of different fragment sizes, and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  3. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering.

    Science.gov (United States)

    Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina

    2015-03-01

    Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they face certain disadvantages: they require parameter tuning, some of them cannot handle weighted PPI data, and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. Using publicly available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. Specifically, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein to exactly one cluster and the difficulties they face concerning the parameter tuning. This fact was experimentally validated and moreover, new

  4. Integration of relational and hierarchical network information for protein function prediction

    Directory of Open Access Journals (Sweden)

    Jiang Xiaoyu

    2008-08-01

    Full Text Available Abstract Background: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions. Results: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased

  5. Multiple-Localization and Hub Proteins

    Science.gov (United States)

    Ota, Motonori; Gonja, Hideki; Koike, Ryotaro; Fukuchi, Satoshi

    2016-01-01

    Protein-protein interactions are fundamental for all biological phenomena, and protein-protein interaction networks provide a global view of the interactions. The hub proteins, with many interaction partners, play vital roles in the networks. We investigated the subcellular localizations of proteins in the human network, and found that the ones localized in multiple subcellular compartments, especially the nucleus/cytoplasm proteins (NCP), the cytoplasm/cell membrane proteins (CMP), and the nucleus/cytoplasm/cell membrane proteins (NCMP), tend to be hubs. Examinations of keywords suggested that among NCP, those related to post-translational modifications and transcription functions are the major contributors to the large number of interactions. These types of proteins are characterized by a multi-domain architecture and intrinsic disorder. A survey of the typical hub proteins with prominent numbers of interaction partners in the type revealed that most are either transcription factors or co-regulators involved in signaling pathways. They translocate from the cytoplasm to the nucleus, triggered by the phosphorylation and/or ubiquitination of intrinsically disordered regions. Among CMP and NCMP, the contributors to the numerous interactions are related to either kinase or ubiquitin ligase activity. Many of them reside on the cytoplasmic side of the cell membrane, and act as the upstream regulators of signaling pathways. Overall, these hub proteins function to transfer external signals to the nucleus, through the cell membrane and the cytoplasm. Our analysis suggests that multiple-localization is a crucial concept to characterize groups of hub proteins and their biological functions in cellular information processing. PMID:27285823

  6. Prediction of RNA-Binding Proteins by Voting Systems

    Directory of Open Access Journals (Sweden)

    C. R. Peng

    2011-01-01

    Full Text Available It is important to identify which proteins can interact with RNA for the purpose of protein annotation, since interactions between RNA and proteins influence the structure of the ribosome and play important roles in gene expression. This paper tries to identify proteins that can interact with RNA using voting systems. First, 34 learning algorithms are chosen through Weka for investigation. Then a simple majority voting system (SMVS) is used for the prediction of RNA-binding proteins, achieving an average ACC (overall prediction accuracy) value of 79.72% and an MCC (Matthews correlation coefficient) value of 59.77% for the independent testing dataset. Then the mRMR (minimum redundancy maximum relevance) strategy is applied to algorithm selection. In addition, the MCC value of each classifier is assigned as the weight of that classifier's vote. As a result, the best average MCC value, 64.70% for the independent testing dataset, is attained when 22 algorithms are selected and integrated through weighted votes; the corresponding ACC value is 82.04%.
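
    The two voting schemes named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' Weka-based pipeline: the function names, the 0/1 encoding of class labels, and the rule that ties go to the negative class are all assumptions made here. The MCC formula itself is the standard definition computed from confusion-matrix counts.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def majority_vote(votes):
    """Simple majority over 0/1 votes; ties go to class 0 (assumption)."""
    return 1 if 2 * sum(votes) > len(votes) else 0


def weighted_vote(votes, weights):
    """Weighted vote over 0/1 votes, e.g. with each classifier's MCC as weight."""
    if len(votes) != len(weights):
        raise ValueError("one weight is required per vote")
    positive = sum(w for v, w in zip(votes, weights) if v == 1)
    negative = sum(w for v, w in zip(votes, weights) if v == 0)
    return 1 if positive > negative else 0
```

    With three hypothetical classifiers voting [1, 1, 0], the simple majority predicts the positive class, whereas weights of 0.2, 0.2 and 0.7 let the single stronger classifier override the other two.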

  7. Seasonal variations in hepatic Cd and Cu concentrations and in the sub-cellular distribution of these metals in juvenile yellow perch (Perca flavescens)

    International Nuclear Information System (INIS)

    Kraemer, Lisa D.; Campbell, Peter G.C.; Hare, Landis

    2006-01-01

    Temporal fluctuations in metal (Cd and Cu) concentrations were monitored over four months (May to August) in the liver of juvenile yellow perch (Perca flavescens) sampled from four lakes situated along a metal concentration gradient in northwestern Quebec: Lake Opasatica (reference lake, low metal concentrations), Lake Vaudray (moderate metal concentrations) and lakes Osisko and Dufault (high metal levels). The objectives of this study were to determine if hepatic metal concentrations and metal-handling strategies at the sub-cellular level varied seasonally. Our results showed that Cd and Cu concentrations varied most, in both absolute and relative values, in fish with the highest hepatic metal concentrations, whereas fish sampled from the reference lake did not show any significant variation. To examine the sub-cellular partitioning of these two metals, we used a differential centrifugation technique that allowed the separation of cellular debris, metal detoxified fractions (heat-stable proteins such as metallothionein) and metal sensitive fractions (heat-denaturable proteins (HDP) and organelles). Whereas Cd concentrations in organelle and HDP fractions were maintained at low concentrations in perch from Lakes Opasatica and Vaudray, concentrations in these sensitive fractions were higher and more variable in perch from Lakes Dufault and Osisko, suggesting that there may be some liver dysfunction in these two fish populations. Similarly, Cu concentrations in these sensitive fractions were higher and more variable in perch from the two most Cu-contaminated lakes (Dufault and Osisko) than in perch from the other two lakes, suggesting a breakdown of homeostatic control over this metal. These results suggest not only that metal concentrations vary seasonally, but also that concentrations vary most in fish from contaminated sites. 
Furthermore, at the sub-cellular level, homeostatic control of metal concentrations in metal-sensitive fractions is difficult to maintain in

  8. Fast dynamics perturbation analysis for prediction of protein functional sites

    Directory of Open Access Journals (Sweden)

    Cohn Judith D

    2008-01-01

    Full Text Available Abstract Background: We present a fast version of the dynamics perturbation analysis (DPA) algorithm to predict functional sites in protein structures. The original DPA algorithm finds regions in proteins where interactions cause a large change in the protein conformational distribution, as measured using the relative entropy Dx. Such regions are associated with functional sites. Results: The Fast DPA algorithm, which accelerates DPA calculations, is motivated by an empirical observation that Dx in a normal-modes model is highly correlated with an entropic term that only depends on the eigenvalues of the normal modes. The eigenvalues are accurately estimated using first-order perturbation theory, resulting in an N-fold reduction in the overall computational requirements of the algorithm, where N is the number of residues in the protein. The performance of the original and Fast DPA algorithms was compared using protein structures from a standard small-molecule docking test set. For nominal implementations of each algorithm, top-ranked Fast DPA predictions overlapped the true binding site 94% of the time, compared to 87% of the time for original DPA. In addition, per-protein recall statistics (fraction of binding-site residues that are among predicted residues) were slightly better for Fast DPA. On the other hand, per-protein precision statistics (fraction of predicted residues that are among binding-site residues) were slightly better using original DPA. Overall, the performance of Fast DPA in predicting ligand-binding-site residues was comparable to that of the original DPA algorithm. Conclusion: Compared to the original DPA algorithm, the decreased run time with comparable performance makes Fast DPA well-suited for implementation on a web server and for high-throughput analysis.
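
    The per-protein recall and precision statistics defined in this abstract translate directly into set operations. The sketch below only illustrates those two definitions; it is not part of DPA or Fast DPA, the function name is hypothetical, and returning 0.0 for an empty set is an assumption.

```python
def recall_precision(predicted_residues, binding_site_residues):
    """Per-protein recall and precision for a predicted functional site.

    recall    = fraction of binding-site residues that are among predicted residues
    precision = fraction of predicted residues that are among binding-site residues
    Empty inputs yield 0.0 for the affected statistic (assumption).
    """
    predicted = set(predicted_residues)
    binding_site = set(binding_site_residues)
    hits = len(predicted & binding_site)
    recall = hits / len(binding_site) if binding_site else 0.0
    precision = hits / len(predicted) if predicted else 0.0
    return recall, precision
```

    For example, if residues 10 to 13 are predicted and the true site is residues 12 to 14, two of three site residues are recovered and two of four predictions are correct.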

  9. BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference.

    Science.gov (United States)

    Garcia-Garcia, Javier; Schleker, Sylvia; Klein-Seetharaman, Judith; Oliva, Baldo

    2012-07-01

    Protein-protein interactions (PPIs) play a crucial role in biology, and high-throughput experiments have greatly increased the coverage of known interactions. Still, identification of complete inter- and intraspecies interactomes is far from being complete. Experimental data can be complemented by the prediction of PPIs within an organism or between two organisms based on the known interactions of the orthologous genes of other organisms (interologs). Here, we present the BIANA (Biologic Interactions and Network Analysis) Interolog Prediction Server (BIPS), which offers a web-based interface to facilitate PPI predictions based on interolog information. BIPS benefits from the capabilities of the framework BIANA to integrate the several PPI-related databases. Additional metadata can be used to improve the reliability of the predicted interactions. Sensitivity and specificity of the server have been calculated using known PPIs from different interactomes using a leave-one-out approach. The specificity is between 72 and 98%, whereas sensitivity varies between 1 and 59%, depending on the sequence identity cut-off used to calculate similarities between sequences. BIPS is freely accessible at http://sbi.imim.es/BIPS.php.
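
    The interolog idea summarized in this abstract, transferring a known interaction to the orthologs of both partners subject to a sequence identity cut-off, can be sketched as below. This is not the BIPS or BIANA code: the function name, the input data structures, the default cut-off of 40% and all protein identifiers are hypothetical and chosen only for illustration.

```python
def predict_interologs(known_ppis, orthologs, min_identity=40.0):
    """Transfer known interactions to a target organism via orthologs.

    known_ppis: iterable of (protein_a, protein_b) pairs from template organisms.
    orthologs:  dict mapping a template protein to a list of
                (target_protein, percent_identity) tuples.
    A pair is predicted only if both ortholog mappings meet the cut-off.
    """
    predicted = set()
    for a, b in known_ppis:
        for target_a, identity_a in orthologs.get(a, []):
            if identity_a < min_identity:
                continue
            for target_b, identity_b in orthologs.get(b, []):
                if identity_b < min_identity:
                    continue
                # Store pairs in sorted order so (x, y) and (y, x) coincide.
                predicted.add(tuple(sorted((target_a, target_b))))
    return predicted
```

    Lowering min_identity admits more distant orthologs, so more interactions are predicted at the price of more false positives, which is consistent with the dependence of sensitivity and specificity on the cut-off reported above.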

  10. Integrative approaches to the prediction of protein functions based on the feature selection

    Directory of Open Access Journals (Sweden)

    Lee Hyunju

    2009-12-01

    Full Text Available Abstract Background: Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results: We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including protein-domains, protein-protein interactions, gene expressions, phenotype ontology, phylogenetic profiles and disease data sources, to predict protein functions that are labelled with Gene Ontology (GO) terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel based logistic regression (KLR), and a kernel based L1-norm regularized logistic regression (KL1LR). In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among

  11. A probabilistic fragment-based protein structure prediction algorithm.

    Directory of Open Access Journals (Sweden)

    David Simoncini

    Full Text Available Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].

  12. G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.

    Science.gov (United States)

    Lee, Hui Sun; Im, Wonpil

    2017-01-01

    Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and that of proteins whose function is annotated with high accuracy. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small-molecule ligand binding sites and ligand structures using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also present a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.

  13. Predicting Protein Function via Semantic Integration of Multiple Networks.

    Science.gov (United States)

    Yu, Guoxian; Fu, Guangyuan; Wang, Jun; Zhu, Hailong

    2016-01-01

    Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapid accumulation of large volumes of proteomic and genomic data drives the development of computational models for automatically predicting protein function at large scale. Recent approaches focus on integrating multiple heterogeneous data sources, and they often get better results than methods that use a single data source alone. In this paper, we investigate how to integrate multiple biological data sources with biological knowledge, i.e., Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogeneous data sources. SimNet first utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on the similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experimental results on heterogeneous proteomic data sources of Yeast, Human, Mouse, and Fly show that SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The Matlab codes of SimNet are available at https://sites.google.com/site/guoxian85/simnet.
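
    The composite network described above, a weighted summation of individual networks with weights tied to agreement with a semantic kernel, can be sketched as follows. This is a simplified stand-in, not SimNet's actual optimization: here each weight is taken proportional to the normalized Frobenius alignment between a network and the kernel, and the function names are hypothetical.

```python
import math


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(network, kernel):
    """Cosine-style alignment between a network matrix and the kernel."""
    denom = math.sqrt(frobenius_inner(network, network) * frobenius_inner(kernel, kernel))
    return frobenius_inner(network, kernel) / denom if denom else 0.0


def composite_network(networks, kernel):
    """Weight each network by its (non-negative) alignment and sum them."""
    scores = [max(0.0, alignment(net, kernel)) for net in networks]
    total = sum(scores)
    if total:
        weights = [s / total for s in scores]
    else:
        # Assumption: fall back to equal weights when nothing aligns.
        weights = [1.0 / len(networks)] * len(networks)
    n = len(kernel)
    combined = [
        [sum(w * net[i][j] for w, net in zip(weights, networks)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined
```

    A network whose edges contradict the semantic kernel receives a weight near zero under this scheme, so it contributes little to the composite network passed to the classifier.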

  14. Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

    Science.gov (United States)

    Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

    2017-10-12

    Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we calculated seven short test proteins (10-46 residues) in the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure by 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structural prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.

  15. Feature-Based and String-Based Models for Predicting RNA-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Donald Adjeroh

    2018-03-01

    Full Text Available In this work, we study two approaches for the problem of RNA-Protein Interaction (RPI). In the first approach, we use a feature-based technique by combining extracted features from both sequences and secondary structures. The feature-based approach enhanced the prediction accuracy as it included much more available information about the RNA-protein pairs. In the second approach, we apply search algorithms and data structures to extract effective string patterns for prediction of RPI, using both sequence information (protein and RNA sequences) and structure information (protein and RNA secondary structures). This led to different string-based models for predicting interacting RNA-protein pairs. We show results that demonstrate the effectiveness of the proposed approaches, including comparative results against leading state-of-the-art methods.

  16. Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords.

    Science.gov (United States)

    Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao

    2015-01-01

    For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. A specific keyword that is directly related to interaction, such as "bind" or "interact", plays an important role in training classifiers. We call it a dominant keyword, one that affects the capability of the classifier. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is tentatively assumed to be a dominant keyword. Then the classifiers are separately trained from the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers. The assumption is updated according to the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction.

  17. Evaluation of multiple protein docking structures using correctly predicted pairwise subunits

    Directory of Open Access Journals (Sweden)

    Esquivel-Rodríguez Juan

    2012-03-01

    Full Text Available Abstract. Background: Many functionally important proteins in a cell form complexes with multiple chains. Therefore, computational prediction of multiple protein complexes is an important task in bioinformatics. In the development of multiple protein docking methods, it is important to establish a metric for evaluating prediction results in a reasonable and practical fashion. However, since there are only few works done in developing methods for multiple protein docking, there is no study that investigates how accurate structural models of multiple protein complexes should be to allow scientists to gain biological insights. Methods: We generated a series of predicted models (decoys) of various accuracies by our multiple protein docking pipeline, Multi-LZerD, for three multi-chain complexes with 3, 4, and 6 chains. We analyzed the decoys in terms of the number of correctly predicted pair conformations in the decoys. Results and conclusion: We found that pairs of chains with the correct mutual orientation exist even in the decoys with a large overall root mean square deviation (RMSD) to the native. Therefore, in addition to a global structure similarity measure, such as the global RMSD, the quality of models for multiple chain complexes can be better evaluated by using the local measurement, the number of chain pairs with correct mutual orientation. We termed the fraction of correctly predicted pairs (RMSD at the interface of less than 4.0 Å) as fpair and propose to use it for evaluation of the accuracy of multiple protein docking.
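
    The fpair measure defined above reduces to a simple ratio once an interface RMSD is available for each chain pair. The sketch below assumes those RMSD values have already been computed elsewhere and are passed in as a dictionary keyed by chain pair; which pairs are included (for example, only pairs in contact in the native complex) is an assumption left to the caller, and the function name is hypothetical.

```python
def fpair(interface_rmsds, cutoff=4.0):
    """Fraction of chain pairs whose interface RMSD (in angstroms) is below `cutoff`.

    `interface_rmsds` maps a chain pair such as ("A", "B") to its interface RMSD.
    """
    if not interface_rmsds:
        raise ValueError("at least one chain pair is required")
    correct = sum(1 for rmsd in interface_rmsds.values() if rmsd < cutoff)
    return correct / len(interface_rmsds)


# Hypothetical four-pair decoy: two of the four interfaces are below 4.0 angstroms.
example = {("A", "B"): 2.1, ("A", "C"): 6.5, ("B", "C"): 3.9, ("C", "D"): 4.0}
score = fpair(example)
```

    A decoy can therefore score well on fpair even when its global RMSD is large, which is the situation the abstract describes.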

  18. Fast computational methods for predicting protein structure from primary amino acid sequence

    Science.gov (United States)

    Agarwal, Pratul Kumar [Knoxville, TN]

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  19. Predicting Secretory Proteins with SignalP

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2017-01-01

    SignalP is the currently most widely used program for prediction of signal peptides from amino acid sequences. Proteins with signal peptides are targeted to the secretory pathway, but are not necessarily secreted. After a brief introduction to the biology of signal peptides and the history...

  20. Lipase genes in Mucor circinelloides: identification, sub-cellular location, phylogenetic analysis and expression profiling during growth and lipid accumulation.

    Science.gov (United States)

    Zan, Xinyi; Tang, Xin; Chu, Linfang; Zhao, Lina; Chen, Haiqin; Chen, Yong Q; Chen, Wei; Song, Yuanda

    2016-10-01

    Lipases or triacylglycerol hydrolases are widely spread in nature and are particularly common in the microbial world. The filamentous fungus Mucor circinelloides is a potential lipase producer, as it grows well in triacylglycerol-containing culture media. So far only one lipase from M. circinelloides has been characterized, while the majority of lipases remain unknown in this fungus. In the present study, 47 potential lipase genes in M. circinelloides WJ11 and 30 potential lipase genes in M. circinelloides CBS 277.49 were identified by extensive bioinformatics analysis. An overview of these lipases is presented, including several characteristics, sub-cellular location, phylogenetic analysis and expression profiling of the lipase genes during growth and lipid accumulation. All of these proteins contained the consensus sequence for a classical lipase (GXSXG motif) and were divided into four types including α/β-hydrolase_1, α/β-hydrolase_3, class_3 and GDSL lipase (GDSL) based on gene annotations. Phylogenetic analyses revealed that the class_3 family and the α/β-hydrolase_3 family were the conserved lipase families in M. circinelloides. Additionally, some lipases also contained a typical acyltransferase motif of H-(X)4-D, and these lipases may play a dual role in lipid metabolism, catalyzing both lipid hydrolysis and transacylation reactions. The differential expression of all lipase genes was confirmed by quantitative real-time PCR, and the expression profiles were analyzed to predict the possible biological roles of these lipase genes in lipid metabolism in M. circinelloides. We preliminarily hypothesized that lipases may be involved in triacylglycerol degradation, phospholipid synthesis and beta-oxidation. Moreover, the results of sub-cellular localization, the presence of signal peptide and transcriptional analyses of lipase genes indicated that four lipases in WJ11 most likely belong to extracellular lipases with a signal peptide. These findings provide a platform

  1. Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework

    Science.gov (United States)

    2014-01-01

    Motivation Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set. PMID:24646119

  2. Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework.

    Science.gov (United States)

    Simha, Ramanuja; Shatkay, Hagit

    2014-03-19

    Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.

  3. Adaptive compressive learning for prediction of protein-protein interactions from primary sequence.

    Science.gov (United States)

    Zhang, Ya-Nan; Pan, Xiao-Yong; Huang, Yan; Shen, Hong-Bin

    2011-08-21

    Protein-protein interactions (PPIs) play an important role in biological processes. Although much effort has been devoted to the identification of novel PPIs by integrating experimental biological knowledge, many difficulties remain because sufficient protein structural and functional information is lacking. It is highly desirable to develop methods based only on amino acid sequences for predicting PPIs. However, sequence-based predictors often struggle with high dimensionality, which causes over-fitting and high computational complexity, as well as with the redundancy of sequential feature vectors. In this paper, a novel computational approach based on compressed sensing theory is proposed to predict yeast Saccharomyces cerevisiae PPIs from primary sequence and has achieved promising results. The key advantage of the proposed compressed sensing algorithm is that it can compress the original high-dimensional protein sequential feature vector into a much lower but more condensed space, taking the sparsity property of the original signal into account. What makes compressed sensing much more attractive in protein sequence analysis is that its compressed signal can be reconstructed from far fewer measurements than what is usually considered necessary in traditional Nyquist sampling theory. Experimental results demonstrate that the proposed compressed sensing method is powerful for analyzing noisy biological data and reducing redundancy in feature vectors. The proposed method represents a new strategy for dealing with high-dimensional protein discrete models and has great potential to be extended to many other complicated biological systems. Copyright © 2011 Elsevier Ltd. All rights reserved.

  4. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  5. An Updated View of Translocator Protein (TSPO)

    Directory of Open Access Journals (Sweden)

    Nunzio Denora

    2017-12-01

    Full Text Available Decades of study on the role of mitochondria in living cells have evidenced the importance of the 18 kDa mitochondrial translocator protein (TSPO), first discovered in 1977 as an alternative binding site for the benzodiazepine diazepam in the kidneys. This protein participates in a variety of cellular functions, including cholesterol transport, steroid hormone synthesis, mitochondrial respiration, permeability transition pore opening, apoptosis, and cell proliferation. Thus, TSPO has become an extremely attractive subcellular target for the early detection of disease states that involve the overexpression of this protein and for selective mitochondrial drug delivery. This special issue was programmed with the aim of summarizing the latest findings about the role of TSPO in eukaryotic cells and as a potential subcellular target of diagnostics or therapeutics. A total of 9 papers have been accepted for publication in this issue, in particular, 2 reviews and 7 primary data manuscripts, overall describing the main advances in this field.

  6. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

    Science.gov (United States)

    Chira, Camelia; Horvath, Dragos; Dumitrescu, D

    2011-07-30

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum-energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic - polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optimum. The main features of the resulting evolutionary algorithm - hill-climbing mechanism and diversification strategy - are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact to the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
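
    For readers unfamiliar with the hydrophobic-polar (HP) lattice model, the quantity being minimized can be shown in a few lines. The sketch below uses the standard HP convention on a two-dimensional square lattice, where each pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain contributes -1 to the energy. It illustrates only the objective function, not the evolutionary algorithm of the paper, and the function name is hypothetical.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D lattice conformation under the standard HP model.

    `sequence` is a string of 'H' and 'P'; `coords` is a list of (x, y) lattice points,
    one per residue, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

    A hill-climbing operator in this setting would propose a small change to `coords`, keep it only if `hp_energy` does not increase, and repeat.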

  7. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Chira Camelia

    2011-07-01

    Full Text Available Abstract Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum-energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic - polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optimum. The main features of the resulting evolutionary algorithm - hill-climbing mechanism and diversification strategy - are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact to the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.

  8. An update of the DEF database of protein fold class predictions

    DEFF Research Database (Denmark)

    Reczko, Martin; Karras, Dimitris; Bohr, Henrik

    1997-01-01

    An update is given on the Database of Expected Fold classes (DEF) that contains a collection of fold-class predictions made from protein sequences and a mail server that provides new predictions for new sequences. To any given sequence one of 49 fold-classes is chosen to classify the structure related to the sequence with high accuracy. The updated prediction system is developed using data from the new version of the 3D-ALI database of aligned protein structures and thus is giving more reliable and more detailed predictions than the previous DEF system.

  9. Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

    KAUST Repository

    Cannistraci, Carlo

    2013-06-21

    Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules.
The

  10. Plant Proteins Are Smaller Because They Are Encoded by Fewer Exons than Animal Proteins.

    Science.gov (United States)

    Ramírez-Sánchez, Obed; Pérez-Rodríguez, Paulino; Delaye, Luis; Tiessen, Axel

    2016-12-01

    Protein size is an important biochemical feature since longer proteins can harbor more domains and therefore can display more biological functionalities than shorter proteins. We found remarkable differences in protein length, exon structure, and domain count among different phylogenetic lineages. While eukaryotic proteins have an average size of 472 amino acid residues (aa), average protein sizes in plant genomes are smaller than those of animals and fungi. Proteins unique to plants are ∼81aa shorter than plant proteins conserved among other eukaryotic lineages. The smaller average size of plant proteins could neither be explained by endosymbiosis nor subcellular compartmentation nor exon size, but rather due to exon number. Metazoan proteins are encoded on average by ∼10 exons of small size [∼176 nucleotides (nt)]. Streptophyta have on average only ∼5.7 exons of medium size (∼230nt). Multicellular species code for large proteins by increasing the exon number, while most unicellular organisms employ rather larger exons (>400nt). Among subcellular compartments, membrane proteins are the largest (∼520aa), whereas the smallest proteins correspond to the gene ontology group of ribosome (∼240aa). Plant genes are encoded by half the number of exons and also contain fewer domains than animal proteins on average. Interestingly, endosymbiotic proteins that migrated to the plant nucleus became larger than their cyanobacterial orthologs. We thus conclude that plants have proteins larger than bacteria but smaller than animals or fungi. Compared to the average of eukaryotic species, plants have ∼34% more but ∼20% smaller proteins. This suggests that photosynthetic organisms are unique and deserve therefore special attention with regard to the evolutionary forces acting on their genomes and proteomes. Copyright © 2016 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  11. Plant Proteins Are Smaller Because They Are Encoded by Fewer Exons than Animal Proteins

    Directory of Open Access Journals (Sweden)

    Obed Ramírez-Sánchez

    2016-12-01

    Full Text Available Protein size is an important biochemical feature since longer proteins can harbor more domains and therefore can display more biological functionalities than shorter proteins. We found remarkable differences in protein length, exon structure, and domain count among different phylogenetic lineages. While eukaryotic proteins have an average size of 472 amino acid residues (aa), average protein sizes in plant genomes are smaller than those of animals and fungi. Proteins unique to plants are ∼81 aa shorter than plant proteins conserved among other eukaryotic lineages. The smaller average size of plant proteins could neither be explained by endosymbiosis nor subcellular compartmentation nor exon size, but rather due to exon number. Metazoan proteins are encoded on average by ∼10 exons of small size [∼176 nucleotides (nt)]. Streptophyta have on average only ∼5.7 exons of medium size (∼230 nt). Multicellular species code for large proteins by increasing the exon number, while most unicellular organisms employ rather larger exons (>400 nt). Among subcellular compartments, membrane proteins are the largest (∼520 aa), whereas the smallest proteins correspond to the gene ontology group of ribosome (∼240 aa). Plant genes are encoded by half the number of exons and also contain fewer domains than animal proteins on average. Interestingly, endosymbiotic proteins that migrated to the plant nucleus became larger than their cyanobacterial orthologs. We thus conclude that plants have proteins larger than bacteria but smaller than animals or fungi. Compared to the average of eukaryotic species, plants have ∼34% more but ∼20% smaller proteins. This suggests that photosynthetic organisms are unique and deserve therefore special attention with regard to the evolutionary forces acting on their genomes and proteomes.

  12. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    Science.gov (United States)

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.

  13. Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

    Science.gov (United States)

    Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal

    2015-01-01

    Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can be further improved with the use of a custom-designed fuzzy membership function for the partner-specific PPI interface prediction problem. We evaluated the performances of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes, Homo sapiens, Escherichia coli and Saccharomyces cerevisiae, and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in a 10-fold cross-validation experiment is measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms, respectively. Performances on independent test sets are obtained as 72.09, 73.24 and 82.74 percent, respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
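
    The performance figures above are areas under the ROC curve. As a reminder of what that number means, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. It is a generic illustration with a hypothetical function name and is unrelated to the F-SVM software itself.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1 or 0) and classifier scores."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

    The quadratic pairwise loop is adequate for small test folds; a rank-based formulation gives the same value more efficiently on large datasets.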

  14. Tip chip : Subcellular sampling from single cancer cells

    NARCIS (Netherlands)

    Quist, Jos; Sarajlic, Edin; Lai, Stanley C.S.; Lemay, Serge G.

    2016-01-01

    To analyze the molecular content of single cells, cell lysis is typically required, yielding a snapshot of cell behavior only. To follow complex molecular profiles over time, subcellular sampling methods potentially can be used, but to date these methods involve laborious offline analysis. Here we

  15. Building a better fragment library for de novo protein structure prediction.

    Directory of Open Access Journals (Sweden)

    Saulo H P de Oliveira

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, this requires excluding homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources

  16. Building a Better Fragment Library for De Novo Protein Structure Prediction

    Science.gov (United States)

    de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, this requires excluding homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources PMID:25901595

  17. InterMap3D: predicting and visualizing co-evolving protein residues

    DEFF Research Database (Denmark)

    Oliveira, Rodrigo Gouveia; Roque, Francisco Jose Sousa Simôes Almeida; Wernersson, Rasmus

    2009-01-01

    InterMap3D predicts co-evolving protein residues and plots them on the 3D protein structure. Starting with a single protein sequence, InterMap3D automatically finds a set of homologous sequences, generates an alignment and fetches the most similar 3D structure from the Protein Data Bank (PDB). It can also accept a user-generated alignment. Based on the alignment, co-evolving residues are then predicted using three different methods: Row and Column Weighing of Mutual Information, Mutual Information/Entropy and Dependency. Finally, InterMap3D generates high-quality images of the protein...
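
The quantity underlying the mutual-information-based methods named above can be sketched as follows. This is plain mutual information between two alignment columns, not the row-and-column-weighted or entropy-normalized variants that InterMap3D applies; the function name `mutual_information` and the toy columns are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a, col_b: equal-length sequences of residue symbols, one symbol per
    aligned sequence. High values suggest the two positions co-vary.
    """
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Perfectly co-varying columns carry 1 bit; independent columns carry none.
print(mutual_information("AAVV", "LLII"))
print(mutual_information("AAVV", "LILI"))
```

Raw mutual information is known to be inflated by phylogenetic signal and column entropy, which is the usual motivation for corrected variants such as those listed in the abstract.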

  18. Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches

    Directory of Open Access Journals (Sweden)

    Lee Sael

    2010-12-01

    Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.

  19. Binding ligand prediction for proteins using partial matching of local surface patches.

    Science.gov (United States)

    Sael, Lee; Kihara, Daisuke

    2010-01-01

    Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.

  20. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  1. Protein 8-class secondary structure prediction using conditional neural fields.

    Science.gov (United States)

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Sub-cellular localisation of a 15N-labelled peptide vector using NanoSIMS imaging

    Science.gov (United States)

    Römer, Winfried; Wu, Ting-Di; Duchambon, Patricia; Amessou, Mohamed; Carrez, Danièle; Johannes, Ludger; Guerquin-Kern, Jean-Luc

    2006-07-01

    Dynamic SIMS imaging is proposed to map sub-cellular distributions of isotopically labelled, exogenous compounds. NanoSIMS imaging allows the characterisation of the intracellular transport pathways of exogenous molecules, including peptide vectors employed in innovative therapies, using stable isotopes as molecular markers to detect the compound of interest. Shiga toxin B-subunit (STxB) was chosen as a representative peptide vector. The recombinant protein (15N-STxB) was synthesised in Escherichia coli using 15NH4Cl as sole nitrogen source resulting in 15N enrichment in the molecule. Using the NanoSIMS 50 ion microprobe (Cameca), different ion species (12C14N-, 12C15N-, 31P-) originating from the same sputtered micro volume were simultaneously detected. High mass resolving power enabled the discrimination of 12C15N- from its polyatomic isobars of mass 27. We imaged the membrane binding and internalisation of 15N-STxB in HeLa cells at spatial resolutions of less than 100 nm. Thus, the use of rare stable isotopes like 15N with dynamic SIMS imaging permits sub-cellular detection of isotopically labelled, exogenous molecules and imaging of their transport pathways at high mass and spatial resolution. Application of stable isotopes as markers can replace the large and chemically complex tags used for fluorescence microscopy, without altering the chemical and physical properties of the molecule.

  3. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanaswamy

    2010-06-01

    Background: Dengue virus, along with the other members of the Flaviviridae family, has reemerged as a deadly human pathogen. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, due to the low-resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results: In the present analysis, starting from Cα positions of low-resolution cryo-electron microscopy structures, the residue-level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of the virus in different phases of its life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of the maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated the presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of the Flaviviridae family enabled dissecting common mechanisms used for infection by these viruses. Conclusions: Thus, using a computational approach, the present analysis has provided better insights into the preexisting low-resolution structures of virus assemblies, the findings of which can be used in designing effective antivirals against these deadly human pathogens.

  4. Prediction of Protein–Protein Interactions by Evidence Combining Methods

    Directory of Open Access Journals (Sweden)

    Ji-Wei Chang

    2016-11-01

    Most cellular functions involve proteins' features based on their physical interactions with other partner proteins. Sketching a map of protein–protein interactions (PPIs) is therefore an important first step towards understanding the basics of cell functions. Several experimental techniques operating in vivo or in vitro have made significant contributions to screening a large number of protein interaction partners, especially high-throughput experimental methods. However, computational approaches for PPI prediction, supported by rapid accumulation of data generated from experimental techniques, 3D structure definitions, and genome sequencing, have boosted the map sketching of PPIs. In this review, we shed light on in silico PPI prediction methods that integrate evidence from multiple sources, including evolutionary relationship, function annotation, sequence/structure features, network topology and text mining. These methods are developed for integration of multi-dimensional evidence, for designing the strategies to predict novel interactions, and for making the results consistent with the increase of prediction coverage and accuracy.

  5. Acetylation dynamics of human nuclear proteins during the ionizing radiation-induced DNA damage response

    DEFF Research Database (Denmark)

    Bennetzen, Martin; Andersen, J.S.; Lasen, D.H.

    2013-01-01

    Genotoxic insults, such as ionizing radiation (IR), cause DNA damage that evokes a multifaceted cellular DNA damage response (DDR). DNA damage signaling events that control protein activity, subcellular localization, DNA binding, protein-protein interactions, etc. rely heavily on time...

  6. Update on protein structure prediction: results of the 1995 IRBM workshop

    DEFF Research Database (Denmark)

    Hubbard, Tim; Tramontano, Anna; Hansen, Jan

    1996-01-01

    Computational tools for protein structure prediction are of great interest to molecular, structural and theoretical biologists due to a rapidly increasing number of protein sequences with no known structure. In October 1995, a workshop was held at IRBM to predict as much as possible about a numbe...

  7. Update on protein structure prediction: results of the 1995 IRBM workshop

    DEFF Research Database (Denmark)

    Hubbard, Tim; Tramontano, Anna; Hansen, Jan

    1996-01-01

    Computational tools for protein structure prediction are of great interest to molecular, structural and theoretical biologists due to a rapidly increasing number of protein sequences with no known structure. In October 1995, a workshop was held at IRBM to predict as much as possible about a number...

  8. Predicting pKa for proteins using COSMO-RS

    DEFF Research Database (Denmark)

    Andersson, Martin Peter; Jensen, Jan Halborg; Stipp, Susan Louise Svane

    2013-01-01

    We have used the COSMO-RS implicit solvation method to calculate the equilibrium constants, pKa, for deprotonation of the acidic residues of the ovomucoid inhibitor protein, OMTKY3. The root mean square error for comparison with experimental data is only 0.5 pH units and the maximum error 0.8 pH units. The results show that the accuracy of pKa prediction using COSMO-RS is as good for large biomolecules as it is for smaller inorganic and organic acids and that the method compares very well to previous pKa predictions of the OMTKY3 protein using Quantum Mechanics/Molecular Mechanics. Our approach...

  9. Knowledge base and neural network approach for protein secondary structure prediction.

    Science.gov (United States)

    Patel, Maulika S; Mazumdar, Himanshu S

    2014-11-21

    Protein structure prediction is of great relevance given the abundant genomic and proteomic data generated by the genome sequencing projects. Protein secondary structure prediction is addressed as a subtask in determining the protein tertiary structure and function. In this paper, a novel algorithm for protein secondary structure prediction (PSSP), KB-PROSSP-NN, is proposed; it combines a knowledge base with neural networks that model the exceptions in the knowledge base. The knowledge base is derived from a proteomic sequence-structure database and consists of the statistics of association between 5-residue words and the corresponding secondary structure. The predictions obtained using the knowledge base are refined with a backpropagation neural network algorithm, which models the exceptions of the knowledge base. Q3 accuracies of 90% and 82% are achieved on the RS126 and CB396 test sets, respectively, which suggests an improvement over existing state-of-the-art methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
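
The knowledge-base half of such an approach can be sketched as a lookup table from 5-residue words to secondary-structure counts. This is a minimal illustration under stated assumptions, not the KB-PROSSP-NN implementation: it assumes each word is associated with the state of its central residue, falls back to coil ('C') for unseen words and chain ends, and omits the neural-network refinement. The names `build_knowledge_base` and `predict_from_kb` are hypothetical.

```python
from collections import Counter, defaultdict


def build_knowledge_base(training_pairs, k=5):
    """Count secondary-structure states seen at the centre of each k-residue word.

    training_pairs: iterable of (sequence, structure) strings of equal length,
    where structure uses one letter per residue (e.g. H, E, C).
    """
    kb = defaultdict(Counter)
    half = k // 2
    for seq, ss in training_pairs:
        for i in range(half, len(seq) - half):
            word = seq[i - half:i + half + 1]
            kb[word][ss[i]] += 1  # assumption: the word labels its central residue
    return kb


def predict_from_kb(kb, seq, k=5, default="C"):
    """Predict each residue as the most frequent state recorded for its word."""
    half = k // 2
    states = []
    for i in range(len(seq)):
        counts = None
        if half <= i < len(seq) - half:
            counts = kb.get(seq[i - half:i + half + 1])
        states.append(counts.most_common(1)[0][0] if counts else default)
    return "".join(states)


kb = build_knowledge_base([("AAAAAKK", "HHHHHCC")])
print(predict_from_kb(kb, "AAAAA"))
```

Words that are ambiguous or absent from the table are exactly the cases a second-stage model would need to handle, which matches the role the abstract assigns to the neural network.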

  10. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-01-01

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  11. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong

    2018-05-20

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  12. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    Science.gov (United States)

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  13. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

    Science.gov (United States)

    Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2013-01-27

    A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
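
The tree-similarity idea behind mirrortree-style methods can be sketched as a correlation between the inter-species distance matrices of two protein families. The sketch below assumes both matrices are symmetric and ordered over the same species, and uses a plain Pearson correlation of their upper triangles; the accessibility-based filtering of alignment positions studied in the abstract is not modelled. The function names are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b):
    """Correlate the inter-species distances of two protein families.

    Assumes both matrices cover the same species in the same order.
    """
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


family_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
family_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]  # same shape, twice the rate
print(tree_similarity(family_a, family_b))
```

In this framing, restricting the alignment to positions with a given predicted accessibility would change only how the distance matrices are computed, not the correlation step.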

  14. Segmentation and quantification of subcellular structures in fluorescence microscopy images using Squassh.

    Science.gov (United States)

    Rizk, Aurélien; Paul, Grégory; Incardona, Pietro; Bugarski, Milica; Mansouri, Maysam; Niemann, Axel; Ziegler, Urs; Berger, Philipp; Sbalzarini, Ivo F

    2014-03-01

    Detection and quantification of fluorescently labeled molecules in subcellular compartments is a key step in the analysis of many cell biological processes. Pixel-wise colocalization analyses, however, are not always suitable, because they do not provide object-specific information, and they are vulnerable to noise and background fluorescence. Here we present a versatile protocol for a method named 'Squassh' (segmentation and quantification of subcellular shapes), which is used for detecting, delineating and quantifying subcellular structures in fluorescence microscopy images. The workflow is implemented in freely available, user-friendly software. It works on both 2D and 3D images, accounts for the microscope optics and for uneven image background, computes cell masks and provides subpixel accuracy. The Squassh software enables both colocalization and shape analyses. The protocol can be applied in batch, on desktop computers or computer clusters, and it usually requires images, respectively. Basic computer-user skills and some experience with fluorescence microscopy are recommended to successfully use the protocol.

  15. A Kernel for Protein Secondary Structure Prediction

    OpenAIRE

    Guermeur, Yann; Lifchitz, Alain; Vert, Régis

    2004-01-01

    http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10338&mode=toc; Multi-class support vector machines have already proved efficient in protein secondary structure prediction as ensemble methods, to combine the outputs of sets of classifiers based on different principles. In this chapter, their implementation as basic prediction methods, processing the primary structure or the profile of multiple alignments, is investigated. A kernel devoted to the task is in...

  16. I-TASSER server for protein 3D structure prediction

    Directory of Open Access Journals (Sweden)

    Zhang Yang

    2008-01-01

    Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state of the art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. Results: An on-line version of I-TASSER has been developed at the KU Center for Bioinformatics, which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models, with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations. The I-TASSER server is freely available
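
For readers unfamiliar with the TM-score mentioned above, the sketch below evaluates the commonly published formula, TM = (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8, for one fixed superposition. It assumes the per-residue distances d_i (in Å) between aligned model and native residues are already available; the full TM-score additionally maximizes over superpositions, which is not done here. The lower bound of 0.5 Å on d0 for short chains is an assumption of this sketch, and the function name `tm_score_fixed` is hypothetical.

```python
def tm_score_fixed(distances, target_length):
    """TM-score term sum for a single, already-chosen superposition.

    distances:     distances in angstroms between aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumption: clamp the scale for very short chains
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# A perfect model of a 100-residue target scores 1; aligning only half of
# the residues perfectly scores 0.5.
print(tm_score_fixed([0.0] * 100, 100))
print(tm_score_fixed([0.0] * 50, 100))
```

Because every term lies in (0, 1] and the sum is divided by the target length, the result stays within the [0, 1] range quoted in the abstract.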

  17. Improving the accuracy of protein secondary structure prediction using structural alignment

    Directory of Open Access Journals (Sweden)

    Gallin Warren J

    2006-06-01

    Background: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. Results: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. Conclusion: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at http://wishart.biology.ualberta.ca/proteus. For high throughput or batch sequence analyses, the PROTEUS programs
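
The Q3 measure and the idea of combining several predictors can be illustrated with a short sketch. The abstract does not describe how the "jury-of-experts" system weighs its members, so the per-residue majority vote below is an assumed stand-in, not the PROTEUS method; the function names and the toy predictions are likewise illustrative.

```python
from collections import Counter


def q3_score(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def majority_consensus(predictions):
    """Combine equal-length predictions by a per-residue majority vote.

    Ties are resolved in favour of the state proposed by the earliest
    predictor in the list (an arbitrary choice made for this sketch).
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*predictions)
    )


experts = ["HHHC", "HHCC", "HECC"]
consensus = majority_consensus(experts)
print(consensus, q3_score(consensus, "HHHC"))
```

A vote of this kind can only help when the individual predictors make partly independent errors, which is one reason to mix structure-based and sequence-based methods as the abstract describes.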

  18. Incorporating functional inter-relationships into protein function prediction algorithms

    Directory of Open Access Journals (Sweden)

    Kumar Vipin

    2009-05-01

    Background: Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms, for standard classification-based approaches. Results: We propose a method to enhance the performance of classification-based protein function prediction algorithms by addressing the issue of using these inter-relationships between functional classes constituting functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes benefitted most by this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of

  19. Parallel protein secondary structure prediction based on neural networks.

    Science.gov (United States)

    Zhong, Wei; Altun, Gulsah; Tian, Xinmin; Harrison, Robert; Tai, Phang C; Pan, Yi

    2004-01-01

    Protein secondary structure prediction has a fundamental influence on today's bioinformatics research. In this work, binary and tertiary classifiers for protein secondary structure prediction are implemented on the Denoeux belief neural network (DBNN) architecture. A hydrophobicity matrix, an orthogonal matrix, BLOSUM62 and PSSM (position-specific scoring matrix) are tested separately as encoding schemes for DBNN. The experimental results contribute to the design of new encoding schemes. A new binary classifier for Helix versus not Helix (~H) for DBNN achieves a prediction accuracy of 87% when PSSM is used for the input profile. The performance of the DBNN binary classifier is comparable to that of the best other prediction methods. The good test results for binary classifiers open a new approach for protein structure prediction with neural networks. Because training the neural networks is time-consuming, Pthread and OpenMP are employed to parallelize DBNN on the hyperthreading-enabled Intel architecture. The speedup is 4.9 for 16 Pthreads and 4 for 16 OpenMP threads on the 4-processor shared-memory architecture. The speedups of both OpenMP and Pthread are superior to those of other research. With the new parallel training algorithm, thousands of amino acids can be processed in a reasonable amount of time. Our research also shows that hyperthreading technology for Intel architecture is efficient for parallel biological algorithms.

  20. Using support vector machine to predict beta- and gamma-turns in proteins.

    Science.gov (United States)

    Hu, Xiuzhen; Li, Qianzhong

    2008-09-01

    By using the composite vector with increment of diversity, position conservation scoring function, and predictive secondary structures to express the information of sequence, a support vector machine (SVM) algorithm for predicting beta- and gamma-turns in the proteins is proposed. The 426 and 320 nonhomologous protein chains described by Guruprasad and Rajkumar (Guruprasad and Rajkumar J. Biosci 2000, 25,143) are used for training and testing the predictive model of the beta- and gamma-turns, respectively. The overall prediction accuracy and the Matthews correlation coefficient in 7-fold cross-validation are 79.8% and 0.47, respectively, for the beta-turns. The overall prediction accuracy in 5-fold cross-validation is 61.0% for the gamma-turns. These results are significantly higher than the other algorithms in the prediction of beta- and gamma-turns using the same datasets. In addition, the 547 and 823 nonhomologous protein chains described by Fuchs and Alix (Fuchs and Alix Proteins: Struct Funct Bioinform 2005, 59, 828) are used for training and testing the predictive model of the beta- and gamma-turns, and better results are obtained. This algorithm may be helpful to improve the performance of protein turns' prediction. To ensure the ability of the SVM method to correctly classify beta-turn and non-beta-turn (gamma-turn and non-gamma-turn), the receiver operating characteristic threshold independent measure curves are provided. (c) 2008 Wiley Periodicals, Inc.
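
    The overall accuracy and Matthews correlation coefficient (MCC) quoted in this abstract are both derived from a binary confusion matrix. The following is a minimal sketch of that calculation; the function name and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC from binary confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty classes so the ratios stay defined.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for a turn / non-turn classifier.
    print(confusion_metrics(tp=40, fp=10, tn=45, fn=5))
```

    MCC is usually reported next to accuracy when one class is much smaller than the other: a classifier that labels every residue as non-turn can reach a high accuracy while its MCC is 0.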

  1. Plasma proteins predict conversion to dementia from prodromal disease.

    Science.gov (United States)

    Hye, Abdul; Riddoch-Contreras, Joanna; Baird, Alison L; Ashton, Nicholas J; Bazenet, Chantal; Leung, Rufina; Westman, Eric; Simmons, Andrew; Dobson, Richard; Sattlecker, Martina; Lupton, Michelle; Lunnon, Katie; Keohane, Aoife; Ward, Malcolm; Pike, Ian; Zucht, Hans Dieter; Pepin, Danielle; Zheng, Wei; Tunnicliffe, Alan; Richardson, Jill; Gauthier, Serge; Soininen, Hilkka; Kłoszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon

    2014-11-01

    The study aimed to validate previously discovered plasma biomarkers associated with AD, using a design based on imaging measures as surrogate for disease severity and assess their prognostic value in predicting conversion to dementia. Three multicenter cohorts of cognitively healthy elderly, mild cognitive impairment (MCI), and AD participants with standardized clinical assessments and structural neuroimaging measures were used. Twenty-six candidate proteins were quantified in 1148 subjects using multiplex (xMAP) assays. Sixteen proteins correlated with disease severity and cognitive decline. Strongest associations were in the MCI group with a panel of 10 proteins predicting progression to AD (accuracy 87%, sensitivity 85%, and specificity 88%). We have identified 10 plasma proteins strongly associated with disease severity and disease progression. Such markers may be useful for patient selection for clinical trials and assessment of patients with predisease subjective memory complaints. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Enhancing the prediction of protein pairings between interacting families using orthology information

    Directory of Open Access Journals (Sweden)

    Pazos Florencio

    2008-01-01

    Full Text Available. Background: It has repeatedly been shown that interacting protein families tend to have similar phylogenetic trees. These similarities can be used to predict the mapping between two families of interacting proteins (i.e. which proteins from one family interact with which members of the other). The correct mapping will be that which maximizes the similarity between the trees. The two families may eventually comprise orthologs and paralogs, if members of the two families are present in more than one organism. This fact can be exploited to restrict the possible mappings, simply by impeding links between proteins of different organisms. We present here an algorithm to predict the mapping between families of interacting proteins which is able to incorporate information regarding orthologues, or any other assignment of proteins to "classes" that may restrict possible mappings. Results: For the first time in methods for predicting mappings, we have tested this new approach on a large number of interacting protein domains in order to statistically assess its performance. The method accurately predicts around 80% in the most favourable cases. We also analysed in detail the results of the method for a well defined case of interacting families, the sensor and kinase components of the Ntr-type two-component system, for which up to 98% of the pairings predicted by the method were correct. Conclusion: Based on the well established relationship between tree similarity and interactions we developed a method for predicting the mapping between two interacting families using genomic information alone. The program is available through a web interface.
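
    The idea of choosing the mapping that maximizes tree similarity, while forbidding links between proteins of different organisms, can be illustrated with a toy search. The sketch below is not the published algorithm: it assumes that tree similarity is approximated by the Pearson correlation between two pairwise distance matrices, that both families have the same size, and that the search is an exhaustive enumeration, which is only feasible for very small families. All function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def upper_triangle(dist, order):
    """Upper-triangle distances of `dist` after reordering its rows/columns."""
    n = len(order)
    return [dist[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def best_mapping(dist_a, dist_b, org_a, org_b):
    """Return (perm, score), where protein i of family A maps to perm[i] of family B.

    Only mappings that pair proteins from the same organism are considered.
    """
    n = len(dist_a)
    reference = upper_triangle(dist_a, list(range(n)))
    best_perm, best_score = None, -2.0
    for perm in permutations(range(n)):
        # Organism restriction: skip mappings that link different organisms.
        if any(org_a[i] != org_b[perm[i]] for i in range(n)):
            continue
        score = pearson(reference, upper_triangle(dist_b, perm))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score
```

    The organism check prunes the search space: with two organisms of two paralogs each, only 4 of the 24 possible mappings are scored.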

  3. Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction

    Directory of Open Access Journals (Sweden)

    Fofanov Viacheslav Y

    2010-05-01

    Full Text Available. Background: Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. Results: This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. Conclusions: FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated

  4. Computational Prediction of Human Salivary Proteins from Blood Circulation and Application to Diagnostic Biomarker Identification

    Science.gov (United States)

    Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

    2013-01-01

    Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer. PMID:24324552

  5. A large-scale evaluation of computational protein function prediction

    NARCIS (Netherlands)

    Radivojac, P.; Clark, W.T.; Oron, T.R.; Schnoes, A.M.; Wittkop, T.; Kourmpetis, Y.A.I.; Dijk, van A.D.J.; Friedberg, I.

    2013-01-01

    Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be

  6. HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.

    Science.gov (United States)

    Zaman, Rianon; Chowdhury, Shahana Yasmin; Rashid, Mahmood A; Sharma, Alok; Dehzangi, Abdollah; Shatabda, Swakkhar

    2017-01-01

    DNA-binding proteins often play an important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been used to predict DNA-binding proteins. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.
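
    Monogram and bigram features of a sequence profile are commonly built as column averages and as averaged products of adjacent rows. The sketch below shows that construction for an L x 20 profile matrix; the function names and the normalization by sequence length are assumptions made for illustration, and the abstract does not confirm that HMMBinder uses exactly these formulas.

```python
def monogram_features(profile):
    """Mean of each profile column: one feature per amino-acid column."""
    length, width = len(profile), len(profile[0])
    return [sum(row[j] for row in profile) / length for j in range(width)]


def bigram_features(profile):
    """Mean product of column m at position i and column n at position i + 1.

    Produces width * width features (400 for a 20-column profile).
    """
    length, width = len(profile), len(profile[0])
    if length < 2:
        raise ValueError("a bigram needs at least two profile rows")
    features = []
    for m in range(width):
        for n in range(width):
            total = sum(profile[i][m] * profile[i + 1][n] for i in range(length - 1))
            features.append(total / (length - 1))
    return features


def profile_features(profile):
    """Fixed-length vector (monogram + bigram) for a variable-length profile."""
    return monogram_features(profile) + bigram_features(profile)
```

    Whatever the sequence length, a 20-column profile yields 20 + 400 = 420 features, so the vector can be passed directly to a fixed-input classifier such as an SVM.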

  7. HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features

    Directory of Open Access Journals (Sweden)

    Rianon Zaman

    2017-01-01

    Full Text Available. DNA-binding proteins often play an important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been used to predict DNA-binding proteins. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.

  8. Automatic generation of bioinformatics tools for predicting protein-ligand binding sites.

    Science.gov (United States)

    Komiyama, Yusuke; Banno, Masaki; Ueki, Kokoro; Saad, Gul; Shimizu, Kentaro

    2016-03-15

    Predictive tools that model protein-ligand binding on demand are needed to promote ligand research in an innovative drug-design environment. However, it takes considerable time and effort to develop predictive tools that can be applied to individual ligands. An automated production pipeline that can rapidly and efficiently develop user-friendly protein-ligand binding predictive tools would be useful. We developed a system for automatically generating protein-ligand binding predictions. Implementation of this system in a pipeline of Semantic Web technique-based web tools will allow users to specify a ligand and receive the tool within 0.5-1 day. We demonstrated high prediction accuracy for three machine learning algorithms and eight ligands. The source code and web application are freely available for download at http://utprot.net. They are implemented in Python and supported on Linux. Contact: shimizu@bi.a.u-tokyo.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  9. Distance matrix-based approach to protein structure prediction.

    Science.gov (United States)

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix C, that has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\text{cutoff}}$. We have performed spectral decomposition of the distance matrices $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$ and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the
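
    The bound of five nonzero terms in the spectral decomposition follows from a short identity. The derivation below is a sketch under notation introduced here for illustration (it is not taken from the abstract): $x_i \in \mathbb{R}^3$ is the position of residue $i$ relative to the center of mass, $X$ is the $N \times 3$ matrix whose rows are the $x_i$, $\rho$ is the vector with entries $r_i^2 = \lVert x_i \rVert^2$, and $\mathbf{1}$ is the all-ones vector.

```latex
\begin{align}
  r_{ij}^2 &= \lVert x_i - x_j \rVert^2
            = r_i^2 + r_j^2 - 2\, x_i \cdot x_j , \\
  D &= \rho\, \mathbf{1}^T + \mathbf{1}\, \rho^T - 2\, X X^T , \\
  \operatorname{rank}(D) &\le \operatorname{rank}(\rho\, \mathbf{1}^T)
      + \operatorname{rank}(\mathbf{1}\, \rho^T)
      + \operatorname{rank}(X X^T)
      \le 1 + 1 + 3 = 5 .
\end{align}
```

    The first two terms depend only on the squared distances from the center of mass, and the third term carries the three spatial directions of the point set.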

  10. Prediction of Hydrophobic Cores of Proteins Using Wavelet Analysis.

    Science.gov (United States)

    Hirakawa; Kuhara

    1997-01-01

    Information concerning the secondary structures, flexibility, epitope and hydrophobic regions of amino acid sequences can be extracted by assigning physicochemical indices to each amino acid residue, and information on structure can be derived using the sliding window averaging technique, which is in wide use for smoothing out raw functions. Wavelet analysis has shown great potential and applicability in many fields, such as astronomy, radar, earthquake prediction, and signal or image processing. This approach is efficient for removing noise from various functions. Here we employed wavelet analysis to smooth out a plot assigned to a hydrophobicity index for amino acid sequences. We then used the resulting function to predict hydrophobic cores in globular proteins. We calculated the prediction accuracy for the hydrophobic cores of a representative set of 88 proteins. Use of wavelet analysis made feasible the prediction of hydrophobic cores at 6.13% greater accuracy than the sliding window averaging technique.
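
    The sliding window averaging technique that serves as the baseline here can be sketched in a few lines. The example below assumes the Kyte-Doolittle hydropathy scale and a centered window that shrinks at the sequence ends; the abstract does not say which hydrophobicity index or window size the authors used, and the wavelet smoothing itself is not shown.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of index).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Smooth per-residue hydropathy with a centered sliding-window mean.

    Near the ends of the sequence the window is truncated, so every
    residue still receives a value.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    half = window // 2
    profile = []
    for i in range(len(values)):
        segment = values[max(0, i - half):min(len(values), i + half + 1)]
        profile.append(sum(segment) / len(segment))
    return profile


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    for aa, score in zip("KKDEILVLAIVKKDE", hydropathy_profile("KKDEILVLAIVKKDE")):
        print(aa, round(score, 2))
```

    Stretches where the smoothed value stays high are candidate hydrophobic regions; a wider window removes more noise but blurs the boundaries of those regions.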

  11. (PS)2: protein structure prediction server version 3.0.

    Science.gov (United States)

    Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

    2015-07-01

    Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)2 web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)2 server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)2 server is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Predicting protein structures with a multiplayer online game.

    Science.gov (United States)

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

    2010-08-05

    People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

  13. GRIP: A web-based system for constructing Gold Standard datasets for protein-protein interaction prediction

    Directory of Open Access Journals (Sweden)

    Zheng Huiru

    2009-01-01

    Full Text Available. Background: Information about protein interaction networks is fundamental to understanding protein function and cellular processes. Interaction patterns among proteins can suggest new drug targets and aid in the design of new therapeutic interventions. Efforts have been made to map interactions on a proteome-wide scale using both experimental and computational techniques. Reference datasets that contain known interacting proteins (positive cases) and non-interacting proteins (negative cases) are essential to support computational prediction and validation of protein-protein interactions. Information on known interacting and non-interacting proteins is usually stored within databases. Extraction of these data can be both complex and time consuming. Although the automatic construction of reference datasets for classification is a useful resource for researchers, no public resource currently exists to perform this task. Results: GRIP (Gold Reference dataset constructor from Information on Protein complexes) is a web-based system that provides researchers with the functionality to create reference datasets for protein-protein interaction prediction in Saccharomyces cerevisiae. Both positive and negative cases for a reference dataset can be extracted, organised and downloaded by the user. GRIP also provides an upload facility whereby users can submit proteins to determine protein complex membership. A search facility is provided where a user can search for protein complex information in Saccharomyces cerevisiae. Conclusion: GRIP is developed to retrieve information on protein complexes, cellular localisation, and physical and genetic interactions in Saccharomyces cerevisiae. Manual construction of reference datasets can be a time consuming process requiring programming knowledge. GRIP simplifies and speeds up this process by allowing users to automatically construct reference datasets. GRIP is free to access at http://rosalind.infj.ulst.ac.uk/GRIP/.

  14. Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

    NARCIS (Netherlands)

    Al-Shahib, A.; Breitling, R.; Gilbert, D.

    2005-01-01

    Abstract: When the standard approach to predict protein function by sequence homology fails, other alternative methods can be used that require only the amino acid sequence for predicting function. One such approach uses machine learning to predict protein function directly from amino acid sequence

  15. PRmePRed: A protein arginine methylation prediction tool.

    Directory of Open Access Journals (Sweden)

    Pawan Kumar

    Full Text Available. Protein methylation is an important Post-Translational Modification (PTM) of proteins. Arginine methylation carries out and regulates several important biological functions, including gene regulation and signal transduction. Experimental identification of arginine methylation sites is a daunting task as it is costly as well as time and labour intensive. Hence reliable prediction tools play an important role in rapid screening and identification of possible methylation sites in proteomes. Our preliminary assessment using the available prediction methods on collected data yielded unimpressive results. This motivated us to perform a comprehensive data analysis and appraisal of features relevant in the context of biological significance, which led to the development of a prediction tool, PRmePRed, with better performance. PRmePRed performs reasonably well with an accuracy of 84.10%, 82.38% sensitivity, 83.77% specificity, and Matthew's correlation coefficient of 66.20% in 10-fold cross-validation. PRmePRed is freely available at http://bioinfo.icgeb.res.in/PRmePRed/.

  16. Distribution of polycyclic aromatic hydrocarbons in subcellular root tissues of ryegrass (Lolium multiflorum Lam.)

    Science.gov (United States)

    2010-01-01

    Background Because of the increasing quantity and high toxicity to humans of polycyclic aromatic hydrocarbons (PAHs) in the environment, several bioremediation mechanisms and protocols have been investigated to restore PAH-contaminated sites. The transport of organic contaminants among plant cells via tissues and their partition in roots, stalks, and leaves resulting from transpiration and lipid content have been extensively investigated. However, information about PAH distributions in intracellular tissues is lacking, thus limiting the further development of a mechanism-based phytoremediation strategy to improve treatment efficiency. Results Pyrene exhibited higher uptake and was more recalcitrant to metabolism in ryegrass roots than was phenanthrene. The kinetic processes of uptake from ryegrass culture medium revealed that these two PAHs were first adsorbed onto root cell walls, and they then penetrated cell membranes and were distributed in intracellular organelle fractions. At the beginning of uptake (< 50 h), adsorption to cell walls dominated the subcellular partitioning of the PAHs. After 96 h of uptake, the subcellular partition of PAHs approached a stable state in the plant water system, with the proportion of PAH distributed in subcellular fractions being controlled by the lipid contents of each component. Phenanthrene and pyrene primarily accumulated in plant root cell walls and organelles, with about 45% of PAHs in each of these two fractions, and the remainder was retained in the dissolved fraction of the cells. Because of its higher lipophilicity, pyrene displayed greater accumulation factors in subcellular walls and organelle fractions than did phenanthrene. Conclusions Transpiration and the lipid content of root cell fractions are the main drivers of the subcellular partition of PAHs in roots. Initially, PAHs adsorb to plant cell walls, and they then gradually diffuse into subcellular fractions of tissues. The lipid content of intracellular

  17. Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.

    Science.gov (United States)

    Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G

    2016-01-01

    While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged; however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, and membranes. © 2015 The Protein Society.

  18. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng

    2015-12-03

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  19. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng; Hu, ShanShan; Zhang, Jun; Gao, Xin; Li, Jinyan; Xia, Junfeng; Wang, Bing

    2015-01-01

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  20. Expression and analysis of exogenous proteins in epidermal cells.

    Science.gov (United States)

    Dagnino, Lina; Ho, Ernest; Chang, Wing Y

    2010-01-01

    In this chapter we review protocols for transient transfection of primary keratinocytes. The ability to transfect primary epidermal cells regardless of their differentiation status allows the biochemical and molecular characterization of multiple proteins. We review methods to analyze exogenous protein abundance in transfected keratinocytes by immunoblot and immunoprecipitation. We also present protocols to determine the subcellular distribution of these proteins by indirect immunofluorescence microscopy approaches.

  1. Organelle-targeting surface-enhanced Raman scattering (SERS) nanosensors for subcellular pH sensing.

    Science.gov (United States)

    Shen, Yanting; Liang, Lijia; Zhang, Shuqin; Huang, Dianshuai; Zhang, Jing; Xu, Shuping; Liang, Chongyang; Xu, Weiqing

    2018-01-25

    The pH value of subcellular organelles in living cells is a significant parameter in the physiological activities of cells. Its abnormal fluctuations are commonly believed to be associated with cancers and other diseases. Herein, a series of surface-enhanced Raman scattering (SERS) nanosensors with high sensitivity and targeting function was prepared for the quantification and monitoring of pH values in mitochondria, nucleus, and lysosome. The nanosensors were composed of gold nanorods (AuNRs) functionalized with a pH-responsive molecule (4-mercaptopyridine, MPy) and peptides that could specifically deliver the AuNRs to the targeting subcellular organelles. The localization of our prepared nanoprobes in specific organelles was confirmed by super-high resolution fluorescence imaging and bio-transmission electron microscopy (TEM) methods. By the targeting ability, the pH values of the specific organelles can be determined by monitoring the vibrational spectral changes of MPy with different pH values. Compared to the cases of reported lysosome and cytoplasm SERS pH sensors, more accurate pH values of mitochondria and nucleus, which could be two additional intracellular tracers for subcellular microenvironments, were disclosed by this SERS approach, further improving the accuracy of discrimination of related diseases. Our sensitive SERS strategy can also be employed to explore crucial physiological and biological processes that are related to subcellular pH fluctuations.

  2. Supervised maximum-likelihood weighting of composite protein networks for complex prediction

    Directory of Open Access Journals (Sweden)

    Yong Chern Han

    2012-12-01

    Background: Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI) data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions. Furthermore, many transient interactions are detected between proteins that are not from the same complex, while not all proteins from the same complex may actually interact. As a result, predicted complexes often do not match true complexes well, and many true complexes go undetected. Results: We address these challenges by integrating PPI data with other heterogeneous data sources to construct a composite protein network, and using a supervised maximum-likelihood approach to weight each edge based on its posterior probability of belonging to a complex. We then use six different clustering algorithms, and an aggregative clustering strategy, to discover complexes in the weighted network. We test our method on Saccharomyces cerevisiae and Homo sapiens, and show that complex discovery is improved: compared to previously proposed supervised and unsupervised weighting approaches, our method recalls more known complexes, achieves higher precision at all recall levels, and generates novel complexes of greater functional similarity. Furthermore, our maximum-likelihood approach allows learned parameters to be used to visualize and evaluate the evidence of novel predictions, aiding human judgment of their credibility. Conclusions: Our approach integrates multiple data sources with supervised learning to create a weighted composite protein network, and uses six clustering algorithms with an aggregative clustering strategy to

  3. Multi-level machine learning prediction of protein–protein interactions in Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Julian Zubek

    2015-07-01

    Accurate identification of protein–protein interactions (PPI) is the key step in understanding proteins’ biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences; however, only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein–protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein–protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform a 30-fold macro-scale, i.e., protein-level, cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below the 0.6 threshold (recall, precision, AUC). Our results demonstrate that the multi-scale sequence features aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).

  4. Sub-cellular damage by copper in the cnidarian Zoanthus robustus.

    Science.gov (United States)

    Grant, A; Trompf, K; Seung, D; Nivison-Smith, L; Bowcock, H; Kresse, H; Holmes, S; Radford, J; Morrow, P

    2010-09-01

    Sessile organisms may experience chronic exposure to copper that is released into the marine environment from antifoulants and stormwater runoff. We have identified the site of damage caused by copper to the symbiotic cnidarian, Zoanthus robustus (Anthozoa, Hexacorallia). External changes to the zoanthids were apparent when compared with controls. The normally flexible bodies contracted and became rigid. Histological examination of the zoanthid tissue revealed that copper had caused sub-cellular changes to proteins within the extracellular matrix (ECM) of the tubular body. Collagen in the ECM and the internal septa increased in thickness to five and seven times that of controls respectively. The epithelium, which stained for elastin, was also twice as thick and tough to cut, but exposure to copper did not change the total amount of desmosine which is found only in elastin. We conclude that copper stimulated collagen synthesis in the ECM and also caused cross-linking of existing proteins. However, there was no expulsion of the symbiotic algae (Symbiodinium sp.) and no effect on algal pigments or respiration (44, 66 and 110 µg Cu L⁻¹). A decrease in net photosynthesis was observed only at the highest copper concentration (156 µg Cu L⁻¹). These results show that cnidarians may be more susceptible to damage by copper than their symbiotic algae. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  5. HPASubC: A suite of tools for user subclassification of human protein atlas tissue images.

    Science.gov (United States)

    Cornish, Toby C; Chakravarti, Aravinda; Kapoor, Ashish; Halushka, Marc K

    2015-01-01

    The human protein atlas (HPA) is a powerful proteomic tool for visualizing the distribution of protein expression across most human tissues and many common malignancies. The HPA includes immunohistochemically-stained images from tissue microarrays (TMAs) that cover 48 tissue types and 20 common malignancies. The TMA data are used to provide expression information at the tissue, cellular, and occasionally, subcellular level. The HPA also provides subcellular data from confocal immunofluorescence data on three cell lines. Despite the availability of localization data, many unique patterns of cellular and subcellular expression are not documented. To get at this more granular data, we have developed a suite of Python scripts, HPASubC, to aid in subcellular and cell-type-specific classification of HPA images. This method allows the user to download and optimize specific HPA TMA images for review. Then, using a PlayStation-style video game controller, a trained observer can rapidly step through tens of thousands of images to identify patterns of interest. We have successfully used this method to identify 703 endothelial cell (EC) and/or smooth muscle cell (SMC) specific proteins discovered within 49,200 heart TMA images. This list will assist us in subdividing cardiac gene or protein array data into expression by one of the predominant cell types of the myocardium: Myocytes, SMCs or ECs. The opportunity to further characterize unique staining patterns across a range of human tissues and malignancies will accelerate our understanding of disease processes and point to novel markers for tissue evaluation in surgical pathology.

  6. HPASubC: A suite of tools for user subclassification of human protein atlas tissue images

    Science.gov (United States)

    Cornish, Toby C.; Chakravarti, Aravinda; Kapoor, Ashish; Halushka, Marc K.

    2015-01-01

    Background: The human protein atlas (HPA) is a powerful proteomic tool for visualizing the distribution of protein expression across most human tissues and many common malignancies. The HPA includes immunohistochemically-stained images from tissue microarrays (TMAs) that cover 48 tissue types and 20 common malignancies. The TMA data are used to provide expression information at the tissue, cellular, and occasionally, subcellular level. The HPA also provides subcellular data from confocal immunofluorescence data on three cell lines. Despite the availability of localization data, many unique patterns of cellular and subcellular expression are not documented. Materials and Methods: To get at this more granular data, we have developed a suite of Python scripts, HPASubC, to aid in subcellular and cell-type-specific classification of HPA images. This method allows the user to download and optimize specific HPA TMA images for review. Then, using a PlayStation-style video game controller, a trained observer can rapidly step through tens of thousands of images to identify patterns of interest. Results: We have successfully used this method to identify 703 endothelial cell (EC) and/or smooth muscle cell (SMC) specific proteins discovered within 49,200 heart TMA images. This list will assist us in subdividing cardiac gene or protein array data into expression by one of the predominant cell types of the myocardium: Myocytes, SMCs or ECs. Conclusions: The opportunity to further characterize unique staining patterns across a range of human tissues and malignancies will accelerate our understanding of disease processes and point to novel markers for tissue evaluation in surgical pathology. PMID:26167380

  7. NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    DEFF Research Database (Denmark)

    Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

    2010-01-01

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino...

  8. An Improved Method of Predicting Extinction Coefficients for the Determination of Protein Concentration.

    Science.gov (United States)

    Hilario, Eric C; Stern, Alan; Wang, Charlie H; Vargas, Yenny W; Morgan, Charles J; Swartz, Trevor E; Patapoff, Thomas W

    2017-01-01

    Concentration determination is an important method of protein characterization required in the development of protein therapeutics. There are many known methods for determining the concentration of a protein solution, but the easiest to implement in a manufacturing setting is absorption spectroscopy in the ultraviolet region. For typical proteins composed of the standard amino acids, absorption at wavelengths near 280 nm is due to the three amino acid chromophores tryptophan, tyrosine, and phenylalanine in addition to a contribution from disulfide bonds. According to the Beer-Lambert law, absorbance is proportional to concentration and path length, with the proportionality constant being the extinction coefficient. Typically the extinction coefficient of proteins is experimentally determined by measuring a solution absorbance then experimentally determining the concentration, a measurement with some inherent variability depending on the method used. In this study, extinction coefficients were calculated based on the measured absorbance of model compounds of the four amino acid chromophores. These calculated values for an unfolded protein were then compared with an experimental concentration determination based on enzymatic digestion of proteins. The experimentally determined extinction coefficient for the native proteins was consistently found to be 1.05 times the calculated value for the unfolded proteins for a wide range of proteins with good accuracy and precision under well-controlled experimental conditions. The value of 1.05 times the calculated value was termed the predicted extinction coefficient. Statistical analysis shows that the differences between predicted and experimentally determined coefficients are scattered randomly, indicating no systematic bias between the values among the proteins measured. The predicted extinction coefficient was found to be accurate and not subject to the inherent variability of experimental methods. We propose the use of a
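
    The relationship described in this abstract can be sketched in a few lines. This is only an illustration of the Beer-Lambert law (A = ε · c · l) combined with the 1.05 scaling factor quoted above; the function names and the numeric inputs are hypothetical and are not taken from the study.

```python
def predicted_extinction_coefficient(calculated_unfolded: float,
                                     factor: float = 1.05) -> float:
    """Scale the coefficient calculated for the unfolded protein.

    The default factor of 1.05 is the ratio reported in the abstract;
    treating it as a fixed multiplier is the only assumption made here.
    """
    return factor * calculated_unfolded


def concentration_from_absorbance(absorbance: float,
                                  extinction_coefficient: float,
                                  path_length_cm: float = 1.0) -> float:
    """Beer-Lambert law solved for concentration: c = A / (epsilon * l)."""
    if extinction_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (extinction_coefficient * path_length_cm)


# Hypothetical inputs for illustration only (not values from the study).
epsilon_unfolded = 40000.0  # M^-1 cm^-1, calculated for the unfolded protein
epsilon_predicted = predicted_extinction_coefficient(epsilon_unfolded)
concentration_molar = concentration_from_absorbance(0.84, epsilon_predicted, 1.0)
print(f"predicted epsilon: {epsilon_predicted:.0f} M^-1 cm^-1")
print(f"concentration: {concentration_molar:.2e} M")
```

    Keeping the scaling step separate from the Beer-Lambert step makes it easy to substitute an experimentally determined coefficient when one is available.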

  9. Predicting Protein Secondary Structure with Markov Models

    DEFF Research Database (Denmark)

    Fischer, Paul; Larsen, Simon; Thomsen, Claus

    2004-01-01

    ...we are considering here, is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained in the Markov model for this task. Classifications that are purely based on statistical models might not always be biologically meaningful. We present combinatorial methods to incorporate biological background knowledge to enhance the prediction performance.
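
    One simple way to use Markov models for this kind of classification is sketched below. It assumes one first-order Markov chain per structure class, add-one smoothing, and a maximum-likelihood decision; the abstract does not specify the authors' actual model, so every name and the toy training data here are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train_markov_chain(segments):
    """Estimate first-order transition log-probabilities with add-one smoothing."""
    counts = {a: {b: 1 for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for segment in segments:
        for prev, nxt in zip(segment, segment[1:]):
            counts[prev][nxt] += 1
    log_probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        log_probs[prev] = {nxt: math.log(c / total) for nxt, c in row.items()}
    return log_probs


def log_likelihood(model, segment):
    """Sum the transition log-probabilities along the segment."""
    return sum(model[prev][nxt] for prev, nxt in zip(segment, segment[1:]))


def classify(models, segment):
    """Return the class label whose chain gives the segment the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], segment))


# Toy training segments, invented for illustration (not real annotated data).
training = {
    "helix": ["AEELLKKA", "EELLKKAE"],
    "sheet": ["VTVTVYVY", "TVYVTVTV"],
    "coil": ["GPGSGPNG", "PGSGNGPG"],
}
models = {label: train_markov_chain(segs) for label, segs in training.items()}
print(classify(models, "ELLKK"))
```

    The sketch only accepts the 20 standard one-letter residue codes; any other character raises a KeyError. Biological background knowledge of the kind the abstract mentions would have to be added as a separate post-processing step.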

  10. Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction

    Science.gov (United States)

    Hu, Jianfei; Neiswinger, Johnathan; Zhang, Jin; Zhu, Heng; Qian, Jiang

    2015-01-01

    Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as we expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation process. PMID:26393507

  11. Correlation of chemical shifts predicted by molecular dynamics simulations for partially disordered proteins

    Energy Technology Data Exchange (ETDEWEB)

    Karp, Jerome M.; Erylimaz, Ertan; Cowburn, David, E-mail: cowburn@cowburnlab.org, E-mail: David.cowburn@einstein.yu.edu [Albert Einstein College of Medicine of Yeshiva University, Department of Biochemistry (United States)]

    2015-01-15

    There has been a longstanding interest in being able to accurately predict NMR chemical shifts from structural data. Recent studies have focused on using molecular dynamics (MD) simulation data as input for improved prediction. Here we examine the accuracy of chemical shift prediction for intein systems, which have regions of intrinsic disorder. We find that using MD simulation data as input for chemical shift prediction does not consistently improve prediction accuracy over use of a static X-ray crystal structure. This appears to result from the complex conformational ensemble of the disordered protein segments. We show that using accelerated molecular dynamics (aMD) simulations improves chemical shift prediction, suggesting that methods which better sample the conformational ensemble like aMD are more appropriate tools for use in chemical shift prediction for proteins with disordered regions. Moreover, our study suggests that data accurately reflecting protein dynamics must be used as input for chemical shift prediction in order to correctly predict chemical shifts in systems with disorder.

  12. Transient Expression and Cellular Localization of Recombinant Proteins in Cultured Insect Cells.

    Science.gov (United States)

    Fabrick, Jeffrey A; Hull, J Joe

    2017-04-20

    Heterologous protein expression systems are used for the production of recombinant proteins, the interpretation of cellular trafficking/localization, and the determination of the biochemical function of proteins at the sub-organismal level. Although baculovirus expression systems are increasingly used for protein production in numerous biotechnological, pharmaceutical, and industrial applications, nonlytic systems that do not involve viral infection have clear benefits but are often overlooked and underutilized. Here, we describe a method for generating nonlytic expression vectors and transient recombinant protein expression. This protocol allows for the efficient cellular localization of recombinant proteins and can be used to rapidly discern protein trafficking within the cell. We show the expression of four recombinant proteins in a commercially available insect cell line, including two aquaporin proteins from the insect Bemisia tabaci, as well as subcellular marker proteins specific for the cell plasma membrane and for intracellular lysosomes. All recombinant proteins were produced as chimeras with fluorescent protein markers at their carboxyl termini, which allows for the direct detection of the recombinant proteins. The double transfection of cells with plasmids harboring constructs for the genes of interest and a known subcellular marker allows for live cell imaging and improved validation of cellular protein localization.

  13. Studies on the turnover and subcellular localization of membrane gangliosides in cultured neuroblastoma cells

    International Nuclear Information System (INIS)

    Clarke, J.T.; Cook, H.W.; Spence, M.W.

    1985-01-01

    To compare the subcellular distribution of endogenously synthesized and exogenous gangliosides, cultured murine neuroblastoma cells (N1E-115) were incubated in suspension for 22 h in the presence of D-[1-3H]galactose or [3H]GM1 ganglioside, transferred to culture medium containing no radioisotope for periods of up to 72 h, and then subjected to subcellular fractionation and analysis of lipid-sialic acid and radiolabeled ganglioside levels. The results indicated that GM2 and GM3 were the principal gangliosides in the cells with only traces of GM1 and small amounts of disialogangliosides present. About 50% of the endogenously synthesized radiolabelled ganglioside in the four major subcellular membrane fractions studied was recovered from plasma membrane and only 10-15% from the crude mitochondrial membrane fraction. In contrast, 45% of the exogenous [3H]GM1 taken up into the same subcellular membrane fractions was recovered from the crude mitochondrial fraction; less than 15% was localized in the plasma membrane fraction. The results are similar to those obtained from previously reported studies on membrane phospholipid turnover. They suggest that exogenous GM1 ganglioside, like exogenous phosphatidylcholine, does not intermix freely with any quantitatively major pool of endogenous membrane lipid.

  14. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of
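
    The three figures of merit quoted in this abstract (accuracy, F-measure, and MCC) can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and do not reproduce the study's data.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, F-measure and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # MCC is conventionally reported as 0 when any marginal total is zero.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}


# Hypothetical counts for a balanced test set of 200 nucleotides.
metrics = binary_metrics(tp=70, fp=30, tn=70, fn=30)
print(metrics)
```

    On a balanced set such as this one, an accuracy near 70% corresponds to an MCC near 0.4, which is the same regime as the values reported above.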

  15. Immunochemical characterization of the brain glutamate binding protein

    International Nuclear Information System (INIS)

    Roy, S.

    1986-01-01

    A glutamate binding protein (GBP) was purified from bovine and rat brain to near homogeneity. Polyclonal antibodies were raised against this protein. An enzyme-linked immunosorbent assay was used to quantify and determine the specificity of the antibody response. The antibodies were shown to react strongly with bovine brain GBP and the analogous protein from rat brain. The antibodies did not show any crossreactivity with the glutamate metabolizing enzymes glutamate dehydrogenase, glutamine synthetase and glutamyl transpeptidase; however, they crossreacted moderately with glutamate decarboxylase. The antibodies were also used to define the possible physiologic activity of GBP in synaptic membranes. The antibodies were shown (i) to inhibit the excitatory amino-acid stimulation of thiocyanate (SCN) flux, (ii) to have no effect on transport of L-glutamic acid across the synaptic membrane, and (iii) to have no effect on the depolarization-induced release of L-glutamate. When the anti-GBP antibodies were used to localize and quantify the GBP distribution in various subcellular fractions and in brain tissue samples, it was found that the hippocampus had the highest immunoreactivity followed by the cerebral cortex, cerebellar cortex and caudate-putamen. The distribution of immunoreactivity in the subcellular fractions was as follows: synaptic membranes > crude mitochondrial fraction > homogenate > myelin. In conclusion these studies suggest that: (a) the rat brain GBP and the bovine brain GBP are immunologically homologous proteins, (b) there are no structural similarities between the GBP and the glutamate metabolizing enzymes with the exception of glutamate decarboxylase and (c) the subcellular and regional distribution of the GBP immunoreactivity followed a pattern similar to that observed for L-[3H]-binding

  16. Nanodiamond Landmarks for Subcellular Multimodal Optical and Electron Imaging

    Science.gov (United States)

    Zurbuchen, Mark A.; Lake, Michael P.; Kohan, Sirus A.; Leung, Belinda; Bouchard, Louis-S.

    2013-01-01

    There is a growing need for biolabels that can be used in both optical and electron microscopies, are non-cytotoxic, and do not photobleach. Such biolabels could enable targeted nanoscale imaging of sub-cellular structures, and help to establish correlations between conjugation-delivered biomolecules and function. Here we demonstrate a sub-cellular multi-modal imaging methodology that enables localization of inert particulate probes, consisting of nanodiamonds having fluorescent nitrogen-vacancy centers. These are functionalized to target specific structures, and are observable by both optical and electron microscopies. Nanodiamonds targeted to the nuclear pore complex are rapidly localized in electron-microscopy diffraction mode to enable “zooming-in” to regions of interest for detailed structural investigations. Optical microscopies reveal nanodiamonds for in-vitro tracking or uptake-confirmation. The approach is general, works down to the single nanodiamond level, and can leverage the unique capabilities of nanodiamonds, such as biocompatibility, sensitive magnetometry, and gene and drug delivery. PMID:24036840

  17. Structure and function of yeast glutaredoxin 2 depend on postranslational processing and are related to subcellular distribution.

    Science.gov (United States)

    Porras, Pablo; McDonagh, Brian; Pedrajas, Jose Rafael; Bárcena, J Antonio; Padilla, C Alicia

    2010-04-01

    We have previously shown that glutaredoxin 2 (Grx2) from Saccharomyces cerevisiae localizes at 3 different subcellular compartments, cytosol, mitochondrial matrix and outer membrane, as the result of different posttranslational processing of one single gene. Having set the mechanism responsible for this remarkable phenomenon, we have now aimed at defining whether this diversity of subcellular localizations correlates with differences in structure and function of the Grx2 isoforms. We have determined the N-terminal sequence of the soluble mitochondrial matrix Grx2 by mass spectrometry and have determined the exact cleavage site by Mitochondrial Processing Peptidase (MPP). As a consequence of this cleavage, the mitochondrial matrix Grx2 isoform possesses a basic tetrapeptide extension at the N-terminus compared to the cytosolic form. A functional relationship to this structural difference is that mitochondrial Grx2 displays a markedly higher activity in the catalysis of GSSG reduction by the mitochondrial dithiol dihydrolipoamide. We have prepared Grx2 mutants affected on key residues inside the presequence to direct the protein to one single cellular compartment; either the cytosol, the mitochondrial membrane or the matrix and have analyzed their functional phenotypes. Strains expressing Grx2 only in the cytosol are equally sensitive to H2O2 as strains lacking the gene, whereas those expressing Grx2 exclusively in the mitochondrial matrix are more resistant. Mutations on key basic residues drastically affect the cellular fate of the protein, showing that evolutionary diversification of Grx2 structural and functional properties are strictly dependent on the sequence of the targeting signal peptide. Copyright 2009 Elsevier B.V. All rights reserved.

  18. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    Science.gov (United States)

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio-chemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.

  19. Prediction of human protein function according to Gene Ontology categories

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Gupta, Ramneek; Stærfeldt, Hans Henrik

    2003-01-01

    developed a method for prediction of protein function for a subset of classes from the Gene Ontology classification scheme. This subset includes several pharmaceutically interesting categories-transcription factors, receptors, ion channels, stress and immune response proteins, hormones and growth factors...

  20. The Phyre2 web portal for protein modeling, prediction and analysis.

    Science.gov (United States)

    Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael J E

    2015-06-01

    Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.

  1. Predicting protein structures with a multiplayer online game

    OpenAIRE

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran

    2010-01-01

    People exert significant amounts of problem solving effort playing computer games. Simple image- and text-recognition tasks have been successfully crowd-sourced through games, but it is not clear if more complex scientific problems can be similarly solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search sp...

  2. Majority of cellular fatty acid acylated proteins are localized to the cytoplasmic surface of the plasma membrane

    International Nuclear Information System (INIS)

    Wilcox, C.A.; Olson, E.N.

    1987-01-01

    The BC3H1 muscle cell line was previously reported to contain a broad array of fatty acid acylated proteins. Palmitate was shown to be attached to membrane proteins posttranslationally through thiol ester linkages, whereas myristate was attached cotranslationally, or within seconds thereafter, to soluble and membrane-bound proteins through amide linkages. The temporal and subcellular differences between palmitate and myristate acylation suggested that these two classes of acyl proteins might follow different intracellular pathways to distinct subcellular membrane systems or organelles. In this study, the authors examined the subcellular localization of the major fatty acylated proteins in BC3H1 cells. Palmitate-containing proteins were localized to the plasma membrane, but only a subset of myristate-containing proteins was localized to this membrane fraction. The majority of acyl proteins were nonglycosylated and resistant to digestion with extracellular proteases, suggesting that they were not exposed to the external surface of the plasma membrane. Many proteins were, however, digested during incubation of isolated membranes with proteases, which indicates that these proteins face the cytoplasm. Two-dimensional gel electrophoresis of proteins labeled with [3H]palmitate and [3H]myristate revealed that individual proteins were modified by only one of the two fatty acids and did not undergo both N-linked myristylation and ester-linked palmitylation. Together, these results suggest that the majority of cellular acyl proteins are routed to the cytoplasmic surface of the plasma membrane, and they raise the possibility that fatty acid acylation may play a role in intracellular sorting of nontransmembranous, nonglycosylated membrane proteins.

  3. System and methods for predicting transmembrane domains in membrane proteins and mining the genome for recognizing G-protein coupled receptors

    Science.gov (United States)

    Trabanino, Rene J; Vaidehi, Nagarajan; Hall, Spencer E; Goddard, William A; Floriano, Wely

    2013-02-05

    The invention provides computer-implemented methods and apparatus implementing a hierarchical protocol using multiscale molecular dynamics and molecular modeling methods to predict the presence of transmembrane regions in proteins, such as G-Protein Coupled Receptors (GPCR), and protein structural models generated according to the protocol. The protocol features a coarse grain sampling method, such as hydrophobicity analysis, to provide a fast and accurate procedure for predicting transmembrane regions. Methods and apparatus of the invention are useful to screen protein or polynucleotide databases for encoded proteins with transmembrane regions, such as GPCRs.
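    The coarse-grain hydrophobicity analysis mentioned in this record can be illustrated with a sliding-window hydropathy scan. The sketch below is not the patented protocol: it assumes the widely used Kyte-Doolittle hydropathy scale, a 19-residue window and a cutoff of 1.6, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the record does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Return start indices of windows whose mean hydropathy meets the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i for i, value in enumerate(profile) if value >= threshold]
```

    Windows that pass the cutoff are only candidates for membrane-spanning segments; a hierarchical protocol such as the one described would refine them with more detailed modeling.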

  4. MemBrain: An Easy-to-Use Online Webserver for Transmembrane Protein Structure Prediction

    Science.gov (United States)

    Yin, Xi; Yang, Jing; Xiao, Feng; Yang, Yang; Shen, Hong-Bin

    2018-03-01

    Membrane proteins are an important class of proteins embedded in cell membranes that play crucial roles in living organisms, for example as ion channels, transporters, and receptors. Because it is difficult to determine the structure of a membrane protein by wet-lab experiments, accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called MemBrain, whose input is the amino acid sequence. MemBrain consists of specialized modules for predicting transmembrane helices, residue-residue contacts and relative accessible surface area of α-helical membrane proteins. MemBrain achieves a prediction accuracy of 97.9% for A_TMH and 87.1% for A_P, with an N-score of 3.2 ± 3.0 and a C-score of 3.1 ± 2.8. MemBrain-Contact obtains 62%/64.1% prediction accuracy on the training and independent datasets, respectively, for top L/5 contact prediction. MemBrain-Rasa achieves a Pearson correlation coefficient of 0.733 and a mean absolute error of 13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins. The MemBrain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/MemBrain/.
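    The "top L/5" figure quoted for contact prediction is conventionally the fraction of true contacts among the L/5 highest-scoring predicted residue pairs, where L is the sequence length. The sketch below illustrates that convention only; the function name and input layout are hypothetical, and any sequence-separation filter used by the MemBrain authors is not stated in the abstract and is therefore omitted.

```python
def top_l_over_k_precision(scored_pairs, true_contacts, seq_len, k=5):
    """Precision of the top seq_len // k predicted contacts.

    scored_pairs  : iterable of (i, j, score) tuples for residue pairs
    true_contacts : set of (i, j) tuples with i < j
    """
    ranked = sorted(scored_pairs, key=lambda pair: pair[2], reverse=True)
    if not ranked:
        raise ValueError("no predicted pairs supplied")
    n_top = max(1, seq_len // k)      # always evaluate at least one pair
    top = ranked[:n_top]
    hits = sum(
        1 for i, j, _ in top
        if (min(i, j), max(i, j)) in true_contacts
    )
    return hits / len(top)
```

    Dividing by the number of pairs actually evaluated, rather than by L/5, keeps the value meaningful when fewer than L/5 pairs were scored.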

  5. Prediction of localization and interactions of apoptotic proteins

    Directory of Open Access Journals (Sweden)

    Matula Pavel

    2009-07-01

    During apoptosis several mitochondrial proteins are released. Some of them participate in caspase-independent nuclear DNA degradation, especially apoptosis-inducing factor (AIF) and endonuclease G (endoG). Another interesting protein, which was expected to act similarly to AIF because of its high sequence homology with AIF, is AIF-homologous mitochondrion-associated inducer of death (AMID). We studied the structure, cellular localization, and interactions of several proteins in silico and also in cells using fluorescent microscopy. We found the AMID protein to be cytoplasmic, most probably incorporated into the cytoplasmic side of the lipid membranes. Bioinformatic predictions were conducted to analyze the interactions of the studied proteins with each other and with other possible partners. We conducted molecular modeling of proteins with unknown 3D structures. These models were then refined by the MolProbity server and employed in molecular docking simulations of interactions. Our results show data acquired using a combination of modern in silico methods and image analysis to understand the localization, interactions and functions of the proteins AMID, AIF, endonuclease G, and other apoptosis-related proteins.

  6. Subcellular location of the enzymes of purine breakdown in the yeast Candida famata grown on uric acid

    NARCIS (Netherlands)

    Large, Peter J.; Waterham, Hans R.; Veenhuis, Marten

    1990-01-01

    The subcellular location of the enzymes of purine breakdown in the yeast Candida famata, which grows on uric acid as sole carbon and nitrogen source, has been examined by subcellular fractionation methods. Uricase was confirmed as being peroxisomal, but the other three enzymes, allantoinase,

  7. Genome-scale prediction of proteins with long intrinsically disordered regions.

    Science.gov (United States)

    Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

    2014-01-01

    Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized the abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
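    The LDR definition in this record (30 or more consecutive disordered residues) is easy to state in code. The sketch below assumes a per-residue disorder annotation is already available as a sequence of booleans; it illustrates the definition only and is not SLIDER, which predicts LDR-containing proteins directly from sequence features with logistic regression. The function names are hypothetical.

```python
from itertools import groupby


def longest_disordered_run(flags):
    """Length of the longest run of consecutive True (disordered) residues."""
    return max(
        (sum(1 for _ in group) for is_disordered, group in groupby(flags)
         if is_disordered),
        default=0,
    )


def has_ldr(flags, min_len=30):
    """True if the protein contains a long disordered region (LDR)."""
    return longest_disordered_run(flags) >= min_len
```

    A protein with two 29-residue disordered stretches separated by a single ordered residue would not count under this definition, which is why indirect prediction from per-residue disorder calls is sensitive to small errors.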

  8. Pathways and Subcellular Compartmentation of NAD Biosynthesis in Human Cells

    Science.gov (United States)

    Nikiforov, Andrey; Dölle, Christian; Niere, Marc; Ziegler, Mathias

    2011-01-01

    NAD is a vital redox carrier, and its degradation is a key element of important regulatory pathways. NAD-mediated functions are compartmentalized and have to be fueled by specific biosynthetic routes. However, little is known about the different pathways, their subcellular distribution, and regulation in human cells. In particular, the route(s) to generate mitochondrial NAD, the largest subcellular pool, is still unknown. To visualize organellar NAD changes in cells, we targeted poly(ADP-ribose) polymerase activity into the mitochondrial matrix. This activity synthesized immunodetectable poly(ADP-ribose) depending on mitochondrial NAD availability. Based on this novel detector system, detailed subcellular enzyme localizations, and pharmacological inhibitors, we identified extracellular NAD precursors, their cytosolic conversions, and the pathway of mitochondrial NAD generation. Our results demonstrate that, besides nicotinamide and nicotinic acid, only the corresponding nucleosides readily enter the cells. Nucleotides (e.g. NAD and NMN) undergo extracellular degradation resulting in the formation of permeable precursors. These precursors can all be converted to cytosolic and mitochondrial NAD. For mitochondrial NAD synthesis, precursors are converted to NMN in the cytosol. When taken up into the organelles, NMN (together with ATP) serves as substrate of NMNAT3 to form NAD. NMNAT3 was conclusively localized to the mitochondrial matrix and is the only known enzyme of NAD synthesis residing within these organelles. We thus present a comprehensive dissection of mammalian NAD biosynthesis, the groundwork to understand regulation of NAD-mediated processes, and the organismal homeostasis of this fundamental molecule. PMID:21504897

  9. Three-dimensional protein structure prediction: Methods and computational strategies.

    Science.gov (United States)

    Dorn, Márcio; E Silva, Mariel Barbachan; Buriol, Luciana S; Lamb, Luis C

    2014-10-12

    A long-standing problem in structural bioinformatics is to determine the three-dimensional (3-D) structure of a protein when only a sequence of amino acid residues is given. Many computational methodologies and algorithms have been proposed as a solution to the 3-D Protein Structure Prediction (3-D-PSP) problem. These methods can be divided into four main classes: (a) first principle methods without database information; (b) first principle methods with database information; (c) fold recognition and threading methods; and (d) comparative modeling methods and sequence alignment strategies. Deterministic computational techniques, optimization techniques, data mining and machine learning approaches are typically used in the construction of computational solutions for the PSP problem. Our main goal with this work is to review the methods and computational strategies that are currently used in 3-D protein prediction. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

    Science.gov (United States)

    Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

    2017-09-01

    Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state of the art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.

  11. Prediction of antigenic epitopes on protein surfaces by consensus scoring

    Directory of Open Access Journals (Sweden)

    Zhang Chi

    2009-09-01

    Background: Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes linear in sequence. Only a few structure-based epitope prediction algorithms are available and they have not yet shown satisfying performance. Results: We present a new antigen Epitope Prediction method, which uses ConsEnsus Scoring (EPCES) from six different scoring functions: residue epitope propensity, conservation score, side-chain energy score, contact number, surface planarity score, and secondary structure composition. Applied to unbounded antigen structures from an independent test set, EPCES was able to predict antigenic epitopes with 47.8% sensitivity, 69.5% specificity and an AUC value of 0.632. The performance of the method is statistically similar to other published methods. The AUC value of EPCES is higher than the best results of existing algorithms by about 0.034. Conclusion: Our work shows that consensus scoring of multiple features performs better than any single term. The successful prediction is also due to the new score of residue epitope propensity based on atomic solvent accessibility.

  12. Detection of the Mr 110,000 lung resistance-related protein LRP/MVP with monoclonal antibodies.

    Science.gov (United States)

    Schroeijers, A B; Scheffer, G L; Reurs, A W; Pijnenborg, A C; Abbondanza, C; Wiemer, E A; Scheper, R J

    2001-11-01

    The Mr 110,000 lung resistance-related protein (LRP), also termed the major vault protein (MVP), constitutes >70% of subcellular ribonucleoprotein particles called vaults. Overexpression of LRP/MVP and vaults has been linked directly to multidrug resistance (MDR) in cancer cells. Clinically, LRP/MVP expression can be of value to predict response to chemotherapy and prognosis. Monoclonal antibodies (MAbs) against LRP/MVP have played a critical role in determining the relevance of this protein in clinical drug resistance. We compared the applicability of the previously described MAbs LRP-56, LMR-5, LRP, 1027, 1032, and newly isolated MAbs MVP-9, MVP-16, MVP-18, and MVP-37 for the immunodetection of LRP/MVP by immunoblotting analysis and by immunocyto- and histochemistry. The availability of a broader panel of reagents for the specific and sensitive immunodetection of LRP/MVP should greatly facilitate biological and clinical studies of vault-related MDR.

  13. Defining the predicted protein secretome of the fungal wheat leaf pathogen Mycosphaerella graminicola.

    Directory of Open Access Journals (Sweden)

    Alexandre Morais do Amaral

    The Dothideomycete fungus Mycosphaerella graminicola is the causal agent of Septoria tritici blotch, a devastating disease of wheat leaves that causes dramatic decreases in yield. Infection involves an initial extended period of symptomless intercellular colonisation prior to the development of visible necrotic disease lesions. Previous functional genomics and gene expression profiling studies have implicated the production of secreted virulence effector proteins as key facilitators of the initial symptomless growth phase. In order to identify additional candidate virulence effectors, we re-analysed and catalogued the predicted protein secretome of M. graminicola isolate IPO323, which is currently regarded as the reference strain for this species. We combined several bioinformatic approaches in order to increase the probability of identifying truly secreted proteins with either a predicted enzymatic function or an as yet unknown function. An initial secretome of 970 proteins was predicted, whilst further stringent selection criteria predicted 492 proteins. Of these, 321 possess some functional annotation, the composition of which may reflect the strictly intercellular growth habit of this pathogen, leaving 171 with no functional annotation. This analysis identified a protein family encoding secreted peroxidases/chloroperoxidases (PF01328) which is expanded within all members of the family Mycosphaerellaceae. Further analyses were done on the non-annotated proteins for size and cysteine content (effector protein hallmarks), and then by studying the distribution of homologues in 17 other sequenced Dothideomycete fungi within an overall total of 91 predicted proteomes from fungal, oomycete and nematode species. This detailed M. graminicola secretome analysis provides the basis for further functional and comparative genomics studies.

  14. On the role of electrostatics on protein-protein interactions

    Science.gov (United States)

    Zhang, Zhe; Witham, Shawn; Alexov, Emil

    2011-01-01

    The role of electrostatics on protein-protein interactions and binding is reviewed in this article. A brief outline of the computational modeling, in the framework of continuum electrostatics, is presented and basic electrostatic effects occurring upon the formation of the complex are discussed. The role of the salt concentration and pH of the water phase on protein-protein binding free energy is demonstrated and indicates that the increase of the salt concentration tends to weaken the binding, an observation that is attributed to the optimization of the charge-charge interactions across the interface. It is pointed out that the pH-optimum (pH of optimal binding affinity) varies among the protein-protein complexes, and perhaps is a result of their adaptation to particular subcellular compartment. At the end, the similarities and differences between hetero- and homo-complexes are outlined and discussed with respect to the binding mode and charge complementarity. PMID:21572182

  15. An automated decision-tree approach to predicting protein interaction hot spots.

    Science.gov (United States)

    Darnell, Steven J; Page, David; Mitchell, Julie C

    2007-09-01

    Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface. © 2007 Wiley-Liss, Inc.

  16. Changes in predicted protein disorder tendency may contribute to disease risk

    Directory of Open Access Journals (Sweden)

    Hu Yang

    2011-12-01

    Background: Recent studies suggest that many proteins or regions of proteins lack 3D structure. Defined as intrinsically disordered proteins, these proteins/peptides are functionally important. Recent advances in next generation sequencing technologies enable genome-wide identification of novel nucleotide variations in a specific population or cohort. Results: Using the exonic single nucleotide variations (SNVs) identified in the 1,000 Genomes Project and distributed by the Genetic Analysis Workshop 17 (GAW17), we systematically analysed the genetic and predicted disorder potential features of the non-synonymous variations. The results suggest that a significant change in the tendency of a protein region to be structured or disordered caused by SNVs may lead to malfunction of such a protein and contribute to disease risk. Conclusions: After validation with functional SNVs on the traits distributed by GAW17, we conclude that it is valuable to consider structure/disorder tendencies while prioritizing and predicting mechanistic effects arising from novel genetic variations.

  17. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

    Directory of Open Access Journals (Sweden)

    Lina Zhang

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential as therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief method combined with IFS (Incremental Feature Selection) is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
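    The four measures reported in this record follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from raw counts."""
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

    As an illustration, a hypothetical balanced test set with 95 true positives, 5 false negatives, 93 true negatives and 7 false positives gives a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94 and an MCC of about 0.88.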

  18. The subcellular localization of natural 210Po in the hepatopancreas of the rock lobster (Jasus lalandii)

    International Nuclear Information System (INIS)

    Heyraud, M.; Dowdle, E.B.; Cherry, R.D.

    1987-01-01

    The subcellular localization of the naturally occurring nuclide 210Po in the hepatopancreas of the South African rock lobster, Jasus lalandii, has been studied using centrifugation, ultrafiltration and chromatography. Just over half of the 210Po was found to be associated with a component in the microsomal pellet. Most of the 210Po was tightly bound to a component of high molecular mass. Dissociation of the 210Po from this component required incubation with sulphydryl-reducing reagents, after which the 210Po appeared to associate with a fraction having a molecular mass of 1500 daltons or less. A search for negatively-charged, hydrophobic, sulphur-containing membrane proteins which bind 210Po is suggested.

  19. Subcellular localization of natural 210Po in the hepatopancreas of the rock lobster (Jasus lalandii)

    Energy Technology Data Exchange (ETDEWEB)

    Heyraud, M; Dowdle, E B; Cherry, R D

    1987-01-01

    The subcellular localization of the naturally occurring nuclide 210Po in the hepatopancreas of the South African rock lobster, Jasus lalandii, has been studied using centrifugation, ultrafiltration and chromatography. Just over half of the 210Po was found to be associated with a component in the microsomal pellet. Most of the 210Po was tightly bound to a component of high molecular mass. Dissociation of the 210Po from this component required incubation with sulphydryl-reducing reagents, after which the 210Po appeared to associate with a fraction having a molecular mass of 1500 daltons or less. A search for negatively-charged, hydrophobic, sulphur-containing membrane proteins which bind 210Po is suggested.

  20. Characterization and subcellular compartmentation of recombinant 4-hydroxyphenylpyruvate dioxygenase from Arabidopsis in transgenic tobacco.

    Science.gov (United States)

    Garcia, I; Rodgers, M; Pepin, R; Hsieh, T F; Matringe, M

    1999-04-01

    4-Hydroxyphenylpyruvate dioxygenase (4HPPD) catalyzes the formation of homogentisate (2,5-dihydroxyphenylacetate) from p-hydroxyphenylpyruvate and molecular oxygen. In plants this enzyme activity is involved in two distinct metabolic processes, the biosynthesis of prenylquinones and the catabolism of tyrosine. We report here the molecular and biochemical characterization of an Arabidopsis 4HPPD and the compartmentation of the recombinant protein in chlorophyllous tissues. We isolated a 1508-bp cDNA with one large open reading frame of 1338 bp. Southern analysis strongly suggested that this Arabidopsis 4HPPD is encoded by a single-copy gene. We investigated the biochemical characteristics of this 4HPPD by overproducing the recombinant protein in Escherichia coli JM105. The subcellular localization of the recombinant 4HPPD in chlorophyllous tissues was examined by overexpressing its complete coding sequence in transgenic tobacco (Nicotiana tabacum), using Agrobacterium tumefaciens transformation. We performed western analyses for the immunodetection of protein extracts from purified chloroplasts and total leaf extracts and for the immunocytochemistry on tissue sections. These analyses clearly revealed that 4HPPD was confined to the cytosol compartment, not targeted to the chloroplast. Western analyses confirmed the presence of a cytosolic form of 4HPPD in cultured green Arabidopsis cells.

  1. QuaBingo: A Prediction System for Protein Quaternary Structure Attributes Using Block Composition

    Directory of Open Access Journals (Sweden)

    Chi-Hua Tung

    2016-01-01

    Background. Quaternary structures of proteins are closely related to gene regulation, signal transduction, and many other biological functions of proteins. In the current study, a new feature extraction method based on protein-conserved motif composition in block format, termed block composition, is proposed. Results. QuaBingo, a prediction system for protein quaternary assembly states that combines blocks with functional domain composition, is constructed from three layers of classifiers that can categorize the quaternary structural attributes of monomer, homooligomer, and heterooligomer. The first-layer classifier uses support vector machines (SVMs) based on blocks and functional domains of proteins, and a second-layer SVM processes the outputs of the first layer. Finally, the result is determined by the Random Forest of the third layer. We compared the effectiveness of combinations of block composition, functional domain composition, and pseudoamino acid composition in the model. Across 11 functional protein families, QuaBingo achieves a Matthews Correlation Coefficient (MCC) 23% higher than the existing prediction system. The results also revealed the biological characterization of the top five block compositions. Conclusions. QuaBingo provides better predictive ability for predicting the quaternary structural attributes of proteins.

  2. HPASubC: A suite of tools for user subclassification of human protein atlas tissue images

    Directory of Open Access Journals (Sweden)

    Toby C Cornish

    2015-01-01

    Background: The human protein atlas (HPA) is a powerful proteomic tool for visualizing the distribution of protein expression across most human tissues and many common malignancies. The HPA includes immunohistochemically-stained images from tissue microarrays (TMAs) that cover 48 tissue types and 20 common malignancies. The TMA data are used to provide expression information at the tissue, cellular, and occasionally, subcellular level. The HPA also provides subcellular data from confocal immunofluorescence data on three cell lines. Despite the availability of localization data, many unique patterns of cellular and subcellular expression are not documented. Materials and Methods: To get at this more granular data, we have developed a suite of Python scripts, HPASubC, to aid in subcellular and cell-type specific classification of HPA images. This method allows the user to download and optimize specific HPA TMA images for review. Then, using a playstation-style video game controller, a trained observer can rapidly step through tens of thousands of images to identify patterns of interest. Results: We have successfully used this method to identify 703 endothelial cell (EC) and/or smooth muscle cell (SMC) specific proteins discovered within 49,200 heart TMA images. This list will assist us in subdividing cardiac gene or protein array data into expression by one of the predominant cell types of the myocardium: myocytes, SMCs or ECs. Conclusions: The opportunity to further characterize unique staining patterns across a range of human tissues and malignancies will accelerate our understanding of disease processes and point to novel markers for tissue evaluation in surgical pathology.

  3. Loss of Niemann-Pick C1 or C2 protein results in similar biochemical changes suggesting that these proteins function in a common lysosomal pathway.

    Directory of Open Access Journals (Sweden)

    Sayali S Dixit

    Full Text Available Niemann-Pick Type C (NPC) disease is a lysosomal storage disorder characterized by accumulation of unesterified cholesterol and other lipids in the endolysosomal system. NPC disease results from a defect in either of two distinct cholesterol-binding proteins: a transmembrane protein, NPC1, and a small soluble protein, NPC2. NPC1 and NPC2 are thought to function closely in the export of lysosomal cholesterol, with both proteins binding cholesterol in vitro, but they may have unrelated lysosomal roles. To investigate this possibility, we compared the biochemical consequences of the loss of either protein. Analyses of lysosome-enriched subcellular fractions from brain and liver revealed similar decreases in buoyant densities of lysosomes from NPC1- or NPC2-deficient mice compared to controls. The subcellular distribution of both proteins was similar and paralleled a lysosomal marker. In liver, absence of either NPC1 or NPC2 resulted in similar alterations in the carbohydrate processing of the lysosomal protease, tripeptidyl peptidase I. These results highlight biochemical alterations in the lysosomal system of the NPC-mutant mice that appear secondary to lipid storage. In addition, the similarity in biochemical phenotypes resulting from either NPC1 or NPC2 deficiency supports models in which the functions of these two proteins within lysosomes are closely linked.

  4. Prediction of protein post-translational modifications: main trends and methods

    Science.gov (United States)

    Sobolev, B. N.; Veselovsky, A. V.; Poroikov, V. V.

    2014-02-01

    The review summarizes the main trends in the development of methods for the prediction of protein post-translational modifications (PTMs) by considering the three most common types of PTMs: phosphorylation, acetylation and glycosylation. Considerable attention is given to general characteristics of regulatory interactions associated with PTMs. Different approaches to the prediction of PTMs are analyzed. Most of the methods are based only on the analysis of the neighbouring environment of modification sites. The related software is characterized by relatively low accuracy of PTM predictions, which may be due both to the incompleteness of training data and to the features of PTM regulation. Advantages and limitations of the phylogenetic approach are considered. The prediction of PTMs using data on regulatory interactions, including the modular organization of interacting proteins, is a promising field, provided that more carefully selected training data are used. The bibliography includes 145 references.

  5. Cascaded bidirectional recurrent neural networks for protein secondary structure prediction.

    Science.gov (United States)

    Chen, Jinmiao; Chaudhari, Narendra

    2007-01-01

    Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. We also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on a cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against two other BRNN architectures, namely the original BRNN architecture used for speech recognition and Pollastri's BRNN, which was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38% and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
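
    For readers unfamiliar with the Q3 figure quoted above, it is the percentage of residues whose three-state label (helix, strand or coil) is predicted correctly. A minimal sketch follows; the single-letter state strings are hypothetical examples, not data from the paper, and the segment-based SOV score is not reproduced here.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E or C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Hypothetical per-residue states: H = helix, E = strand, C = coil.
observed = "CHHHHHCCEEEEC"
predicted = "CHHHHCCCEEECC"
print(f"Q3 = {q3_accuracy(predicted, observed):.2f}%")
```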

  6. Unraveling 14-3-3 proteins in C4 panicoids with emphasis on model plant Setaria italica reveals phosphorylation-dependent subcellular localization of RS splicing factor.

    Directory of Open Access Journals (Sweden)

    Karunesh Kumar

    Full Text Available 14-3-3 proteins are a large multigenic family of regulatory proteins ubiquitously found in eukaryotes. In plants, 14-3-3 proteins are reported to play a significant role in both development and response to stress stimuli. Therefore, considering their importance, genome-wide analyses have been performed in many plants including Arabidopsis, rice and soybean. To date, however, no comprehensive investigation has been conducted in any C4 panicoid crops. In view of this, the present study was performed to identify 8, 5 and 26 potential 14-3-3 gene family members in foxtail millet (Si14-3-3), sorghum (Sb14-3-3) and maize (Zm14-3-3), respectively. In silico characterization revealed large variations in their gene structures; segmental and tandem duplications have played a major role in expansion of these genes in foxtail millet and maize. Gene ontology annotation showed the participation of 14-3-3 proteins in diverse biological processes and molecular functions, and in silico expression profiling indicated their higher expression in all the investigated tissues. Comparative mapping was performed to derive the orthologous relationships between 14-3-3 genes of foxtail millet and other Poaceae members, which showed a higher, as well as similar percentage of orthology among these crops. Expression profiling of Si14-3-3 genes during different time-points of abiotic stress and hormonal treatments showed a differential expression pattern of these genes, and sub-cellular localization studies revealed the site of action of Si14-3-3 proteins within the cells. Further downstream characterization indicated the interaction of Si14-3-3 with a nucleocytoplasmic shuttling phosphoprotein (SiRSZ21A) in a phosphorylation-dependent manner, and this demonstrates that Si14-3-3 might regulate the splicing events by binding with phosphorylated SiRSZ21A. Taken together, the present study is a comprehensive analysis of 14-3-3 gene family members in foxtail millet, sorghum and maize

  7. Unraveling 14-3-3 proteins in C4 panicoids with emphasis on model plant Setaria italica reveals phosphorylation-dependent subcellular localization of RS splicing factor.

    Science.gov (United States)

    Kumar, Karunesh; Muthamilarasan, Mehanathan; Bonthala, Venkata Suresh; Roy, Riti; Prasad, Manoj

    2015-01-01

    14-3-3 proteins are a large multigenic family of regulatory proteins ubiquitously found in eukaryotes. In plants, 14-3-3 proteins are reported to play a significant role in both development and response to stress stimuli. Therefore, considering their importance, genome-wide analyses have been performed in many plants including Arabidopsis, rice and soybean. To date, however, no comprehensive investigation has been conducted in any C4 panicoid crops. In view of this, the present study was performed to identify 8, 5 and 26 potential 14-3-3 gene family members in foxtail millet (Si14-3-3), sorghum (Sb14-3-3) and maize (Zm14-3-3), respectively. In silico characterization revealed large variations in their gene structures; segmental and tandem duplications have played a major role in expansion of these genes in foxtail millet and maize. Gene ontology annotation showed the participation of 14-3-3 proteins in diverse biological processes and molecular functions, and in silico expression profiling indicated their higher expression in all the investigated tissues. Comparative mapping was performed to derive the orthologous relationships between 14-3-3 genes of foxtail millet and other Poaceae members, which showed a higher, as well as similar percentage of orthology among these crops. Expression profiling of Si14-3-3 genes during different time-points of abiotic stress and hormonal treatments showed a differential expression pattern of these genes, and sub-cellular localization studies revealed the site of action of Si14-3-3 proteins within the cells. Further downstream characterization indicated the interaction of Si14-3-3 with a nucleocytoplasmic shuttling phosphoprotein (SiRSZ21A) in a phosphorylation-dependent manner, and this demonstrates that Si14-3-3 might regulate the splicing events by binding with phosphorylated SiRSZ21A. Taken together, the present study is a comprehensive analysis of 14-3-3 gene family members in foxtail millet, sorghum and maize, which provides

  8. Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

    Science.gov (United States)

    Youngs, Noah; Penfold-Brown, Duncan; Drew, Kevin; Shasha, Dennis; Bonneau, Richard

    2013-05-01

    Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. Code and Data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html

  9. Protein mislocalization: mechanisms, functions and clinical applications in cancer

    Science.gov (United States)

    Wang, Xiaohong; Li, Shulin

    2014-01-01

    The changes from normal cells to cancer cells are primarily regulated by genome instability, which fosters hallmark functions of cancer through multiple mechanisms including protein mislocalization. Mislocalization of these proteins, including oncoproteins, tumor suppressors, and other cancer-related proteins, can interfere with normal cellular function and cooperatively drive tumor development and metastasis. This review describes the cancer-related effects of protein subcellular mislocalization, the related mislocalization mechanisms, and the potential application of this knowledge to cancer diagnosis, prognosis, and therapy. PMID:24709009

  10. Prequels to Synthetic Biology: From Candidate Gene Identification and Validation to Enzyme Subcellular Localization in Plant and Yeast Cells.

    Science.gov (United States)

    Foureau, E; Carqueijeiro, I; Dugé de Bernonville, T; Melin, C; Lafontaine, F; Besseau, S; Lanoue, A; Papon, N; Oudin, A; Glévarec, G; Clastre, M; St-Pierre, B; Giglioli-Guivarc'h, N; Courdavault, V

    2016-01-01

    Natural compounds extracted from microorganisms or plants constitute an inexhaustible source of valuable molecules whose supply can be potentially challenged by limitations in biological sourcing. The recent progress in synthetic biology, combined with the increasing access to extensive transcriptomics and genomics data, now provides new alternatives to produce these molecules by transferring their whole biosynthetic pathway into heterologous production platforms such as yeasts or bacteria. While the generation of high-titer producing strains remains per se an arduous field of investigation, elucidation of the biosynthetic pathways as well as characterization of their complex subcellular organization are essential prequels to the efficient development of such bioengineering approaches. Using examples from plants and yeasts as a framework, we describe potent methods to rationalize the study of partially characterized pathways, including the basics of computational applications to identify candidate genes in transcriptomics data and the validation of their function by an improved procedure of virus-induced gene silencing mediated by direct DNA transfer to get around possible resistance to Agrobacterium-delivery of viral vectors. To identify potential alterations of biosynthetic fluxes resulting from enzyme mislocalizations in reconstituted pathways, we also detail protocols aiming at characterizing subcellular localizations of proteins in plant cells by expression of fluorescent protein fusions through biolistic-mediated transient transformation, and localization of transferred enzymes in yeast using similar fluorescence procedures. Albeit initially developed for the Madagascar periwinkle, these methods may be applied to other plant species or organisms in order to establish synthetic biology platforms. © 2016 Elsevier Inc. All rights reserved.

  11. Quantitative analysis and prediction of curvature in leucine-rich repeat proteins.

    Science.gov (United States)

    Hindle, K Lauren; Bella, Jordi; Lovell, Simon C

    2009-11-01

    Leucine-rich repeat (LRR) proteins form a large and diverse family. They have a wide range of functions most of which involve the formation of protein-protein interactions. All known LRR structures form curved solenoids, although there is large variation in their curvature. It is this curvature that determines the shape and dimensions of the inner space available for ligand binding. Unfortunately, large-scale parameters such as the overall curvature of a protein domain are extremely difficult to predict. Here, we present a quantitative analysis of determinants of curvature of this family. Individual repeats typically range in length between 20 and 30 residues and have a variety of secondary structures on their convex side. The observed curvature of the LRR domains correlates poorly with the lengths of their individual repeats. We have, therefore, developed a scoring function based on the secondary structure of the convex side of the protein that allows prediction of the overall curvature with a high degree of accuracy. We also demonstrate the effectiveness of this method in selecting a suitable template for comparative modeling. We have developed an automated, quantitative protocol that can be used to predict accurately the curvature of leucine-rich repeat proteins of unknown structure from sequence alone. This protocol is available as an online resource at http://www.bioinf.manchester.ac.uk/curlrr/.

  12. A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize.

    Directory of Open Access Journals (Sweden)

    Matt Geisler

    2015-06-01

    Full Text Available Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where maize orthologs existed for both partners of an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6,004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.
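
    The interolog rule described in this abstract (an interaction is predicted in maize when both partners of a reference interaction have maize orthologs) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the protein identifiers and the data layout are hypothetical and do not reflect the actual PiZeaM pipeline.

```python
from itertools import product

def predict_interologs(ref_interactions, orthologs):
    """Transfer interactions from a reference species via ortholog mappings.

    ref_interactions: iterable of (protein_a, protein_b) pairs in the reference.
    orthologs: dict mapping a reference protein to a set of target orthologs
               (sets with several members model many-to-many orthology).
    Returns a set of unordered target pairs, so duplicates collapse.
    """
    predicted = set()
    for a, b in ref_interactions:
        # A pair is transferred only when BOTH partners have orthologs.
        for ma, mb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            predicted.add(frozenset((ma, mb)))
    return predicted

# Hypothetical identifiers, for illustration only.
ref_interactions = [("RefA", "RefB"), ("RefB", "RefC")]
orthologs = {"RefA": {"Zm1"}, "RefB": {"Zm2", "Zm3"}}  # RefC has no ortholog
for pair in sorted(tuple(sorted(p)) for p in predict_interologs(ref_interactions, orthologs)):
    print(pair)
```

    Storing each pair as a frozenset keeps interactions unordered, so the same maize pair inferred from several reference species is counted once, which matches the idea of reporting unique interactions.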

  13. Early spatiotemporal-specific changes in intermediate signals are predictive of cytotoxic sensitivity to TNFα and co-treatments

    Science.gov (United States)

    Loo, Lit-Hsin; Bougen-Zhukov, Nicola Michelle; Tan, Wei-Ling Cecilia

    2017-03-01

    Signaling pathways can generate different cellular responses to the same cytotoxic agents. Current quantitative models for predicting these differential responses are usually based on large numbers of intracellular gene products or signals at different levels of signaling cascades. Here, we report a study to predict cellular sensitivity to tumor necrosis factor alpha (TNFα) using high-throughput cellular imaging and machine-learning methods. We measured and compared 1170 protein phosphorylation events in a panel of human lung cancer cell lines based on different signals, subcellular regions, and time points within one hour of TNFα treatment. We found that two spatiotemporal-specific changes in an intermediate signaling protein, p90 ribosomal S6 kinase (RSK), are sufficient to predict the TNFα sensitivity of these cell lines. Our models could also predict the combined effects of TNFα and other kinase inhibitors, many of which are not known to target RSK directly. Therefore, early spatiotemporal-specific changes in intermediate signals are sufficient to represent the complex cellular responses to these perturbations. Our study provides a general framework for the development of rapid, signaling-based cytotoxicity screens that may be used to predict cellular sensitivity to a cytotoxic agent, or identify co-treatments that may sensitize or desensitize cells to the agent.

  14. Improved hybrid optimization algorithm for 3D protein structure prediction.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve the search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and gives full play to the advantages of each algorithm. The method is validated on the current standard benchmark sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed new method outperforms single algorithms in the accuracy of the calculated protein sequence energy value, which shows it to be an effective way to predict the structure of proteins.
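
    To make the objective being optimized more concrete, the sketch below evaluates the energy of a conformation under the widely used AB form of the toy off-lattice model, together with the usual Fibonacci benchmark sequences. That the paper uses exactly this energy form is an assumption here, and the function names are hypothetical; the PSO, GA and TS search steps themselves are not reproduced.

```python
import math

def fibonacci_sequence(n):
    """n-th Fibonacci benchmark string: S0 = 'A', S1 = 'B', S(i+1) = S(i-1) + S(i)."""
    prev, cur = "A", "B"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur

def ab_energy(coords, seq):
    """Energy of an AB-model chain (assumed form): bending plus pair terms.

    coords: list of (x, y, z) residue positions, one per character of seq.
    seq: string of 'A' (hydrophobic) and 'B' (hydrophilic) residues.
    """
    n = len(seq)
    xi = [1.0 if r == "A" else -1.0 for r in seq]
    bonds = [tuple(b - a for a, b in zip(coords[i], coords[i + 1])) for i in range(n - 1)]
    bend = 0.0
    for u, v in zip(bonds, bonds[1:]):
        cos_theta = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
        bend += 0.25 * (1.0 - cos_theta)  # zero for a straight chain
    pair = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):  # non-adjacent residues only
            r = math.dist(coords[i], coords[j])
            c = 0.125 * (1.0 + xi[i] + xi[j] + 5.0 * xi[i] * xi[j])
            pair += 4.0 * (r ** -12 - c * r ** -6)
    return bend + pair

seq = fibonacci_sequence(6)  # 13-residue benchmark string
straight = [(float(i), 0.0, 0.0) for i in range(len(seq))]
print(seq, ab_energy(straight, seq))
```

    A hybrid optimizer of the kind described would search over chain conformations (for example, bend and torsion angles with unit bond lengths) to minimize this value.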

  15. Predicting binding within disordered protein regions to structurally characterised peptide-binding domains.

    Directory of Open Access Journals (Sweden)

    Waqasuddin Khan

    Full Text Available Disordered regions of proteins often bind to structured domains, mediating interactions within and between proteins. However, it is difficult to identify a priori the short disordered regions involved in binding. We set out to determine if docking such peptide regions to peptide-binding domains would assist in these predictions. We assembled a redundancy-reduced dataset of SLiM (Short Linear Motif) containing proteins from the ELM database. We selected 84 sequences which had an associated PDB structure showing the SLiM bound to a protein receptor, where the SLiM was found within a 50-residue region of the protein sequence which was predicted to be disordered. First, we investigated the Vina docking scores of overlapping tripeptides from the 50-residue SLiM-containing disordered regions of the protein sequence to the corresponding PDB domain. We found only weak discrimination of docking scores between peptides involved in binding and adjacent non-binding peptides in this context (AUC 0.58). Next, we trained a bidirectional recurrent neural network (BRNN) using as input the protein sequence, predicted secondary structure, Vina docking score and predicted disorder score. The results were very promising (AUC 0.72), showing that multiple sources of information can be combined to produce results which are clearly superior to any single source. We conclude that the Vina docking score alone has only modest power to define the location of a peptide within a larger protein region known to contain it. However, combining this information with other knowledge (using machine learning methods) clearly improves the identification of peptide binding regions within a protein sequence. This approach combining docking with machine learning is primarily a predictor of binding to peptide-binding sites, and is not intended as a predictor of specificity of binding to particular receptors.
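
    Two steps in this abstract lend themselves to a small illustration: cutting a region into overlapping tripeptides, and summarizing how well a score separates binding from non-binding peptides as an AUC. The sketch below is a generic, assumption-laden example; the sequence and scores are invented, and no docking is performed.

```python
def overlapping_peptides(seq, k=3):
    """All overlapping k-mers of seq, stepping one residue at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. This equals the area under the ROC curve.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical disordered region and docking scores, for illustration only.
print(overlapping_peptides("PPLPSR"))
binding = [-6.1, -5.4]          # scores of tripeptides inside the motif
nonbinding = [-5.8, -4.9, -4.2]  # scores of adjacent tripeptides
# Docking scores are typically "more negative is better", so negate them
# so that a higher value means a stronger predicted binder.
print(auc([-s for s in binding], [-s for s in nonbinding]))
```

    An AUC of 0.5 corresponds to random ranking, so the reported 0.58 for docking scores alone indicates weak separation, while 0.72 for the combined model indicates a clearer one.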

  16. Prediction and analysis of beta-turns in proteins by support vector machine.

    Science.gov (United States)

    Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

    2003-01-01

    The tight turn has long been recognized as one of the three important features of proteins, after the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns. Analysis and prediction of beta-turns in particular and tight turns in general are very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper, we introduce a support vector machine (SVM) approach to the prediction and analysis of beta-turns. We have investigated two aspects of applying SVM to the prediction and analysis of beta-turns. First, we developed a new SVM method, called BTSVM, which predicts beta-turns of a protein from its sequence. The prediction results on a dataset of 426 non-homologous protein chains using sevenfold cross-validation showed that our method is superior to previous methods. Second, we analyzed how amino acid positions support (or prevent) the formation of beta-turns based on the "multivariable" classification model of a linear SVM. This model is more general than those of previous statistical methods. Our analysis results are more comprehensive and easier to use than previously published analysis results.

  17. Characterization and Prediction of Protein Phosphorylation Hotspots in Arabidopsis thaliana.

    Science.gov (United States)

    Christian, Jan-Ole; Braginets, Rostyslav; Schulze, Waltraud X; Walther, Dirk

    2012-01-01

    The regulation of protein function by modulating the surface charge status via sequence-locally enriched phosphorylation sites (P-sites) in so-called phosphorylation "hotspots" has gained increased attention in recent years. We set out to identify P-hotspots in the model plant Arabidopsis thaliana. We analyzed the spacing of experimentally detected P-sites within peptide-covered regions along Arabidopsis protein sequences as available from the PhosPhAt database. Confirming earlier reports (Schweiger and Linial, 2010), we found that P-sites indeed tend to cluster and that the distributions of distances from serine and threonine P-sites to their respective closest next P-site differ significantly from those for tyrosine P-sites. The ability to predict P-hotspots by applying available computational P-site prediction programs that focus on identifying single P-sites was observed to be severely compromised by the inevitable interference of nearby P-sites. We devised a new approach, named HotSPotter, for the prediction of phosphorylation hotspots. HotSPotter is based primarily on local amino acid compositional preferences rather than sequence position-specific motifs and uses support vector machines as the underlying classification engine. HotSPotter correctly identified experimentally determined phosphorylation hotspots in A. thaliana with high accuracy. Applied to the Arabidopsis proteome, HotSPotter predicted 13,677 candidate P-hotspots in 9,599 proteins corresponding to 7,847 unique genes. Hotspot-containing proteins are involved predominantly in signaling processes, confirming the surmised modulating role of hotspots in signaling and interaction events. Our study provides new bioinformatics means to identify phosphorylation hotspots and lays the basis for further investigating novel candidate P-hotspots. All phosphorylation hotspot annotations and predictions have been made available as part of the PhosPhAt database at http://phosphat.mpimp-golm.mpg.de.
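
    Two quantities mentioned in this abstract, the spacing between neighbouring P-sites and the local amino acid composition around a site, can be computed as in the sketch below. It is not HotSPotter: the function names, window size and toy sequence are assumptions, and the support vector machine that would consume such features is omitted.

```python
from collections import Counter

def next_site_spacing(positions):
    """Distance from each P-site to the next P-site along the sequence."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def window_composition(seq, center, half_width=5):
    """Fractional amino acid composition in a window around a 0-based position."""
    start = max(0, center - half_width)
    window = seq[start:center + half_width + 1]
    counts = Counter(window)
    return {aa: n / len(window) for aa, n in counts.items()}

# Hypothetical protein fragment and 0-based P-site positions.
seq = "MSDSEKSSPTRSLSYGAA"
p_sites = [3, 7, 1, 13]
print(next_site_spacing(p_sites))
print(window_composition(seq, center=7, half_width=3))
```

    Small spacings indicate clustered sites, and composition vectors of this kind are the sort of position-independent feature the abstract contrasts with sequence position-specific motifs.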

  18. Imaging cells and sub-cellular structures with ultrahigh resolution full-field X-ray microscopy.

    Science.gov (United States)

    Chien, C C; Tseng, P Y; Chen, H H; Hua, T E; Chen, S T; Chen, Y Y; Leng, W H; Wang, C H; Hwu, Y; Yin, G C; Liang, K S; Chen, F R; Chu, Y S; Yeh, H I; Yang, Y C; Yang, C S; Zhang, G L; Je, J H; Margaritondo, G

    2013-01-01

    Our experimental results demonstrate that full-field hard-X-ray microscopy is finally able to investigate the internal structure of cells in tissues. This result was made possible by three main factors: the use of a coherent (synchrotron) source of X-rays, the exploitation of contrast mechanisms based on the real part of the refractive index and the magnification provided by high-resolution Fresnel zone-plate objectives. We specifically obtained high-quality microradiographs of human and mouse cells with 29 nm Rayleigh spatial resolution and verified that tomographic reconstruction could be implemented with a final resolution level suitable for subcellular features. We also demonstrated that a phase retrieval method based on a wave propagation algorithm could yield good subcellular images starting from a series of defocused microradiographs. The concluding discussion compares cellular and subcellular hard-X-ray microradiology with other techniques and evaluates its potential impact on biomedical research. Copyright © 2012 Elsevier Inc. All rights reserved.

  19. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design.

    Directory of Open Access Journals (Sweden)

    Colin A Smith

    Full Text Available Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.

  20. Cell array-based intracellular localization screening reveals novel functional features of human chromosome 21 proteins

    Directory of Open Access Journals (Sweden)

    Kahlem Pascal

    2006-06-01

    Full Text Available Background: Trisomy of human chromosome 21 (Chr21) results in Down's syndrome, a complex developmental and neurodegenerative disease. Molecular analysis of Down's syndrome, however, poses a particular challenge, because the aneuploid region of Chr21 contains many genes of unknown function. Subcellular localization of human Chr21 proteins may contribute to further understanding of the functions and regulatory mechanisms of the genes that code for these proteins. Following this idea, we used a transfected-cell array technique to perform a rapid and cost-effective analysis of the intracellular distribution of Chr21 proteins. Results: We chose 89 genes that were distributed over the majority of 21q, ranging from RBM11 (14.5 Mb) to MCM3AP (46.6 Mb), some of which are expressed aberrantly in the Down's syndrome mouse model. Open reading frames of these genes were cloned into a mammalian expression vector with an amino-terminal His6 tag. All of the constructs were arrayed on glass slides and reverse transfected into HEK293T cells for protein expression. Co-localization detection using a set of organelle markers was carried out for each Chr21 protein. Here, we report the subcellular localization properties of 52 proteins. For 34 of these proteins, their localization is described for the first time. Furthermore, the alteration in cell morphology and growth as a result of protein over-expression for the claudin-8 and claudin-14 genes has been characterized. Conclusion: The cell array-based protein expression and detection approach is a cost-effective platform for large-scale functional analyses, including protein subcellular localization and cell phenotype screening. The results from this study reveal novel functional features of human Chr21 proteins, which should contribute to further understanding of the molecular pathology of Down's syndrome.

  1. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species

    OpenAIRE

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of p...

  2. InterProSurf: a web server for predicting interacting sites on protein surfaces

    Science.gov (United States)

    Negi, Surendra S.; Schein, Catherine H.; Oezguen, Numan; Power, Trevor D.; Braun, Werner

    2009-01-01

    Summary A new web server, InterProSurf, predicts interacting amino acid residues in proteins that are most likely to interact with other proteins, given the 3D structures of subunits of a protein complex. The prediction method is based on solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm to identify surface regions with residues of high interface propensities. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with those regions previously identified as interface regions by sequence analysis and mutagenesis experiments. PMID:17933856

  3. A web server for analysis, comparison and prediction of protein ligand binding sites.

    Science.gov (United States)

    Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

    2016-03-25

    One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify the residues preferred in interaction and the binding motif for a given ligand; for example, the residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing the protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding sites. This module indicates that the ligands ATP, ADP and GTP are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to help users create web logos and two-sample logos of ligand-interacting versus non-interacting residues. In summary, this manuscript presents a web server for the analysis of ligand-interacting residues. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom .
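
    The idea of a residue propensity mentioned in the record above can be illustrated with a minimal Python sketch. It is not the LPIcom implementation: the function name, the "0"/"1" mask encoding and the propensity definition (share of a residue among binding-site positions divided by its share among all positions) are assumptions chosen for illustration.

```python
from collections import Counter


def residue_propensities(sequences, binding_masks):
    """Return a dict mapping each residue to a simple binding propensity.

    Hypothetical definition used here:
        propensity = (share of the residue among binding positions)
                     / (share of the residue among all positions)
    `binding_masks` holds strings of "0"/"1" aligned to each sequence.
    """
    all_counts = Counter()
    site_counts = Counter()
    for seq, mask in zip(sequences, binding_masks):
        if len(seq) != len(mask):
            raise ValueError("sequence and mask must have equal length")
        for residue, flag in zip(seq, mask):
            all_counts[residue] += 1
            if flag == "1":
                site_counts[residue] += 1

    total_all = sum(all_counts.values())
    total_site = sum(site_counts.values())
    if total_site == 0:
        raise ValueError("no binding residues annotated")

    return {
        residue: (site_counts[residue] / total_site)
        / (all_counts[residue] / total_all)
        for residue in all_counts
    }


# Toy data (invented): two short sequences with annotated binding positions.
toy = residue_propensities(["GKGA", "AKLA"], ["1100", "0100"])
```

    A value above 1 means the residue is over-represented at binding positions in the toy data; a real analysis would need a curated, non-redundant set of structures.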

  4. Subcellular SIMS imaging of gadolinium isotopes in human glioblastoma cells treated with a gadolinium containing MRI agent

    Science.gov (United States)

    Smith, Duane R.; Lorey, Daniel R.; Chandra, Subhash

    2004-06-01

    Neutron capture therapy is an experimental binary radiotherapeutic modality for the treatment of brain tumors such as glioblastoma multiforme. Recently, neutron capture therapy with gadolinium-157 has gained attention, and techniques for studying the subcellular distribution of gadolinium-157 are needed. In this preliminary study, we have been able to image the subcellular distribution of gadolinium-157, as well as the other six naturally abundant isotopes of gadolinium, with SIMS ion microscopy. T98G human glioblastoma cells were treated for 24 h with 25 mg/ml of the metal ion complex diethylenetriaminepentaacetic acid Gd(III) dihydrogen salt hydrate (Gd-DTPA). Gd-DTPA is a contrast enhancing agent used for MRI of brain tumors, blood-brain barrier impairment, diseases of the central nervous system, etc. A highly heterogeneous subcellular distribution was observed for gadolinium-157. The nuclei in each cell were distinctly lower in gadolinium-157 than in the cytoplasm. Even within the cytoplasm the gadolinium-157 was heterogeneously distributed. The other six naturally abundant isotopes of gadolinium were imaged from the same cells and exhibited a subcellular distribution consistent with that observed for gadolinium-157. These observations indicate that SIMS ion microscopy may be a viable approach for subcellular studies of gadolinium containing neutron capture therapy drugs and may even play a major role in the development and validation of new gadolinium contrast enhancing agents for diagnostic MRI applications.

  5. Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Seizi Someya

    2010-01-01

    Full Text Available Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs). We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. In a leave-one-out test on these datasets, our method achieved an area under the receiver operating characteristic (ROC) curve of 0.92. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.
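
    For readers unfamiliar with the metric reported above, the area under the ROC curve can be computed directly from classifier scores. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name and toy data are hypothetical. It uses the rank interpretation of the AUC: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 1 (positive) or 0 (negative)
    scores: iterable of classifier outputs, higher = more positive
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example (invented scores): 3 of the 4 positive/negative pairs
# are ordered correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

    The pairwise loop is quadratic in the number of examples; it is meant to make the definition explicit rather than to be efficient.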

  6. Computational methods using weighed-extreme learning machine to predict protein self-interactions with protein evolutionary information.

    Science.gov (United States)

    An, Ji-Yong; Zhang, Lei; Zhou, Yong; Zhao, Yu-Jun; Wang, Da-Fu

    2017-08-18

    Self-interacting proteins (SIPs) are important for biological activity owing to the inherent interactions among their secondary structures or domains. However, because experimental detection of self-interactions is limited, a major challenge in SIP prediction is how to exploit computational approaches that detect SIPs from the evolutionary information contained in protein sequences. In this work, we present a novel computational approach named WELM-LAG, which combines the Weighed-Extreme Learning Machine (WELM) classifier with Local Average Group (LAG) to predict SIPs based on protein sequence. The major improvement of our method lies in an effective feature extraction method that represents candidate self-interacting proteins by exploring the evolutionary information embedded in the PSI-BLAST-constructed position-specific scoring matrix (PSSM), followed by a reliable and robust WELM classifier that carries out the classification. In addition, Principal Component Analysis (PCA) is used to reduce the impact of noise. The WELM-LAG method gave very high average accuracies of 92.94% and 96.74% on yeast and human datasets, respectively. We also compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the human and yeast datasets. Comparative results indicated that our approach is very promising and may provide a cost-effective alternative for predicting SIPs. In addition, we developed a freely available web server called WELM-LAG-SIPs to predict SIPs. The web server is available at http://219.219.62.123:8888/WELMLAG/ .

  7. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

    Directory of Open Access Journals (Sweden)

    Borodovsky Mark

    2006-03-01

    Full Text Available Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that information about homologous proteins is available, and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable

  8. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

    Directory of Open Access Journals (Sweden)

    Bányai László

    2008-08-01

    Full Text Available Background: Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Results: Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON-predicted

  9. Sequence-based feature prediction and annotation of proteins

    DEFF Research Database (Denmark)

    Juncker, Agnieszka; Jensen, Lars J.; Pierleoni, Andrea

    2009-01-01

    A recent trend in computational methods for annotation of protein function is that many prediction tools are combined in complex workflows and pipelines to facilitate the analysis of feature combinations, for example, the entire repertoire of kinase-binding motifs in the human proteome....

  10. Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms

    Science.gov (United States)

    Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei

    2016-01-01

    Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851

  11. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction.

    Science.gov (United States)

    Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin

    2015-01-01

    Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80 percent and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test dataset of 198 proteins, achieving a Q3 accuracy of 80.7 percent and a Sov accuracy of 74.2 percent.
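
    The Q3 figure quoted above is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal Python sketch of that calculation follows; the function name and the toy strings are hypothetical and are not taken from the DNSS software.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted three-state
    secondary structure label (H, E or C) equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have equal length")
    if not observed:
        raise ValueError("empty input")
    allowed = set("HEC")
    if not (set(predicted) <= allowed and set(observed) <= allowed):
        raise ValueError("labels must be H, E or C")

    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example (invented): 7 of 8 residues agree.
toy_q3 = q3_accuracy("HHHEECCC", "HHHEEECC")
```

    Q3 weights every residue equally, so it says nothing about whether whole segments are recovered; segment-based scores such as Sov, also reported in the abstract, address that separately.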

  12. Structure, function and subcellular localization of the potato Resistance protein Rx1

    NARCIS (Netherlands)

    Slootweg, E.J.

    2009-01-01

    Resistance proteins are part of the plant’s immune system and mediate a defence response upon recognizing their cognate pathogens. They are thought to be present in the cell as part of a larger protein complex. The modular architecture of R proteins suggests that they form a scaffold for various

  13. Prediction of membrane transport proteins and their substrate specificities using primary sequence information.

    Directory of Open Access Journals (Sweden)

    Nitish K Mishra

    Full Text Available Membrane transport proteins (transporters) move hydrophilic substrates across hydrophobic membranes and play vital roles in most cellular functions. Transporters represent a diverse group of proteins that differ in topology, energy coupling mechanism, and substrate specificity as well as sequence similarity. Among the functional annotations of transporters, information about their transporting substrates is especially important. The experimental identification and characterization of transporters is currently costly and time-consuming. The development of robust bioinformatics-based methods for the prediction of membrane transport proteins and their substrate specificities is therefore an important and urgent task. Support vector machine (SVM)-based computational models, which comprehensively utilize integrative protein sequence features such as amino acid composition, dipeptide composition, physico-chemical composition, biochemical composition, and position-specific scoring matrices (PSSM), were developed to predict the substrate specificity of seven transporter classes: amino acid, anion, cation, electron, protein/mRNA, sugar, and other transporters. An additional model to differentiate transporters from non-transporters was also developed. Among the developed models, the biochemical composition and PSSM hybrid model outperformed other models and achieved an overall average prediction accuracy of 76.69% with a Matthews correlation coefficient (MCC) of 0.49 and a receiver operating characteristic area under the curve (AUC) of 0.833 on our main dataset. This model also achieved an overall average prediction accuracy of 78.88% and MCC of 0.41 on an independent dataset. Our analyses suggest that evolutionary information (i.e., the PSSM) and the AAIndex are key features for the substrate specificity prediction of transport proteins. In comparison, similarity-based methods such as BLAST, PSI-BLAST, and hidden Markov models do not provide accurate predictions
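
    Two of the sequence features named in the record above, amino acid composition and dipeptide composition, are simple enough to sketch in Python. This is an illustrative sketch only, not the authors' feature pipeline; the function names and the fixed ordering of the 20 standard residues are assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical


def amino_acid_composition(sequence):
    """20-dimensional vector: fraction of each standard residue.
    Non-standard letters are ignored in the counts but still contribute
    to the length, so the vector may then sum to less than 1."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    counts = Counter(sequence)
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """400-dimensional vector: fraction of each ordered residue pair
    among all overlapping adjacent pairs in the sequence."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("need at least two residues")
    pairs = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = len(sequence) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


# Toy sequence (invented): A appears twice in four residues.
aac = amino_acid_composition("ACDA")
dpc = dipeptide_composition("ACDA")
```

    Vectors like these have a fixed length regardless of protein length, which is what allows them to be fed to a classifier such as an SVM.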

  14. Rationally designed synthetic protein hydrogels with predictable mechanical properties.

    Science.gov (United States)

    Wu, Junhua; Li, Pengfei; Dong, Chenling; Jiang, Heting; Xue, Bin; Gao, Xiang; Qin, Meng; Wang, Wei; Chen, Bin; Cao, Yi

    2018-02-12

    Designing synthetic protein hydrogels with tailored mechanical properties similar to naturally occurring tissues is an eternal pursuit in tissue engineering and stem cell and cancer research. However, it remains challenging to correlate the mechanical properties of protein hydrogels with the nanomechanics of individual building blocks. Here we use single-molecule force spectroscopy, protein engineering and theoretical modeling to prove that the mechanical properties of protein hydrogels are predictable based on the mechanical hierarchy of the cross-linkers and the load-bearing modules at the molecular level. These findings provide a framework for rationally designing protein hydrogels with independently tunable elasticity, extensibility, toughness and self-healing. Using this principle, we demonstrate the engineering of self-healable muscle-mimicking hydrogels that can significantly dissipate energy through protein unfolding. We expect that this principle can be generalized for the construction of protein hydrogels with customized mechanical properties for biomedical applications.

  15. NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.

    Directory of Open Access Journals (Sweden)

    Bent Petersen

    Full Text Available β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. CONCLUSION: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.

  16. NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.

    Science.gov (United States)

    Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

    2010-11-30

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
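
    The performance measures quoted in the NetTurnP records (MCC, Qtotal, sensitivity, PPV) all derive from the four cells of a two-class confusion matrix. The sketch below shows the standard formulas in Python; the function name is hypothetical and the counts in the example are invented, not the BT426 results.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Standard two-class measures from confusion-matrix counts.

    tp/fn: turn residues predicted as turn / not turn
    tn/fp: non-turn residues predicted as not turn / turn
    """
    total = tp + fp + tn + fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # Matthews correlation coefficient; defined as 0 when any
        # marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denominator if denominator else 0.0,
        # Overall fraction of correctly classified residues (Qtotal).
        "q_total": (tp + tn) / total,
        # Fraction of true turn residues that were found.
        "sensitivity": tp / (tp + fn),
        # Positive predictive value: fraction of predicted turns
        # that are real.
        "ppv": tp / (tp + fp),
    }


# Invented counts for illustration only.
toy_metrics = two_class_metrics(tp=40, fp=10, tn=40, fn=10)
```

    Because β-turn residues are a minority class (about 25% of residues according to the abstract), MCC is more informative than Qtotal alone: a predictor that never predicts a turn would still score a high Qtotal but an MCC of 0.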

  17. Assimilation and subcellular partitioning of elements by grass shrimp collected along an impact gradient

    International Nuclear Information System (INIS)

    Seebaugh, David R.; Wallace, William G.

    2009-01-01

    Chronic exposure to polluted field conditions can impact metal bioavailability in prey and may influence metal transfer to predators. The present study investigated the assimilation of Cd, Hg and organic carbon by grass shrimp Palaemonetes pugio, collected along an impact gradient within the New York/New Jersey Harbor Estuary. Adult shrimp were collected from five Staten Island, New York study sites, fed 109Cd- or 203Hg-labeled amphipods or 14C-labeled meals and analyzed for assimilation efficiencies (AE). Subsamples of amphipods and shrimp were subjected to subcellular fractionation to isolate metal associated with a compartment presumed to contain trophically available metal (TAM) (metal associated with heat-stable proteins [HSP - e.g., metallothionein-like proteins], heat-denatured proteins [HDP - e.g., enzymes] and organelles [ORG]). TAM-109Cd% and TAM-203Hg% in radiolabeled amphipods were ∼64% and ∼73%, respectively. Gradients in AE-109Cd% (∼54% to ∼75%) and AE-203Hg% (∼61% to ∼78%) were observed for grass shrimp, with the highest values exhibited by shrimp collected from sites within the heavily polluted Arthur Kill complex. Population differences in AE-14C% were not observed. Assimilated 109Cd% partitioned to the TAM compartment in grass shrimp varied between ∼67% and ∼75%. 109Cd bound to HSP in shrimp varied between ∼15% and ∼47%, while 109Cd associated with metal-sensitive HDP was ∼17% to ∼44%. Percentages of assimilated 109Cd bound to ORG were constant at ∼10%. Assimilated 203Hg% associated with TAM in grass shrimp did not exhibit significant variation. Percentages of assimilated 203Hg bound to HDP (∼47%) and ORG (∼11%) did not vary among populations and partitioning of 203Hg to HSP was not observed. Using a simplified biokinetic model of metal accumulation from the diet, it is estimated that site-specific variability in Cd AE by shrimp and tissue Cd burdens in field-collected prey (polychaetes Nereis spp

  18. ALG-2 oscillates in subcellular localization, unitemporally with calcium oscillations

    DEFF Research Database (Denmark)

    la Cour, Jonas Marstrand; Mollerup, Jens; Berchtold, Martin Werner

    2007-01-01

    discovered that the subcellular distribution of a tagged version of ALG-2 could be directed by physiological external stimuli (including ATP, EGF, prostaglandin, histamine), which provoke intracellular Ca2+ oscillations. Cellular stimulation led to a redistribution of ALG-2 from the cytosol to a punctate...

  19. Predictive and comparative analysis of Ebolavirus proteins

    Science.gov (United States)

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395
