WorldWideScience

Sample records for protein structure model

  1. Automated Protein Structure Modeling with SWISS-MODEL Workspace and the Protein Model Portal

    OpenAIRE

    Bordoli, Lorenza; Schwede, Torsten

    2012-01-01

    Comparative protein structure modeling is a computational approach to build three-dimensional structural models for proteins using experimental structures of related protein family members as templates. Regular blind assessments of modeling accuracy have demonstrated that comparative protein structure modeling is currently the most reliable technique to model protein structures. Homology models are often sufficiently accurate to substitute for experimental structures in a wide variety of appl...

  2. Automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal.

    Science.gov (United States)

    Bordoli, Lorenza; Schwede, Torsten

    2012-01-01

    Comparative protein structure modeling is a computational approach to build three-dimensional structural models for proteins using experimental structures of related protein family members as templates. Regular blind assessments of modeling accuracy have demonstrated that comparative protein structure modeling is currently the most reliable technique to model protein structures. Homology models are often sufficiently accurate to substitute for experimental structures in a wide variety of applications. Since the usefulness of a model for a specific application is determined by its accuracy, model quality estimation is an essential component of protein structure prediction. Comparative protein modeling has become a routine approach in many areas of life science research since fully automated modeling systems also allow nonexperts to build reliable models. In this chapter, we describe practical approaches for automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal.

  3. Modeling protein structures: construction and their applications.

    Science.gov (United States)

    Ring, C S; Cohen, F E

    1993-06-01

    Although no general solution to the protein folding problem exists, the three-dimensional structures of proteins are being successfully predicted when experimentally derived constraints are used in conjunction with heuristic methods. In the case of interleukin-4, mutagenesis data and CD spectroscopy were instrumental in the accurate assignment of secondary structure. In addition, the tertiary structure was highly constrained by six cysteines separated by many residues that formed three disulfide bridges. Although the correct structure was a member of a short list of plausible structures, the "best" structure was the topological enantiomer of the experimentally determined conformation. For many proteases, other experimentally derived structures can be used as templates to identify the secondary structure elements. In a procedure called modeling by homology, the structure of a known protein is used as a scaffold to predict the structure of another related protein. This method has been used to model a serine and a cysteine protease that are important in the schistosome and malarial life cycles, respectively. The model structures were then used to identify putative small molecule enzyme inhibitors computationally. Experiments confirm that some of these nonpeptidic compounds are active at concentrations of less than 10 µM.

  4. Fast loop modeling for protein structures

    Science.gov (United States)

    Zhang, Jiong; Nguyen, Son; Shang, Yi; Xu, Dong; Kosztin, Ioan

    2015-03-01

    X-ray crystallography is the main method for determining 3D protein structures. In many cases, however, flexible loop regions of proteins cannot be resolved by this approach. This leads to incomplete structures in the protein data bank, preventing further computational study and analysis of these proteins. For instance, all-atom molecular dynamics (MD) simulation studies of structure-function relationship require complete protein structures. To address this shortcoming, we have developed and implemented an efficient computational method for building missing protein loops. The method is database driven and uses deep learning and multi-dimensional scaling algorithms. We have implemented the method as a simple stand-alone program, which can also be used as a plugin in existing molecular modeling software, e.g., VMD. The quality and stability of the generated structures are assessed and tested via energy scoring functions and by equilibrium MD simulations. The proposed method can also be used in template-based protein structure prediction. Work supported by the National Institutes of Health [R01 GM100701]. Computer time was provided by the University of Missouri Bioinformatics Consortium.

  5. Predicting Protein Secondary Structure with Markov Models

    DEFF Research Database (Denmark)

    Fischer, Paul; Larsen, Simon; Thomsen, Claus

    2004-01-01

    we are considering here, is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained...... in the Markov model for this task. Classifications that are purely based on statistical models might not always be biologically meaningful. We present combinatorial methods to incorporate biological background knowledge to enhance the prediction performance....
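
    The idea summarized in this record can be illustrated with a minimal sketch, assuming one first-order Markov chain per structural class (helix, sheet, coil) and classification of a sequence fragment by the highest log-likelihood. The function names, the add-one smoothing, and the toy training fragments are illustrative assumptions, not the authors' method or data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_markov_chain(fragments, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Estimate first-order transition log-probabilities from sequence fragments.

    A pseudocount is added to every transition (assumed smoothing scheme) so
    that unseen residue pairs do not produce a probability of zero.
    """
    counts = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for frag in fragments:
        for prev, cur in zip(frag, frag[1:]):
            counts[prev][cur] += 1.0
    log_probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        log_probs[a] = {b: math.log(counts[a][b] / total) for b in alphabet}
    return log_probs


def log_likelihood(fragment, log_probs):
    """Sum the transition log-probabilities along a fragment."""
    return sum(log_probs[p][c] for p, c in zip(fragment, fragment[1:]))


def classify(fragment, models):
    """Return the class label whose Markov chain scores the fragment highest."""
    return max(models, key=lambda label: log_likelihood(fragment, models[label]))


# Hypothetical toy training fragments, one list per structural class.
training = {
    "helix": ["AEELLKKA", "EELLKKLA", "ALEELLKK"],
    "sheet": ["VIVTVYVT", "TVYVIVTV", "YVTVIVYV"],
    "coil": ["GPSGNGPD", "PDGSGNPG", "NGPGSDGP"],
}
models = {label: train_markov_chain(frags) for label, frags in training.items()}
prediction = classify("EELLKK", models)
```

    In practice a sliding window over an unknown sequence would be classified position by position; the sketch scores a single fragment to keep the mechanics visible.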

  6. A hidden Markov model derived structural alphabet for proteins.

    Science.gov (United States)

    Camproux, A C; Gautier, R; Tufféry, P

    2004-06-04

    Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states) four residues in length. This approach simultaneously learns the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain into 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well with previous knowledge related to protein architecture organisation and seems able to capture some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction.
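
    The root-mean-square deviation used in this record as the reconstruction accuracy measure can be sketched as follows. This is a minimal example that assumes the two coordinate sets are already superimposed and paired atom by atom; it performs no optimal superposition, and the function name and sample coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes both structures are already in the same frame of reference;
    no rotation or translation fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: the second set is the first shifted by 1 unit in z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
rebuilt = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(reference, rebuilt)
```

    Real comparisons normally superimpose the structures first, so a plain coordinate RMSD like this one is an upper bound on the fitted value.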

  7. The Protein Model Portal--a comprehensive resource for protein structure and model information.

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org.

  8. The Protein Model Portal—a comprehensive resource for protein structure and model information

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org PMID:23624946

  9. Predicting nucleic acid binding interfaces from structural models of proteins.

    Science.gov (United States)

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2012-02-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.

  10. Models of protein-ligand crystal structures: trust, but verify.

    Science.gov (United States)

    Deller, Marc C; Rupp, Bernhard

    2015-09-01

    X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.

  11. A generative, probabilistic model of local protein structure

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Mardia, Kanti V.; Taylor, Charles C.

    2008-01-01

    Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative...... conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state...

  12. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza

    2017-02-08

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
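
    The angular representation discussed in this record rests on backbone dihedral angles, which are circular quantities. The following minimal sketch shows how a dihedral angle can be computed from four points and why an ordinary arithmetic mean is unsuitable for such data. It is not the trigonometric-spline estimator of the abstract; the function names and sample points are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four consecutive points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def circular_mean_deg(angles_deg):
    """Mean direction of angles, computed on the unit circle."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))


# Hypothetical points in a planar cis arrangement.
cis_angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
# 170 and -170 degrees are 20 degrees apart on the circle; their arithmetic
# mean is 0, whereas the circular mean lies at the +/-180 degree boundary.
wrapped_mean = circular_mean_deg([170.0, -170.0])
```

    Density estimation on Ramachandran plots has to respect the same wrap-around at ±180 degrees, which is the motivation for periodic basis functions in the abstract.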

  13. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z.; Gao, Xin

    2017-01-01

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  14. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel

    2012-08-02

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. RESULTS: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by...

  15. A resource for benchmarking the usefulness of protein structure models.

    Science.gov (United States)

    Carbajo, Daniel; Tramontano, Anna

    2012-08-02

    Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.

  16. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel; Tramontano, Anna

    2012-01-01

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. RESULTS: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by...

  17. Mass Spectrometry Coupled Experiments and Protein Structure Modeling Methods

    Directory of Open Access Journals (Sweden)

    Lee Sael

    2013-10-01

    With the accumulation of next generation sequencing data, there is increasing interest in the study of intra-species difference in molecular biology, especially in relation to disease analysis. Furthermore, the dynamics of the protein is being identified as a critical factor in its function. Although the accuracy of protein structure prediction methods is high, provided there are structural templates, most methods are still insensitive to amino-acid differences at critical points that may change the overall structure. Also, predicted structures are inherently static and do not provide information about structural change over time. It is challenging to address the sensitivity and the dynamics by computational structure predictions alone. However, with the fast development of diverse mass spectrometry coupled experiments, low-resolution but fast and sensitive structural information can be obtained. This information can then be integrated into the structure prediction process to further improve the sensitivity and address the dynamics of the protein structures. For this purpose, this article focuses on reviewing two aspects: the types of mass spectrometry coupled experiments and structural data that are obtainable through those experiments; and the structure prediction methods that can utilize these data as constraints. Also, a short review of current efforts in integrating experimental data in the structural modeling is provided.

  18. A resource for benchmarking the usefulness of protein structure models

    Directory of Open Access Journals (Sweden)

    Carbajo, Daniel

    2012-08-01

    Background: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. Results: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any...

  19. Classification of proteins: available structural space for molecular modeling.

    Science.gov (United States)

    Andreeva, Antonina

    2012-01-01

    The wealth of available protein structural data provides unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas others utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution with a special focus on the exceptions to them are presented.

  20. Binding free energy analysis of protein-protein docking model structures by evERdock.

    Science.gov (United States)

    Takemura, Kazuhiro; Matubayasi, Nobuyuki; Kitao, Akio

    2018-03-14

    To aid the evaluation of protein-protein complex model structures generated by protein docking prediction (decoys), we previously developed a method to calculate the binding free energies for complexes. The method combines a short (2 ns) all-atom molecular dynamics simulation with explicit solvent and solution theory in the energy representation (ER). We showed that this method successfully selected structures similar to the native complex structure (near-native decoys) as the lowest binding free energy structures. In our current work, we applied this method (evERdock) to 100 or 300 model structures of four protein-protein complexes. The crystal structures and the near-native decoys showed the lowest binding free energy of all the examined structures, indicating that evERdock can successfully evaluate decoys. Several decoys that show low interface root-mean-square distance but relatively high binding free energy were also identified. Analysis of the fraction of native contacts, hydrogen bonds, and salt bridges at the protein-protein interface indicated that these decoys were insufficiently optimized at the interface. After optimizing the interactions around the interface by including interfacial water molecules, the binding free energies of these decoys were improved. We also investigated the effect of solute entropy on binding free energy and found that consideration of the entropy term does not necessarily improve the evaluations of decoys using the normal mode analysis for entropy calculation.
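
    The fraction of native contacts mentioned in this record can be sketched as follows. This is a minimal illustration that assumes one representative point per residue and a hypothetical 5.0 Å distance cutoff for defining a contact; the abstract does not state the contact definition used by the authors, and all names and coordinates below are made up for the example.

```python
import math


def interface_contacts(chain_a, chain_b, cutoff=5.0):
    """Return the set of (i, j) index pairs whose points lie within the cutoff.

    chain_a and chain_b are lists of (x, y, z) points, one per residue
    (assumed representation); cutoff is an assumed distance threshold.
    """
    return {
        (i, j)
        for i, pa in enumerate(chain_a)
        for j, pb in enumerate(chain_b)
        if math.dist(pa, pb) <= cutoff
    }


def fraction_native_contacts(native_contacts, model_contacts):
    """Share of native interface contacts that are reproduced in the model."""
    if not native_contacts:
        raise ValueError("the native structure has no interface contacts")
    return len(native_contacts & model_contacts) / len(native_contacts)


# Hypothetical two-residue chains: the decoy keeps one of two native contacts.
receptor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
native_ligand = [(3.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
decoy_ligand = [(3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]

native = interface_contacts(receptor, native_ligand)
decoy = interface_contacts(receptor, decoy_ligand)
fnat = fraction_native_contacts(native, decoy)
```

    A decoy can have a low interface distance yet a modest contact fraction, which is the kind of discrepancy the abstract attributes to insufficient optimization at the interface.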

  1. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    Science.gov (United States)

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining ¹Hᴺ, ¹³C, and ¹⁵N backbone and ¹³Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than those of models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without the need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.

  2. Quality assessment of protein model-structures based on structural and functional similarities.

    Science.gov (United States)

    Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata

    2012-09-21

    Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists can automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and ...

  3. Building alternate protein structures using the elastic network model.

    Science.gov (United States)

    Yang, Qingyi; Sharp, Kim A

    2009-02-15

    We describe a method for efficiently generating ensembles of alternate, all-atom protein structures that (a) differ significantly from the starting structure, (b) have good stereochemistry (bonded geometry), and (c) have good steric properties (absence of atomic overlap). The method uses reconstruction from a series of backbone framework structures that are obtained from a modified elastic network model (ENM) by perturbation along low-frequency normal modes. To ensure good quality backbone frameworks, the single force parameter ENM is modified by introducing two more force parameters to characterize the interaction between the consecutive carbon alphas and those within the same secondary structure domain. The relative stiffness of the three parameters is parameterized to reproduce B-factors, while maintaining good bonded geometry. After parameterization, violations of experimental Cα-Cα distances and Cα-Cα-Cα pseudo angles along the backbone are reduced to less than 1%. Simultaneously, the average B-factor correlation coefficient improves to R = 0.77. Two applications illustrate the potential of the approach. (1) 102,051 protein backbones spanning a conformational space of 15 Å root mean square deviation were generated from 148 nonredundant proteins in the PDB database, and all-atom models with minimal bonded and nonbonded violations were produced from this ensemble of backbone structures using the SCWRL side chain building program. (2) Improved backbone templates for homology modeling. Fifteen query sequences were each modeled on two targets. For each of the 30 target frameworks, dozens of improved templates could be produced. In all cases, improved full atom homology models resulted, of which 50% could be identified blind using the D-Fire statistical potential. © 2008 Wiley-Liss, Inc.

  4. Pushing the frontiers of atomic models for protein tertiary structure ...

    Indian Academy of Sciences (India)

    as an NP complete or NP hard problem [4,5]. This notwithstanding, the dire need for tertiary structures of proteins in drug discovery and other areas [6–8] has propelled the development of a multitude of computational recipes. In this article, we focus on ab initio/de novo strategies, Bhageerath in particular, for protein tertiary ...

  5. Modeling membrane protein structure through site-directed ESR spectroscopy

    NARCIS (Netherlands)

    Kavalenka, A.A.

    2009-01-01

    Site-directed spin labeling (SDSL) electron spin resonance (ESR) spectroscopy is a relatively new biophysical tool for obtaining structural information about proteins. This thesis presents a novel approach, based on powerful spectral analysis techniques (multicomponent spectral ...

  6. Compare local pocket and global protein structure models by small structure patterns

    KAUST Repository

    Cui, Xuefeng

    2015-09-09

    Researchers have proposed several criteria to assess the quality of predicted protein structures because it is one of the essential tasks in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competitions. Popular criteria include root mean squared deviation (RMSD), MaxSub score, TM-score, GDT-TS and GDT-HA scores. All these criteria require calculation of rigid transformations to superimpose the predicted protein structure onto the native protein structure. Yet how to obtain the optimal rigid transformations is either unknown or has high time complexity; hence, heuristic algorithms have been proposed. In this work, we carefully design various small structure patterns, including ones specifically tuned for local pockets. Such structure patterns are biologically meaningful, and address the issue that existing methods rely on a sufficient number of backbone residue fragments. We sample the rigid transformations from these small structure patterns, and the optimal superpositions yielded by these small structures are refined and reported. As a result, among 11,669 pairs of predicted and native local protein pocket models from the CASP10 dataset, the GDT-TS scores calculated by our method are significantly higher than those calculated by LGA. Moreover, our program is computationally much more efficient. Source codes and executables are publicly available at http://www.cbrc.kaust.edu.sa/prosta/
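
    As a rough illustration of two of the criteria named in this abstract, the Python sketch below computes RMSD and a GDT-TS-style score from per-residue distances between equivalent Cα atoms. It assumes one superposition has already been fixed; the actual GDT-TS reports the best fraction found at each cutoff over many superpositions, which is the search problem the abstract is about. The function names and the example distances are hypothetical and are not taken from the record.

```python
import math


def rmsd(distances):
    """Root mean squared deviation over per-residue distances (in Å)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_score(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean percentage of residues within each cutoff for ONE fixed superposition.

    The default cutoffs are the GDT-TS thresholds; passing
    (0.5, 1.0, 2.0, 4.0) gives the stricter GDT-HA variant.
    """
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances (Å) between four equivalent residue pairs.
example = [0.5, 1.5, 3.0, 9.0]
print(round(rmsd(example), 2))  # 4.81
print(gdt_score(example))       # 56.25
```

    The example shows why the two measures can disagree: a single outlier at 9 Å dominates the RMSD, while the GDT-style score only records that this residue falls outside every cutoff.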

  7. Connecting Protein Structure to Intermolecular Interactions: A Computer Modeling Laboratory

    Science.gov (United States)

    Abualia, Mohammed; Schroeder, Lianne; Garcia, Megan; Daubenmire, Patrick L.; Wink, Donald J.; Clark, Ginevra A.

    2016-01-01

    An understanding of protein folding relies on a solid foundation of a number of critical chemical concepts, such as molecular structure, intra-/intermolecular interactions, and relating structure to function. Recent reports show that students struggle on all levels to achieve these understandings and use them in meaningful ways. Further, several…

  8. Structural model of dodecameric heat-shock protein Hsp21

    DEFF Research Database (Denmark)

    Rutsdottir, Gudrun; Härmark, Johan; Weide, Yoran

    2017-01-01

    for investigating structure-function relationships of Hsp21 and understanding these sequence variations, we developed a structural model of Hsp21 based on homology modeling, cryo-EM, cross-linking mass spectrometry, NMR, and small-angle X-ray scattering. Our data suggest a dodecameric arrangement of two trimer...

  9. Conformational Sampling in Template-Free Protein Loop Structure Modeling: An Overview

    OpenAIRE

    Li, Yaohang

    2013-01-01

    Accurately modeling protein loops is an important step to predict three-dimensional structures as well as to understand functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a “mini protein folding problem” under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances of computational methods as well as stably increas...

  10. Formulation of probabilistic models of protein structure in atomic detail using the reference ratio method

    DEFF Research Database (Denmark)

    Valentin, Jan B.; Andreetta, Christian; Boomsma, Wouter

    2014-01-01

    We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length s....... The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications. © 2013 Wiley Periodicals, Inc....

  11. Compare local pocket and global protein structure models by small structure patterns

    KAUST Repository

    Cui, Xuefeng; Kuwahara, Hiroyuki; Li, Shuai Cheng; Gao, Xin

    2015-01-01

    Researchers proposed several criteria to assess the quality of predicted protein structures because it is one of the essential tasks in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competitions. Popular criteria

  12. The Protein Model Portal

    OpenAIRE

    Arnold, Konstantin; Kiefer, Florian; Kopp, Jürgen; Battey, James N. D.; Podvinec, Michael; Westbrook, John D.; Berman, Helen M.; Bordoli, Lorenza; Schwede, Torsten

    2008-01-01

    Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionary related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploratio...

  13. Hidden Markov model-derived structural alphabet for proteins: the learning of protein local shapes captures sequence specificity.

    Science.gov (United States)

    Camproux, A C; Tufféry, P

    2005-08-05

    Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence.

  14. Conformational sampling in template-free protein loop structure modeling: an overview.

    Science.gov (United States)

    Li, Yaohang

    2013-01-01

    Accurately modeling protein loops is an important step to predict three-dimensional structures as well as to understand functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a "mini protein folding problem" under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances of computational methods as well as stably increasing number of known structures available in PDB. This mini review provides an overview on the recent computational approaches for loop structure modeling. In particular, we focus on the approaches of sampling loop conformation space, which is a critical step to obtain high resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized.

  15. CONFORMATIONAL SAMPLING IN TEMPLATE-FREE PROTEIN LOOP STRUCTURE MODELING: AN OVERVIEW

    Directory of Open Access Journals (Sweden)

    Yaohang Li

    2013-02-01

    Accurately modeling protein loops is an important step to predict three-dimensional structures as well as to understand functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a “mini protein folding problem” under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances of computational methods as well as stably increasing number of known structures available in PDB. This mini review provides an overview on the recent computational approaches for loop structure modeling. In particular, we focus on the approaches of sampling loop conformation space, which is a critical step to obtain high resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized.

  16. Protein structure modelling and evaluation based on a 4-distance description of side-chain interactions

    Directory of Open Access Journals (Sweden)

    Inbar Yuval

    2010-07-01

    Background: Accurate evaluation and modelling of residue-residue interactions within and between proteins is a key aspect of computational structure prediction including homology modelling, protein-protein docking, refinement of low-resolution structures, and computational protein design. Results: Here we introduce a method for accurate protein structure modelling and evaluation based on a novel 4-distance description of residue-residue interaction geometry. Statistical 4-distance preferences were extracted from high-resolution protein structures and were used as a basis for a knowledge-based potential, called Hunter. We demonstrate that the 4-distance description of side chain interactions can be used reliably to discriminate the native structure from a set of decoys. Hunter ranked the native structure as the top one in 217 out of 220 high-resolution decoy sets, in 25 out of 28 "Decoys 'R' Us" decoy sets and in 24 out of 27 high-resolution CASP7/8 decoy sets. The same concept was applied to side chain modelling in protein structures. On a set of very high-resolution protein structures the average RMSD was 1.47 Å for all residues and 0.73 Å for buried residues, which is in the range of attainable accuracy for a model. Finally, we show that Hunter performs as well as or better than other top methods in homology modelling based on results from the CASP7 experiment. The supporting web site http://bioinfo.weizmann.ac.il/hunter/ was developed to enable the use of Hunter and for visualization and interactive exploration of 4-distance distributions. Conclusions: Our results suggest that Hunter can be used as a tool for evaluation and for accurate modelling of residue-residue interactions in protein structures. The same methodology is applicable to other areas involving high-resolution modelling of biomolecules.

  17. Formulation of probabilistic models of protein structure in atomic detail using the reference ratio method.

    Science.gov (United States)

    Valentin, Jan B; Andreetta, Christian; Boomsma, Wouter; Bottaro, Sandro; Ferkinghoff-Borg, Jesper; Frellsen, Jes; Mardia, Kanti V; Tian, Pengfei; Hamelryck, Thomas

    2014-02-01

    We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications. Copyright © 2013 Wiley Periodicals, Inc.

  18. MyPMFs: a simple tool for creating statistical potentials to assess protein structural models.

    Science.gov (United States)

    Postic, Guillaume; Hamelryck, Thomas; Chomilier, Jacques; Stratmann, Dirk

    2018-05-29

    Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs. Copyright © 2018. Published by Elsevier B.V.
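
    The idea of a statistical potential of mean force can be sketched with the generic inverse Boltzmann relation, E = -kT ln(p_obs / p_ref), applied bin by bin to a distance histogram. The Python sketch below is only an illustration of that general relation: the function name, the choice of reference counts, and the handling of empty bins are assumptions of this sketch and do not describe how MyPMFs itself is implemented.

```python
import math


def potential_of_mean_force(observed_counts, reference_counts, kT=1.0):
    """Inverse Boltzmann energies per distance bin, in units of kT.

    observed_counts:  how often a residue pair type falls in each bin.
    reference_counts: counts for the chosen reference state (an assumption
                      here; real tools differ in how they define it).
    Bins with a zero count on either side are returned as None, since the
    logarithm is undefined there.
    """
    obs_total = sum(observed_counts)
    ref_total = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            energies.append(None)
            continue
        p_obs = obs / obs_total
        p_ref = ref / ref_total
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Hypothetical two-bin histogram: the pair is depleted in bin 0 and
# enriched in bin 1 relative to the reference.
print(potential_of_mean_force([10, 30], [20, 20]))
```

    Bins where a contact is observed more often than in the reference get a negative (favorable) energy, and depleted bins get a positive one.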

  19. Modeling Protein Structures in Feed and Seed Tissues Using Novel Synchrotron-Based Analytical Technique

    International Nuclear Information System (INIS)

    Yu, P.

    2008-01-01

    Traditional 'wet' chemical analyses usually look for a specific known component (such as protein) through homogenization and separation of the components of interest from the complex tissue matrix. Traditional 'wet' chemical analyses rely heavily on the use of harsh chemicals and derivatization, therefore altering the native feed protein structures and possibly generating artifacts. The objective of this study was to introduce a novel and non-destructive method to estimate protein structures in feed and seeds within intact tissues using advanced synchrotron-based infrared microspectroscopy (SFTIRM). The experiments were performed at the National Synchrotron Light Source in Brookhaven National Laboratory (US Dept. of Energy, NY). The results show that with synchrotron-based SFTIRM, we are able to localize relatively 'pure' protein without destruction of the feed and seed tissues and qualify protein internal structures in terms of the proportions and ratios of α-helix, β-sheet, random coil and β-turns on a relative basis using multi-peak modeling procedures. These protein structure profiles (α-helix, β-sheet, etc.) may influence protein quality and availability in animals. Several examples of feed and seeds were provided. The implications of this study are that we can use this new method to compare internal protein structures between feeds and between seed varieties. We can also use this method to detect heat-induced structural changes of protein in feeds.

  20. From the Protein's Perspective: The Benefits and Challenges of Protein Structure-Based Pharmacophore Modeling

    NARCIS (Netherlands)

    Sanders, M.P.A.; McGuire, R; Roumen, L.; de Esch, I.J.P.; de Vlieg, J; Klomp, J.P.G; de Graaf, C.

    2011-01-01

    A pharmacophore describes the arrangement of molecular features a ligand must contain to efficaciously bind a receptor. Pharmacophore models are developed to improve molecular understanding of ligand-protein interactions, and can be used as a tool to identify novel compounds that fulfil the

  1. CONFOLD2: improved contact-driven ab initio protein structure modeling.

    Science.gov (United States)

    Adhikari, Badri; Cheng, Jianlin

    2018-01-25

    Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted contacts need to be developed. We develop an improved contact-driven protein modelling method, CONFOLD2, and study how it may be effectively used for ab initio protein structure prediction with predicted contacts as input. It builds models using various subsets of input contacts to explore the fold space under the guidance of a soft square energy function, and then clusters the models to obtain the top five models. CONFOLD2 obtains an average reconstruction accuracy of 0.57 TM-score for the 150 proteins in the PSICOV contact prediction dataset. When benchmarked on the CASP11 contacts predicted using CONSIP2 and CASP12 contacts predicted using Raptor-X, CONFOLD2 achieves a mean TM-score of 0.41 on both datasets. CONFOLD2 can quickly generate the top five structural models for a protein sequence when its predicted secondary structure and contacts are at hand. The source code of CONFOLD2 is publicly available at https://github.com/multicom-toolbox/CONFOLD2/ .
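
    For readers unfamiliar with the TM-score values quoted in this abstract, the Python sketch below evaluates the standard TM-score sum, (1/L) Σ 1 / (1 + (d_i / d0)²) with d0 = 1.24 (L − 15)^(1/3) − 1.8, for one fixed set of aligned-residue distances. The published score is the maximum of this sum over superpositions, which the sketch does not attempt. The function names and the 0.5 Å floor on d0 for short chains are assumptions of this sketch, and nothing here is taken from the CONFOLD2 code.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (Å) used to normalise TM-score."""
    if target_length <= 21:
        # Assumed guard: the formula drops below 0.5 Å (and is undefined
        # for L <= 15 with a real cube root), so short chains use 0.5 Å.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score sum for ONE fixed superposition.

    distances:     distances (Å) for aligned residue pairs only.
    target_length: length of the target (native) protein; residues with no
                   aligned partner contribute zero to the sum.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect model of a 100-residue target scores 1.0; covering only half
# of the target perfectly scores 0.5.
print(tm_score([0.0] * 100, 100))
print(tm_score([0.0] * 50, 100))
```

    Because the score is normalised by the target length and d0 grows with it, values such as the 0.57 and 0.41 reported above are comparable across proteins of different sizes.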

  2. MOLECULAR MODELING INDICATES THAT HOMOCYSTEINE INDUCES CONFORMATIONAL CHANGES IN THE STRUCTURE OF PUTATIVE TARGET PROTEINS

    Directory of Open Access Journals (Sweden)

    Yumnam Silla

    2015-09-01

    An elevated level of homocysteine, a reactive thiol-containing amino acid, is associated with a multitude of complex diseases. A majority (>80%) of homocysteine in circulation is bound to protein cysteine residues. Although to date only 21 proteins have been experimentally shown to bind homocysteine, using an in silico approach we had earlier identified several potential target proteins that could bind homocysteine. S-homocysteinylation of proteins could potentially alter the structure and/or function of the protein. Earlier studies have shown that binding of homocysteine to a protein alters its function. However, the effect of homocysteine on the target protein structure has not yet been documented. In the present work, we assess conformational or structural changes, if any, due to protein homocysteinylation using two proteins, granzyme B (GRAB) and junctional adhesion molecule 1 (JAM1), which could potentially bind to homocysteine. We, for the first time, constructed computational models of homocysteine bound to target proteins and monitored their structural changes using explicit solvent molecular dynamics (MD) simulation. Analysis of homocysteine-bound trajectories revealed higher flexibility of the active site residues and local structural perturbations compared to the simulation of the unbound native structure, which could affect the stability of the protein. In addition, secondary structure analysis of homocysteine-bound trajectories also revealed disappearance of â-helix within the G-helix and linker region that connects the domain regions (as defined in the crystal structure). Our study thus captures the conformational transitions induced by homocysteine, and we suggest these structural alterations might have implications for hyperhomocysteinemia-induced pathologies.

  3. Computational methods for constructing protein structure models from 3D electron microscopy maps.

    Science.gov (United States)

    Esquivel-Rodríguez, Juan; Kihara, Daisuke

    2013-10-01

    Protein structure determination by cryo-electron microscopy (EM) has made significant progress in the past decades. Resolutions of EM maps have been improving as evidenced by recently reported structures that are solved at high resolutions close to 3Å. Computational methods play a key role in interpreting EM data. Among many computational procedures applied to an EM map to obtain protein structure information, in this article we focus on reviewing computational methods that model protein three-dimensional (3D) structures from a 3D EM density map that is constructed from two-dimensional (2D) maps. The computational methods we discuss range from de novo methods, which identify structural elements in an EM map, to structure fitting methods, where known high resolution structures are fit into a low-resolution EM map. A list of available computational tools is also provided. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Structural characterisation of medically relevant protein assemblies by integrating mass spectrometry with computational modelling.

    Science.gov (United States)

    Politis, Argyris; Schmidt, Carla

    2018-03-20

    Structural mass spectrometry with its various techniques is a powerful tool for the structural elucidation of medically relevant protein assemblies. It delivers information on the composition, stoichiometries, interactions and topologies of these assemblies. Most importantly it can deal with heterogeneous mixtures and assemblies which makes it universal among the conventional structural techniques. In this review we summarise recent advances and challenges in structural mass spectrometric techniques. We describe how the combination of the different mass spectrometry-based methods with computational strategies enable structural models at molecular levels of resolution. These models hold significant potential for helping us in characterizing the function of protein assemblies related to human health and disease. In this review we summarise the techniques of structural mass spectrometry often applied when studying protein-ligand complexes. We exemplify these techniques through recent examples from literature that helped in the understanding of medically relevant protein assemblies. We further provide a detailed introduction into various computational approaches that can be integrated with these mass spectrometric techniques. Last but not least we discuss case studies that integrated mass spectrometry and computational modelling approaches and yielded models of medically important protein assembly states such as fibrils and amyloids. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  5. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments.

    Directory of Open Access Journals (Sweden)

    Hahnbeom Park

    Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

  6. Electronic transport on the spatial structure of the protein: Three-dimensional lattice model

    International Nuclear Information System (INIS)

    Sarmento, R.G.; Frazão, N.F.; Macedo-Filho, A.

    2017-01-01

    Highlights: • The electronic transport on the structure of the three-dimensional lattice model of the protein is studied. • The signing of the current–voltage is directly affected by permutations of the weak bonds in the structure. • The semiconductor behavior of the proteins suggests a potential application in the development of novel biosensors. - Abstract: We report a numerical analysis of the electronic transport in a protein chain consisting of thirty-six standard amino acids. The protein chains studied have a three-dimensional structure, which can present itself in three distinct conformations that differ in the presence or absence of thirteen hydrogen bonds. Our theoretical method uses an electronic tight-binding Hamiltonian model, appropriate to describe the protein segments modeled by the amino acid chain. We note that the presence of, and the permutations between, weak bonds in the structure of proteins are directly related to the signing of the current–voltage. Furthermore, the electronic transport depends on the effect of temperature. In addition, we have found semiconductor behavior in the models investigated, which suggests a potential application in the development of novel biosensors for molecular diagnostics.

  7. Electronic transport on the spatial structure of the protein: Three-dimensional lattice model

    Energy Technology Data Exchange (ETDEWEB)

    Sarmento, R.G. [Departamento de Ciências Biológicas, Universidade Federal do Piauí, 64800-000 Floriano, PI (Brazil); Frazão, N.F. [Centro de Educação e Saúde, Universidade Federal de Campina Grande, 581750-000 Cuité, PB (Brazil); Macedo-Filho, A., E-mail: amfilho@gmail.com [Campus Prof. Antonio Geovanne Alves de Sousa, Universidade Estadual do Piauí, 64260-000 Piripiri, PI (Brazil)

    2017-01-30

    Highlights: • The electronic transport on the structure of the three-dimensional lattice model of the protein is studied. • The signing of the current–voltage is directly affected by permutations of the weak bonds in the structure. • The semiconductor behavior of the proteins suggests a potential application in the development of novel biosensors. - Abstract: We report a numerical analysis of the electronic transport in a protein chain consisting of thirty-six standard amino acids. The protein chains studied have a three-dimensional structure, which can present itself in three distinct conformations that differ in the presence or absence of thirteen hydrogen bonds. Our theoretical method uses an electronic tight-binding Hamiltonian model, appropriate to describe the protein segments modeled by the amino acid chain. We note that the presence of, and the permutations between, weak bonds in the structure of proteins are directly related to the signing of the current–voltage. Furthermore, the electronic transport depends on the effect of temperature. In addition, we have found semiconductor behavior in the models investigated, which suggests a potential application in the development of novel biosensors for molecular diagnostics.

  8. Modeling complexes of modeled proteins.

    Science.gov (United States)

    Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A

    2017-03-01

    Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into the acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and that template-based docking is much less sensitive to inaccuracies of protein models than free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.
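
    The Cα RMSD used above to grade model accuracy can be sketched as follows. This is a minimal example that assumes the two coordinate sets are already superposed and list the same residues in the same order; it performs no optimal fitting, and the function name and toy coordinates are made up for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) C-alpha coordinates, in the same units as the input."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom of the model is displaced by 3 Å along z.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
rmsd = ca_rmsd(model, native)  # 3.0
```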

  9. Insulin as a model to teach three-dimensional structure of proteins

    Directory of Open Access Journals (Sweden)

    João Batista Teixeira da Rocha

    2018-02-01

    Proteins are the most ubiquitous macromolecules found in living cells and have innumerable physiological functions. Therefore, it is fundamental to build a solid knowledge of the three-dimensional structure of proteins to better understand the living state. The hierarchical structure of proteins is usually studied in the undergraduate discipline of Biochemistry. Here we describe pedagogical interventions designed to increase preservice teacher chemistry students’ knowledge about protein structure. The activities were carried out using alternative and cheap materials to encourage the application of these simple methodologies by the future teachers in secondary school. From the primary structure of the insulin chains, students had to construct a three-dimensional structure of insulin. After the activities, the students reported an improvement in their previous knowledge about protein structure. The construction of a three-dimensional model together with other activities seems to be an efficient way to promote learning about the structure of proteins among undergraduate students. The methodology used was inexpensive and simple, and it can be used both at the university and in high school.

  10. In Silico Characterization and Structural Modeling of Dermacentor andersoni p36 Immunosuppressive Protein

    Directory of Open Access Journals (Sweden)

    Martin Omulindi Oyugi

    2018-01-01

    Ticks cause approximately $17–19 billion in economic losses to the livestock industry globally. Development of recombinant antitick vaccines is greatly hindered by insufficient knowledge and understanding of proteins expressed by ticks. Ticks secrete immunosuppressant proteins that modulate the host’s immune system during blood feeding; these molecules could be a target for antivector vaccine development. Recombinant p36, a 36 kDa immunosuppressor from the saliva of female Dermacentor andersoni, suppresses T-lymphocyte proliferation in vitro. To identify potential unique structural and dynamic properties responsible for the immunosuppressive function of p36 proteins, this study utilized bioinformatic tools to characterize and model the structure of the D. andersoni p36 protein. Evaluation of the p36 protein family as suitable vaccine antigens predicted a p36 homolog in Rhipicephalus appendiculatus, the tick vector of East Coast fever, with an antigenicity score of 0.7701 that compares well with that of Bm86 (0.7681), the protein antigen that constitutes the commercial tick vaccine Tickgard™. Ab initio modeling of the D. andersoni p36 protein yielded a 3D structure that predicted a conserved antigenic region, which has the potential to bind immunomodulating ligands including glycerol and lactose and is located within an exposed loop, suggesting a likely role in the immunosuppressive function of tick p36 proteins. Laboratory confirmation of these preliminary results is necessary in future studies.

  11. Association of protein structure, protein and carbohydrate subfractions with bioenergy profiles and biodegradation functions in modeled forage

    Science.gov (United States)

    Ji, Cuiying; Zhang, Xuewei; Yu, Peiqiang

    2016-03-01

    The objectives of this study were to detect unique aspects and association of forage protein inherent structure, biological compounds, protein and carbohydrate subfractions, bioenergy profiles, and biodegradation features. In this study, commonly available alfalfa hay from two different source origins (FSO vs. CSO) was used as a modeled forage for inherent structure profile, bioenergy, biodegradation and their association between their structure and bio-functions. The molecular spectral profiles were determined using non-invasive molecular spectroscopy. The parameters included: protein structure amide I group, amide II group and their ratios; protein subfractions (PA1, PA2, PB1, PB2, PC); carbohydrate fractions (CA1, CA2, CA3, CA4, CB1, CB2, CC); biodegradable and undegradable fractions of protein (RDPA2, RDPB1, RDPB2, RDP; RUPA2, RUPB1, RUPB2, RUPC, RUP); biodegradable and undegradable fractions of carbohydrate (RDCA4, RDCB1, RDCB2, RDCB3, RDCHO; RUCA4, RUCB1, RUCB2, RUCB3, RUCC, RUCHO) and bioenergy profiles (tdNDF, tdFA, tdCP, tdNFC, TDN1×, DE3×, ME3×, NEL3×; NEm, NEg). The results show differences in protein and carbohydrate (CHO) subfractions in the moderately degradable true protein fraction (PB1: 502 vs. 420 g/kg CP, P = 0.09), slowly degraded true protein fraction (PB2: 45 vs. 96 g/kg CP, P = 0.02), moderately degradable CHO fraction (CB2: 283 vs. 223 g/kg CHO, P = 0.06) and slowly degraded CHO fraction (CB3: 369 vs. 408 g/kg CHO) between the two sourced origins. As to biodegradable (RD) fractions of protein and CHO in rumen, there were differences in RD of PB1 (417 vs. 349 g/kg CP, P = 0.09), RD of PB2 (29 vs. 62 g/kg CP, P = 0.02), RD of CB2 (251 vs. 198 g/kg DM, P = 0.06), RD of CB3 (236 vs. 261 g/kg CHO, P = 0.08). As to bioenergy profile, there were differences in total digestible nutrient (TDN: 551 vs. 537 g/kg DM, P = 0.06), and metabolic bioenergy (P = 0.095). As to protein molecular structure, there were differences in protein structure 1st

  12. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

    Directory of Open Access Journals (Sweden)

    Borodovsky Mark

    2006-03-01

    Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that information about homologous proteins is available and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. 
Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable
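
    The three-state-per-residue accuracy mentioned in this abstract is the fraction of residues whose predicted state (helix, strand or coil) matches the observed one. The sketch below assumes both assignments are given as equally long strings over the letters H, E and C; the function name and the toy strings are illustrative only.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted secondary-structure state
    (e.g. 'H', 'E' or 'C') equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("state strings must be non-empty and equally long")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example: one of six residues is mispredicted (position 3: H vs. E).
q3 = q3_accuracy("HHHECC", "HHEECC")  # 5/6
```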

  13. Structure modeling of all identified G protein-coupled receptors in the human genome.

    Science.gov (United States)

    Zhang, Yang; Devries, Mark E; Skolnick, Jeffrey

    2006-02-01

    G protein-coupled receptors (GPCRs), encoded by about 5% of human genes, comprise the largest family of integral membrane proteins and act as cell surface receptors responsible for the transduction of endogenous signal into a cellular response. Although tertiary structural information is crucial for function annotation and drug design, there are few experimentally determined GPCR structures. To address this issue, we employ the recently developed threading assembly refinement (TASSER) method to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER modeling does not require solved homologous template structures; moreover, it often refines the structures closer to native. These features are essential for the comprehensive modeling of all human GPCRs when close homologous templates are absent. Based on a benchmarked confidence score, approximately 820 predicted models should have the correct folds. The majority of GPCR models share the characteristic seven-transmembrane helix topology, but 45 ORFs are predicted to have different structures. This is due to GPCR fragments that are predominantly from extracellular or intracellular domains as well as database annotation errors. Our preliminary validation includes the automated modeling of bovine rhodopsin, the only solved GPCR in the Protein Data Bank. With homologous templates excluded, the final model built by TASSER has a global C(alpha) root-mean-squared deviation from native of 4.6 angstroms, with a root-mean-squared deviation in the transmembrane helix region of 2.1 angstroms. Models of several representative GPCRs are compared with mutagenesis and affinity labeling data, and consistent agreement is demonstrated. Structure clustering of the predicted models shows that GPCRs with similar structures tend to belong to a similar functional class even when their sequences are diverse. 
These results demonstrate the usefulness and robustness

  14. Structure modeling of all identified G protein-coupled receptors in the human genome.

    Directory of Open Access Journals (Sweden)

    Yang Zhang

    2006-02-01

    G protein-coupled receptors (GPCRs), encoded by about 5% of human genes, comprise the largest family of integral membrane proteins and act as cell surface receptors responsible for the transduction of endogenous signal into a cellular response. Although tertiary structural information is crucial for function annotation and drug design, there are few experimentally determined GPCR structures. To address this issue, we employ the recently developed threading assembly refinement (TASSER) method to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER modeling does not require solved homologous template structures; moreover, it often refines the structures closer to native. These features are essential for the comprehensive modeling of all human GPCRs when close homologous templates are absent. Based on a benchmarked confidence score, approximately 820 predicted models should have the correct folds. The majority of GPCR models share the characteristic seven-transmembrane helix topology, but 45 ORFs are predicted to have different structures. This is due to GPCR fragments that are predominantly from extracellular or intracellular domains as well as database annotation errors. Our preliminary validation includes the automated modeling of bovine rhodopsin, the only solved GPCR in the Protein Data Bank. With homologous templates excluded, the final model built by TASSER has a global C(alpha) root-mean-squared deviation from native of 4.6 angstroms, with a root-mean-squared deviation in the transmembrane helix region of 2.1 angstroms. Models of several representative GPCRs are compared with mutagenesis and affinity labeling data, and consistent agreement is demonstrated. Structure clustering of the predicted models shows that GPCRs with similar structures tend to belong to a similar functional class even when their sequences are diverse. 
These results demonstrate the usefulness

  15. Constructing a folding model for protein S6 guided by native fluctuations deduced from NMR structures

    International Nuclear Information System (INIS)

    Lammert, Heiko; Noel, Jeffrey K.; Haglund, Ellinor; Onuchic, José N.; Schug, Alexander

    2015-01-01

    The diversity in a set of protein nuclear magnetic resonance (NMR) structures provides an estimate of native state fluctuations that can be used to refine and enrich structure-based protein models (SBMs). Dynamics are an essential part of a protein’s functional native state. The dynamics in the native state are controlled by the same funneled energy landscape that guides the entire folding process. SBMs apply the principle of minimal frustration, drawn from energy landscape theory, to construct a funneled folding landscape for a given protein using only information from the native structure. On an energy landscape smoothed by evolution towards minimal frustration, geometrical constraints, imposed by the native structure, control the folding mechanism and shape the native dynamics revealed by the model. Native-state fluctuations can alternatively be estimated directly from the diversity in the set of NMR structures for a protein. Based on this information, we identify a highly flexible loop in the ribosomal protein S6 and modify the contact map in a SBM to accommodate the inferred dynamics. By taking into account the probable native state dynamics, the experimental transition state is recovered in the model, and the correct order of folding events is restored. Our study highlights how the shared energy landscape connects folding and function by showing that a better description of the native basin improves the prediction of the folding mechanism

  16. A 3D model of the membrane protein complex formed by the white spot syndrome virus structural proteins.

    Directory of Open Access Journals (Sweden)

    Yun-Shiang Chang

    BACKGROUND: Outbreaks of white spot disease have had a large negative economic impact on cultured shrimp worldwide. However, the pathogenesis of the causative virus, WSSV (white spot syndrome virus), is not yet well understood. WSSV is a large enveloped virus. The WSSV virion has three structural layers surrounding its core DNA: an outer envelope, a tegument and a nucleocapsid. In this study, we investigated the protein-protein interactions of the major WSSV structural proteins, including several envelope and tegument proteins that are known to be involved in the infection process. PRINCIPAL FINDINGS: In the present report, we used coimmunoprecipitation and yeast two-hybrid assays to elucidate and/or confirm all the interactions that occur among the WSSV structural (envelope and tegument) proteins VP51A, VP19, VP24, VP26 and VP28. We found that VP51A interacted directly not only with VP26 but also with VP19 and VP24. VP51A, VP19 and VP24 were also shown to have an affinity for self-interaction. Chemical cross-linking assays showed that these three self-interacting proteins could occur as dimers. CONCLUSIONS: From our present results in conjunction with other previously established interactions we construct a 3D model in which VP24 acts as a core protein that directly associates with VP26, VP28, VP38A, VP51A and WSV010 to form a membrane-associated protein complex. VP19 and VP37 are attached to this complex via association with VP51A and VP28, respectively. Through the VP26-VP51C interaction this envelope complex is anchored to the nucleocapsid, which is made of layers of rings formed by VP664. A 3D model of the nucleocapsid and the surrounding outer membrane is presented.

  17. Protein structure analysis using the resonant recognition model and wavelet transforms

    International Nuclear Information System (INIS)

    Fang, Q.; Cosic, I.

    1998-01-01

    An approach based on the resonant recognition model and the discrete wavelet transform is introduced here for characterising proteins' biological function. The protein sequence is converted into a numerical series by assigning the electron-ion interaction potential to each amino acid from N-terminal to C-terminal. A set of peaks is found after performing a wavelet transform on a numerical series representing a group of homologous proteins. These peaks are related to protein structural and functional properties and are named the characteristic vector of that protein group. Furthermore, the amino acids contributing most to a protein's biological functions, the so-called 'hot spot' amino acids, are predicted by the continuous wavelet transform. It is found that the hot spots are clustered around the protein's cleft structure. The wavelet approach provides a novel method for amino acid sequence analysis as well as an expansion of the newly established macromolecular interaction model: the resonant recognition model. Copyright (1998) Australasian Physical and Engineering Sciences in Medicine
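
    The two steps described above, numerical encoding of the sequence followed by a wavelet transform, can be sketched as below. The abstract does not name the wavelet family, so a single level of the Haar discrete wavelet transform is used here as the simplest stand-in. The potential table is a placeholder with invented numbers, not the published electron-ion interaction potential values, and all function names are hypothetical.

```python
import math


def encode_sequence(sequence, potential):
    """Map each residue letter to its numerical potential, N- to C-terminal."""
    return [potential[residue] for residue in sequence]


def haar_dwt_level(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists; the signal length
    must be even."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


# Placeholder values for four residues only; NOT real EIIP data.
TOY_POTENTIAL = {"A": 0.04, "G": 0.01, "L": 0.0, "D": 0.13}

series = encode_sequence("ALGD", TOY_POTENTIAL)
approx, detail = haar_dwt_level(series)
```

    Applying `haar_dwt_level` again to `approx` gives the next coarser scale; large detail coefficients mark positions where the encoded signal changes sharply.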

  18. Identify High-Quality Protein Structural Models by Enhanced K-Means.

    Science.gov (United States)

    Wu, Hongjie; Li, Haiou; Jiang, Min; Chen, Cheng; Lv, Qiang; Wu, Chuang

    2017-01-01

    Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means.
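
    The squared-distance seeding that gives K-means++ its name can be sketched as follows. This is the generic K-means++ initialization, not the authors' implementation: decoys are represented here as plain numeric tuples with Euclidean distance, whereas a real decoy-clustering pipeline would more likely use a structural distance such as RMSD. All names and the toy points are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equally long numeric tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centroids with the K-means++ rule: the first seed is
    uniform at random, and each later seed is drawn with probability
    proportional to its squared distance from the nearest existing seed."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        weights = [min(squared_distance(p, s) for s in seeds) for p in points]
        if sum(weights) == 0:
            raise ValueError("fewer than k distinct points")
        seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds


# Toy "decoys": two tight groups far apart in a 2D feature space.
decoys = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
seeds = kmeans_pp_seeds(decoys, 2, random.Random(0))
```

    Because an already chosen point has weight zero, it is never drawn twice, and distant points are favoured, which spreads the initial centroids across the decoy set before ordinary K-means iterations begin.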

  19. Mixing Energy Models in Genetic Algorithms for On-Lattice Protein Structure Prediction

    Directory of Open Access Journals (Sweden)

    Mahmood A. Rashid

    2013-01-01

    Protein structure prediction (PSP) is computationally a very challenging problem. The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high resolution 20×20 energy model could better capture the behaviour of the actual energy function than a low resolution energy model such as hydrophobic polar (HP). However, the fine grained details of the high resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high resolution energy model for protein structure evaluation but uses a low resolution HP energy model to focus the search on exploring structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significantly lower energy structures compared to the state-of-the-art results.
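
    The low resolution HP energy mentioned above can be sketched as a contact count: each pair of hydrophobic (H) residues that are lattice neighbours but not sequence neighbours contributes -1. The example assumes a simple cubic lattice with unit spacing for brevity; the abstract does not state which lattice the authors use, and the function name and toy conformation are illustrative.

```python
def hp_energy(sequence, coords):
    """HP-model energy of a lattice conformation.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) letters.
    coords: one integer (x, y, z) lattice position per residue.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0
    for i in range(len(sequence)):
        # j starts at i + 2 so that sequence neighbours are skipped.
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                # Unit Manhattan distance means adjacent cubic-lattice sites.
                if sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1:
                    energy -= 1
    return energy


# Four residues folded around a unit square: residues 0 and 3 touch.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
energy = hp_energy("HPPH", square)  # -1
```

    A high resolution 20×20 model would follow the same loop but replace the fixed -1 with a lookup in a residue-pair energy matrix.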

  20. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

    Directory of Open Access Journals (Sweden)

    Lei Jia

    Thermostability issues arising from protein point mutations are a common occurrence in protein engineering. An application that predicts the thermostability of mutants can help guide the decision-making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that contains experimentally measured thermostability data for thousands of protein mutants. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm), were obtained from this database. Folding free energy change calculations from Rosetta, structural information of the point mutations, and amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. The Rosetta-calculated folding free energy change ranked as the most influential feature in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.

  1. Comparative sequence and structural analyses of G-protein-coupled receptor crystal structures and implications for molecular models.

    Directory of Open Access Journals (Sweden)

    Catherine L Worth

    BACKGROUND: Up until recently the only available experimental (high resolution) structure of a G-protein-coupled receptor (GPCR) was that of bovine rhodopsin. In the past few years the determination of GPCR structures has accelerated with three new receptors, as well as squid rhodopsin, being successfully crystallized. All share a common molecular architecture of seven transmembrane helices and can therefore serve as templates for building molecular models of homologous GPCRs. However, despite the common general architecture of these structures key differences do exist between them. The choice of which experimental GPCR structure(s) to use for building a comparative model of a particular GPCR is unclear and without detailed structural and sequence analyses, could be arbitrary. The aim of this study is therefore to perform a systematic and detailed analysis of sequence-structure relationships of known GPCR structures. METHODOLOGY: We analyzed in detail conserved and unique sequence motifs and structural features in experimentally-determined GPCR structures. Deeper insight into specific and important structural features of GPCRs as well as valuable information for template selection has been gained. Using key features a workflow has been formulated for identifying the most appropriate template(s) for building homology models of GPCRs of unknown structure. This workflow was applied to a set of 14 human family A GPCRs suggesting for each the most appropriate template(s) for building a comparative molecular model. CONCLUSIONS: The available crystal structures represent only a subset of all possible structural variation in family A GPCRs. Some GPCRs have structural features that are distributed over different crystal structures or which are not present in the templates suggesting that homology models should be built using multiple templates. This study provides a systematic analysis of GPCR crystal structures and a consistent method for identifying

  2. Comparative sequence and structural analyses of G-protein-coupled receptor crystal structures and implications for molecular models.

    Science.gov (United States)

    Worth, Catherine L; Kleinau, Gunnar; Krause, Gerd

    2009-09-16

    Up until recently the only available experimental (high resolution) structure of a G-protein-coupled receptor (GPCR) was that of bovine rhodopsin. In the past few years the determination of GPCR structures has accelerated with three new receptors, as well as squid rhodopsin, being successfully crystallized. All share a common molecular architecture of seven transmembrane helices and can therefore serve as templates for building molecular models of homologous GPCRs. However, despite the common general architecture of these structures key differences do exist between them. The choice of which experimental GPCR structure(s) to use for building a comparative model of a particular GPCR is unclear and without detailed structural and sequence analyses, could be arbitrary. The aim of this study is therefore to perform a systematic and detailed analysis of sequence-structure relationships of known GPCR structures. We analyzed in detail conserved and unique sequence motifs and structural features in experimentally-determined GPCR structures. Deeper insight into specific and important structural features of GPCRs as well as valuable information for template selection has been gained. Using key features a workflow has been formulated for identifying the most appropriate template(s) for building homology models of GPCRs of unknown structure. This workflow was applied to a set of 14 human family A GPCRs suggesting for each the most appropriate template(s) for building a comparative molecular model. The available crystal structures represent only a subset of all possible structural variation in family A GPCRs. Some GPCRs have structural features that are distributed over different crystal structures or which are not present in the templates suggesting that homology models should be built using multiple templates. This study provides a systematic analysis of GPCR crystal structures and a consistent method for identifying suitable templates for GPCR homology modelling that will

  3. A Self-Assisting Protein Folding Model for Teaching Structural Molecular Biology.

    Science.gov (United States)

    Davenport, Jodi; Pique, Michael; Getzoff, Elizabeth; Huntoon, Jon; Gardner, Adam; Olson, Arthur

    2017-04-04

    Structural molecular biology is now becoming part of the high school science curriculum, thus posing a challenge for teachers who need to convey three-dimensional (3D) structures with conventional text and pictures. In many cases even interactive computer graphics do not go far enough to address these challenges. We have developed a flexible model of the polypeptide backbone using 3D printing technology. With this model we have produced a polypeptide assembly kit to create an idealized model of the Triosephosphate isomerase mutase enzyme (TIM), which forms a structure known as the TIM barrel. This kit has been used in a laboratory practical where students perform a step-by-step investigation into the nature of protein folding, starting with the handedness of amino acids and proceeding to the formation of secondary and tertiary structure. Based on the classroom evidence we collected, we conclude that these models are a valuable and inexpensive resource for teaching structural molecular biology. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Modeling the Structure of SARS 3a Transmembrane Protein Using a ...

    Indian Academy of Sciences (India)

    Modeling the structure of SARS 3a Transmembrane protein using a ... for the implicit membrane molecular dynamics (MD) simulations. ... The coordinates during the simulation were saved every 500 steps, and were used for analysis. ... the pair list for calculation of nonbonded interactions being updated after every 10 steps.

  5. Modeling of the structure of ribosomal protein L1 from the archaeon Haloarcula marismortui

    Science.gov (United States)

    Nevskaya, N. A.; Kljashtorny, V. G.; Vakhrusheva, A. V.; Garber, M. B.; Nikonov, S. V.

    2017-07-01

    The halophilic archaeon Haloarcula marismortui proliferates in the Dead Sea at extremely high salt concentrations (higher than 3 M). This is the only archaeon, for which the crystal structure of the ribosomal 50S subunit was determined. However, the structure of the functionally important side protuberance containing the abnormally negatively charged protein L1 (HmaL1) was not visualized. Attempts to crystallize HmaL1 in the isolated state or as its complex with RNA using normal salt concentrations (≤500 mM) failed. A theoretical model of HmaL1 was built based on the structural data for homologs of the protein L1 from other organisms, and this model was refined by molecular dynamics methods. Analysis of this model showed that the protein HmaL1 can undergo aggregation due to the presence of a cluster of positive charges unique for proteins L1. This cluster is located at the RNA-protein interface, which interferes with the crystallization of HmaL1 and the binding of the latter to RNA.

  6. The Protein Model Portal.

    Science.gov (United States)

    Arnold, Konstantin; Kiefer, Florian; Kopp, Jürgen; Battey, James N D; Podvinec, Michael; Westbrook, John D; Berman, Helen M; Bordoli, Lorenza; Schwede, Torsten

    2009-03-01

    Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionary related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available at http://www.proteinmodelportal.org and from the PSI Structural Genomics Knowledgebase.

  7. Protein structure modeling for CASP10 by multiple layers of global optimization.

    Science.gov (United States)

    Joo, Keehyoung; Lee, Juyong; Sim, Sangjin; Lee, Sun Young; Lee, Kiho; Heo, Seungryong; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2014-02-01

    In the template-based modeling (TBM) category of CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Copyright © 2013 Wiley Periodicals, Inc.

  8. Modeling structure of G protein-coupled receptors in human genome

    KAUST Repository

    Zhang, Yang

    2016-01-26

    G protein-coupled receptors (or GPCRs) are integral transmembrane proteins responsible for various cellular signal transductions. Human GPCR proteins are encoded by 5% of human genes but account for the targets of 40% of the FDA approved drugs. Due to difficulties in crystallization, experimental structure determination remains extremely difficult for human GPCRs, which has been a major barrier in modern structure-based drug discovery. We proposed a new hybrid protocol, GPCR-I-TASSER, to construct GPCR structure models by integrating experimental mutagenesis data with ab initio transmembrane-helix assembly simulations, assisted by the predicted transmembrane-helix interaction networks. The method was tested in recent community-wide GPCRDock experiments and constructed models with a root mean square deviation of 1.26 Å for Dopamine-3 and 2.08 Å for Chemokine-4 receptors in the transmembrane domain regions, which were significantly closer to the native than the best templates available in the PDB. GPCR-I-TASSER has been applied to model all 1,026 putative GPCRs in the human genome, where 923 are found to have correct folds based on the confidence score analysis and mutagenesis data comparison. The successfully modeled GPCRs contain many pharmaceutically important families that do not have previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin and Neuropeptide Y receptors. All the human GPCR models have been made publicly available through the GPCR-HGmod database at http://zhanglab.ccmb.med.umich.edu/GPCR-HGmod/. The results demonstrate new progress on genome-wide structure modeling of transmembrane proteins, which should have a useful impact on the effort of GPCR-targeted drug discovery.

  9. Protein Structure Prediction by Protein Threading

    Science.gov (United States)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  10. Citrate synthase proteins in extremophilic organisms: Studies within a structure-based model

    International Nuclear Information System (INIS)

    Różycki, Bartosz; Cieplak, Marek

    2014-01-01

    We study four citrate synthase homodimeric proteins within a structure-based coarse-grained model. Two of these proteins come from thermophilic bacteria, one from a cryophilic bacterium and one from a mesophilic organism; three are in the closed and two in the open conformations. Even though the proteins belong to the same fold, the model distinguishes the properties of these proteins in a way which is consistent with experiments. For instance, the thermophilic proteins are more stable thermodynamically than their mesophilic and cryophilic homologues, which we observe both in the magnitude of thermal fluctuations near the native state and in the kinetics of thermal unfolding. The level of stability correlates with the average coordination number for amino acid contacts and with the degree of structural compactness. The pattern of positional fluctuations along the sequence in the closed conformation is different than in the open conformation, including within the active site. The modes of correlated and anticorrelated movements of pairs of amino acids forming the active site are very different in the open and closed conformations. Taken together, our results show that the precise location of amino acid contacts in the native structure appears to be a critical element in explaining the similarities and differences in the thermodynamic properties, local flexibility, and collective motions of the different forms of the enzyme

  11. Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction

    Directory of Open Access Journals (Sweden)

    Seong-Gon Kim

    2011-06-01

    Several techniques, such as neural networks, genetic algorithms, decision trees, and other statistical or heuristic methods, have been used in the past to approach the complex non-linear task of predicting the alpha-helices, beta-sheets, and turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained multilayer perceptrons (MLPs) as the likelihood models within a Bayesian inference framework to predict the secondary structures of proteins. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the neural net models and the Bayesian inference process until the posterior secondary structure probability converges.
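
    The windowing and posterior update described in this abstract can be illustrated with a minimal sketch. The function names, the "-" padding character, and the three-state labels are assumptions made for illustration; the record does not give the authors' implementation, and in their method the likelihoods would come from trained MLPs rather than fixed numbers.

```python
def sliding_windows(sequence, window_size, pad="-"):
    """Return one fixed-width window per residue, centered on that residue.

    `pad` is a hypothetical single-character filler for positions that
    fall off either end of the sequence.
    """
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd integer")
    half = window_size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window_size] for i in range(len(sequence))]


def bayes_update(prior, likelihood):
    """Combine a prior and a likelihood over the same states (Bayes' rule)."""
    unnormalized = {state: prior[state] * likelihood[state] for state in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("likelihood assigns zero probability to every state")
    return {state: value / total for state, value in unnormalized.items()}


# Toy usage: windows of width 3 over a short sequence, then one update
# with made-up likelihood values standing in for a network's output.
windows = sliding_windows("ACDE", 3)
posterior = bayes_update(
    {"H": 0.5, "E": 0.25, "T": 0.25},
    {"H": 0.2, "E": 0.4, "T": 0.4},
)
```

    Repeating the update with likelihoods recomputed from the current posterior, until the values stop changing, would mirror the convergence loop the abstract mentions.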

  12. Structure of liposome encapsulating proteins characterized by X-ray scattering and shell-modeling

    International Nuclear Information System (INIS)

    Hirai, Mitsuhiro; Kimura, Ryota; Takeuchi, Kazuki; Hagiwara, Yoshihiko; Kawai-Hirai, Rika; Ohta, Noboru; Igarashi, Noriyuki; Shimuzu, Nobutaka

    2013-01-01

    Wide-angle X-ray scattering data using a third-generation synchrotron radiation source are presented. Lipid liposomes are promising drug delivery systems because they have superior curative effects owing to their high adaptability to a living body. Lipid liposomes encapsulating proteins were constructed and the structures examined using synchrotron radiation small- and wide-angle X-ray scattering (SR-SWAXS). The liposomes were prepared by a sequential combination of natural swelling, ultrasonic dispersion, freeze-thaw, extrusion and spin-filtration. The liposomes were composed of acidic glycosphingolipid (ganglioside), cholesterol and phospholipids. By using shell-modeling methods, the asymmetric bilayer structure of the liposome and the encapsulation efficiency of proteins were determined. Alongside other analytical techniques, SR-SWAXS and shell-modeling methods are shown to be powerful tools for characterizing the in situ structures of lipid liposomes, which are important candidates for drug delivery systems.

  13. Structural characterization of a recombinant fusion protein by instrumental analysis and molecular modeling.

    Directory of Open Access Journals (Sweden)

    Zhigang Wu

    Conbercept is a genetically engineered homodimeric protein for the treatment of wet age-related macular degeneration (wet AMD) that functions by blocking VEGF-family proteins. Its huge, highly variable architecture makes characterization and development of a functional assay difficult. In this study, the primary structure, number of disulfide linkages and glycosylation state of conbercept were characterized by high-performance liquid chromatography, mass spectrometry, and capillary electrophoresis. Molecular modeling was then applied to obtain the spatial structural model of the conbercept-VEGF-A complex, and to study its inter-atomic interactions and dynamic behavior. This work was incorporated into a platform useful for studying the structure of conbercept and its ligand binding functions.

  14. Protein structural model selection by combining consensus and single scoring methods.

    Directory of Open Access Journals (Sweden)

    Zhiquan He

    Quality assessment (QA) for predicted protein structural models is an important and challenging research problem in protein structure prediction. Consensus Global Distance Test (CGDT) methods assess each decoy (predicted structural model) based on its structural similarity to all others in a decoy set and have been proved to work well when good decoys are in a majority cluster. Scoring functions evaluate each single decoy based on its structural properties. Both methods have their merits and limitations. In this paper, we present a novel method called PWCom, which consists of two neural networks applied sequentially to combine CGDT and single model scoring methods such as RW, DDFire and OPUS-Ca. Specifically, for every pair of decoys, the difference of the corresponding feature vectors is input to the first neural network, which predicts whether the two decoys differ significantly in terms of their GDT scores to the native. If yes, the second neural network is used to decide which one of the two is closer to the native structure. The quality score for each decoy in the pool is based on the number of times it wins during the pairwise comparisons. Test results on three benchmark datasets from different model generation methods showed that PWCom significantly improves over consensus GDT and single scoring methods. The QA server (MUFOLD-Server) applying this method in the CASP10 QA category was ranked second in terms of Pearson and Spearman correlation performance.
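
    The pairwise win-counting step described in this abstract can be sketched as follows. This is an illustration only: `compare` is a hypothetical stand-in for the two neural networks (it returns the better decoy, or None when the pair is judged not significantly different), and the toy comparator, scores, and threshold are invented values rather than anything from PWCom.

```python
from itertools import combinations


def rank_by_pairwise_wins(decoys, compare):
    """Count, for each decoy, how many pairwise comparisons it wins.

    `compare(a, b)` must return a, b, or None (no significant difference).
    """
    wins = {name: 0 for name in decoys}
    for a, b in combinations(decoys, 2):
        winner = compare(a, b)
        if winner is not None:
            wins[winner] += 1
    return wins


def make_toy_comparator(scores, threshold):
    """Build a comparator from made-up per-decoy scores (assumption)."""
    def compare(a, b):
        # Stage 1 stand-in: is the pair significantly different?
        if abs(scores[a] - scores[b]) < threshold:
            return None
        # Stage 2 stand-in: which decoy is better?
        return a if scores[a] > scores[b] else b
    return compare


toy_scores = {"d1": 0.80, "d2": 0.50, "d3": 0.52}
wins = rank_by_pairwise_wins(
    list(toy_scores), make_toy_comparator(toy_scores, threshold=0.05)
)
```

    Sorting the decoys by their win counts then gives a ranking; pairs judged indistinguishable contribute nothing to either side.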

  15. An evolutionary model for protein-coding regions with conserved RNA structure

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Forsberg, Roald; Meyer, Irmtraud Margret

    2004-01-01

    Here we present a model of nucleotide substitution in protein-coding regions that also encode the formation of conserved RNA structures. In such regions, apparent evolutionary context dependencies exist, both between nucleotides occupying the same codon and between nucleotides forming a base pair... in the RNA structure. The overlap of these fundamental dependencies is sufficient to cause "contagious" context dependencies which cascade across many nucleotide sites. Such large-scale dependencies challenge the use of traditional phylogenetic models in evolutionary inference because they explicitly assume...... components of traditional phylogenetic models. We applied this to a data set of full-genome sequences from the hepatitis C virus where five RNA structures are mapped within the coding region. This allowed us to partition the effects of selection on different structural elements and to test various hypotheses......

  16. Molecular modelling of the Norrie disease protein predicts a cystine knot growth factor tertiary structure.

    Science.gov (United States)

    Meitinger, T; Meindl, A; Bork, P; Rost, B; Sander, C; Haasemann, M; Murken, J

    1993-12-01

    The X-linked gene for Norrie disease, which is characterized by blindness, deafness and mental retardation, has been cloned recently. This gene has been thought to code for a putative extracellular factor; its predicted amino acid sequence is homologous to the C-terminal domain of diverse extracellular proteins. Sequence pattern searches and three-dimensional modelling now suggest that the Norrie disease protein (NDP) has a tertiary structure similar to that of transforming growth factor beta (TGF beta). Our model identifies NDP as a member of an emerging family of growth factors containing a cystine knot motif, with direct implications for the physiological role of NDP. The model also sheds light on sequence related domains such as the C-terminal domain of mucins and of von Willebrand factor.

  17. Chemical cross-linking and mass spectrometry for protein structural modeling

    NARCIS (Netherlands)

    Back, Jaap Willem; de Jong, Luitzen; Muijsers, Anton O.; de Koster, Chris G.

    2003-01-01

    The growth of gene and protein sequence information is currently so rapid that three-dimensional structural information is lacking for the overwhelming majority of known proteins. In this review, efforts towards rapid and sensitive methods for protein structural characterization are described,

  18. Protein structure modeling and refinement by global optimization in CASP12.

    Science.gov (United States)

    Hong, Seung Hwan; Joung, InSuk; Flores-Canales, Jose C; Manavalan, Balachandran; Cheng, Qianyi; Heo, Seungryong; Kim, Jong Yun; Lee, Sun Young; Nam, Mikyung; Joo, Keehyoung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2018-03-01

    For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded. © 2017 Wiley Periodicals, Inc.
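
    The closing remark about bounded penalties can be made concrete with a short sketch. The functional form below (squared deviation divided by squared deviation plus squared width) is one common Lorentzian-type penalty and is an assumption; the abstract does not state the exact expression, weights, or widths used by the authors. A harmonic penalty is shown only for contrast.

```python
def lorentzian_penalty(distance, target, width=1.0, weight=1.0):
    """Lorentzian-type restraint penalty (assumed form).

    Zero when distance == target and approaches, but never exceeds,
    `weight` as the deviation grows.
    """
    deviation_sq = (distance - target) ** 2
    return weight * deviation_sq / (deviation_sq + width ** 2)


def harmonic_penalty(distance, target, width=1.0, weight=1.0):
    """Harmonic restraint penalty: grows without bound with the deviation."""
    return weight * (distance - target) ** 2 / width ** 2


# A wrongly predicted contact (target 8 units, actual 40 units) costs at
# most `weight` under the Lorentzian form, but dominates a harmonic term.
bounded = lorentzian_penalty(40.0, 8.0)
unbounded = harmonic_penalty(40.0, 8.0)
```

    Because each wrong restraint can contribute at most `weight`, a handful of incorrect predicted contacts cannot outweigh the many correct ones, which is the property the abstract highlights.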

  19. The utility of comparative models and the local model quality for protein crystal structure determination by Molecular Replacement

    Directory of Open Access Journals (Sweden)

    Pawlowski Marcin

    2012-11-01

    Background: Computational models of protein structures have proved useful as search models in Molecular Replacement (MR), a common method to solve the phase problem faced by macromolecular crystallography. The success of MR depends on the accuracy of a search model. Unfortunately, this parameter remains unknown until the final structure of the target protein is determined. During the last few years, several Model Quality Assessment Programs (MQAPs) that predict the local accuracy of theoretical models have been developed. In this article, we analyze whether the application of MQAPs improves the utility of theoretical models in MR. Results: For our dataset of 615 search models, the real local accuracy of a model increases the MR success ratio by 101% compared to corresponding polyalanine templates. In contrast, when local model quality is not utilized in MR, the computational models solved only 4.5% more MR searches than polyalanine templates. For the same dataset of 615 models, a workflow combining MR with the predicted local accuracy of a model found 45% more correct solutions than polyalanine templates. To predict such accuracy, MetaMQAPclust, a “clustering MQAP”, was used. Conclusions: Using comparative models only marginally increases the MR success ratio in comparison to polyalanine structures of templates. However, the situation changes dramatically once comparative models are used together with their predicted local accuracy. A new functionality was added to the GeneSilico Fold Prediction Metaserver in order to build models that are more useful for MR searches. Additionally, we have developed a simple method, AmIgoMR (Am I good for MR?), to predict if an MR search with a template-based model for a given template is likely to find the correct solution.

  20. SNP2Structure: A Public and Versatile Resource for Mapping and Three-Dimensional Modeling of Missense SNPs on Human Protein Structures

    Directory of Open Access Journals (Sweden)

    Difei Wang

    2015-01-01

    One of the long-standing challenges in biology is to understand how non-synonymous single nucleotide polymorphisms (nsSNPs) change protein structure and further affect protein function. While it is impractical to solve all the mutated protein structures experimentally, it is quite feasible to model the mutated structures in silico. Toward this goal, we built a publicly available structure database resource (SNP2Structure, https://apps.icbi.georgetown.edu/snp2structure) focusing on missense mutations, msSNP. Compared with web portals with similar aims, SNP2Structure has the following major advantages. First, our portal offers direct comparison of two related 3D structures. Second, the protein models include all interacting molecules in the original PDB structures, so users are able to determine regions of potential interaction changes when a protein mutation occurs. Third, the mutated structures are available to download locally for further structural and functional analysis. Fourth, we used the JSmol package to display the protein structures, which avoids system compatibility issues. SNP2Structure provides reliable, high quality mapping of nsSNPs to 3D protein structures, enabling researchers to explore the likely functional impact of human disease-causing mutations.

  1. The small heat shock proteins from Acidithiobacillus ferrooxidans: gene expression, phylogenetic analysis, and structural modeling

    Directory of Open Access Journals (Sweden)

    Ribeiro Daniela A

    2011-12-01

    Background: Acidithiobacillus ferrooxidans is an acidophilic, chemolithoautotrophic bacterium that has been successfully used in metal bioleaching. In this study, an analysis of the A. ferrooxidans ATCC 23270 genome revealed the presence of three sHSP genes, Afe_1009, Afe_1437 and Afe_2172, that encode proteins from the HSP20 family, a class of intracellular multimers that is especially important in extremophile microorganisms. Results: The expression of the sHSP genes was investigated in A. ferrooxidans cells submitted to a heat shock at 40°C for 15, 30 and 60 minutes. After 60 minutes, the gene on locus Afe_1437 was about 20-fold more highly expressed than the gene on locus Afe_2172. Bioinformatic and phylogenetic analyses showed that the sHSPs from A. ferrooxidans are possible non-paralogous proteins, and are regulated by the σ32 factor, a common transcription factor of heat shock proteins. Structural studies using homology molecular modeling indicated that the proteins encoded by Afe_1009 and Afe_1437 have a conserved α-crystallin domain and share similar structural features with the sHSP from Methanococcus jannaschii, suggesting that their biological assembly involves 24 molecules and resembles a hollow spherical shell. Conclusion: We conclude that the sHSPs encoded by the Afe_1437 and Afe_1009 genes are more likely to act as molecular chaperones in the A. ferrooxidans heat shock response. In addition, the three sHSPs from A. ferrooxidans are not recent paralogs, and the Afe_1437 and Afe_1009 genes could be inherited horizontally by A. ferrooxidans.

  2. Clustering structures of large proteins using multifractal analyses based on a 6-letter model and hydrophobicity scale of amino acids

    International Nuclear Information System (INIS)

    Yang Jianyi; Yu Zuguo; Anh, Vo

    2009-01-01

    The Schneider and Wrede hydrophobicity scale of amino acids and the 6-letter model of protein are proposed to study the relationship between the primary structure and the secondary structural classification of proteins. Two kinds of multifractal analyses are performed on the two measures obtained from these two kinds of data on large proteins. Nine parameters from the multifractal analyses are considered to construct the parameter spaces. Each protein is represented by one point in these spaces. A procedure is proposed to separate large proteins in the α, β, α + β and α/β structural classes in these parameter spaces. Fisher's linear discriminant algorithm is used to assess our clustering accuracy on the 49 selected large proteins. Numerical results indicate that the discriminant accuracies are satisfactory. In particular, they reach 100.00% and 84.21% in separating the α proteins from the {β, α + β, α/β} proteins in a parameter space; 92.86% and 86.96% in separating the β proteins from the {α + β, α/β} proteins in another parameter space; 91.67% and 83.33% in separating the α/β proteins from the α + β proteins in the last parameter space.

  3. Docking-based modeling of protein-protein interfaces for extensive structural and functional characterization of missense mutations.

    Science.gov (United States)

    Barradas-Bautista, Didier; Fernández-Recio, Juan

    2017-01-01

    Next-generation sequencing (NGS) technologies are providing genomic information for an increasing number of healthy individuals and patient populations. In the context of the large amount of genomic data that is being generated, understanding the effect of disease-related mutations at the molecular level can contribute to closing the gap between genotype and phenotype and thus improve prevention, diagnosis or treatment of a pathological condition. In order to fully characterize the effect of a pathological mutation and have useful information for prediction purposes, it is important first to identify whether the mutation is located at a protein-binding interface, and second to understand the effect on the binding affinity of the affected interaction/s. Computational methods, such as protein docking, are currently used to complement experimental efforts and could help to build the human structural interactome. Here we have extended the original pyDockNIP method to predict the location of disease-associated nsSNPs at protein-protein interfaces, when there is no available structure for the protein-protein complex. We have applied this approach to the pathological interaction networks of six diseases with low structural data on PPIs. This approach can almost double the number of nsSNPs that can be characterized and identify edgetic effects in many nsSNPs that were previously unknown. This can help to annotate and interpret genomic data from large-scale population studies, and to achieve a better understanding of disease at the molecular level.

  4. Docking-based modeling of protein-protein interfaces for extensive structural and functional characterization of missense mutations.

    Directory of Open Access Journals (Sweden)

    Didier Barradas-Bautista

    Next-generation sequencing (NGS) technologies are providing genomic information for an increasing number of healthy individuals and patient populations. In the context of the large amount of genomic data that is being generated, understanding the effect of disease-related mutations at the molecular level can contribute to closing the gap between genotype and phenotype and thus improve prevention, diagnosis or treatment of a pathological condition. In order to fully characterize the effect of a pathological mutation and have useful information for prediction purposes, it is important first to identify whether the mutation is located at a protein-binding interface, and second to understand the effect on the binding affinity of the affected interaction/s. Computational methods, such as protein docking, are currently used to complement experimental efforts and could help to build the human structural interactome. Here we have extended the original pyDockNIP method to predict the location of disease-associated nsSNPs at protein-protein interfaces, when there is no available structure for the protein-protein complex. We have applied this approach to the pathological interaction networks of six diseases with low structural data on PPIs. This approach can almost double the number of nsSNPs that can be characterized and identify edgetic effects in many nsSNPs that were previously unknown. This can help to annotate and interpret genomic data from large-scale population studies, and to achieve a better understanding of disease at the molecular level.

  5. PROCARB: A Database of Known and Modelled Carbohydrate-Binding Protein Structures with Sequence-Based Prediction Tools

    Directory of Open Access Journals (Sweden)

    Adeel Malik

    2010-01-01

    Understanding of the three-dimensional structures of proteins that interact with carbohydrates covalently (glycoproteins) as well as noncovalently (protein-carbohydrate complexes) is essential to many biological processes and plays a significant role in normal and disease-associated functions. It is important to have a central repository of knowledge available about these protein-carbohydrate complexes as well as preprocessed data of predicted structures. This can be significantly enhanced by de novo tools that can predict carbohydrate-binding sites for proteins in the absence of an experimentally known binding-site structure. PROCARB is an open-access database comprising three independently working components, namely, (i) Core PROCARB module, consisting of three-dimensional structures of protein-carbohydrate complexes taken from Protein Data Bank (PDB), (ii) Homology Models module, consisting of manually developed three-dimensional models of N-linked and O-linked glycoproteins of unknown three-dimensional structure, and (iii) CBS-Pred prediction module, consisting of web servers to predict carbohydrate-binding sites using single sequence or server-generated PSSM. Several precomputed structural and functional properties of complexes are also included in the database for quick analysis. In particular, information about function, secondary structure, solvent accessibility, hydrogen bonds and literature reference, and so forth, is included. In addition, each protein in the database is mapped to Uniprot, Pfam, PDB, and so forth.

  6. Structures composing protein domains.

    Science.gov (United States)

    Kubrycht, Jaroslav; Sigler, Karel; Souček, Pavel; Hudeček, Jiří

    2013-08-01

    This review summarizes available data concerning intradomain structures (IS) such as functionally important amino acid residues, short linear motifs, conserved or disordered regions, peptide repeats, broadly occurring secondary structures or folds, etc. IS form structural features (units or elements) necessary for interactions with proteins or non-peptidic ligands, enzyme reactions and some structural properties of proteins. These features have often been related to a single structural level (e.g. primary structure) mostly requiring certain structural context of other levels (e.g. secondary structures or supersecondary folds) as follows also from some examples reported or demonstrated here. In addition, we deal with some functionally important dynamic properties of IS (e.g. flexibility and different forms of accessibility), and more special dynamic changes of IS during enzyme reactions and allosteric regulation. Selected notes concern also some experimental methods, still more necessary tools of bioinformatic processing and clinically interesting relationships. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  7. eMatchSite: sequence order-independent structure alignments of ligand binding pockets in protein models.

    Directory of Open Access Journals (Sweden)

    Michal Brylinski

    2014-09-01

    Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques; however, to achieve a high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome scale applications, where only various quality structure models are available for the majority of gene products. To improve the state-of-the-art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4-9% lower than those aligned using crystal structures. This represents a significant improvement over other algorithms, e.g. the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility to investigate drug-protein interaction networks for complete proteomes with prospective systems-level applications in polypharmacology and rational drug repositioning. eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite.

  8. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    Science.gov (United States)

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein-RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/. Contact: bab@mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by

  9. Accuracy issues involved in modeling in vivo protein structures using PM7.

    Science.gov (United States)

    Martin, Benjamin P; Brandon, Christopher J; Stewart, James J P; Braun-Sand, Sonja B

    2015-08-01

    Using the semiempirical method PM7, an attempt has been made to quantify the error in prediction of the in vivo structure of proteins relative to X-ray structures. Three important contributory factors are the experimental limitations of X-ray structures, the difference between the crystal and solution environments, and the errors due to PM7. The geometries of 19 proteins from the Protein Data Bank that had small R values, that is, high accuracy structures, were optimized and the resulting drop in heat of formation was calculated. Analysis of the changes showed that about 10% of this decrease in heat of formation was caused by faults in PM7, the balance being attributable to the X-ray structure and the difference between the crystal and solution environments. A previously unknown fault in PM7 was revealed during tests to validate the geometries generated using PM7. Clashscores generated by the Molprobity molecular mechanics structure validation program showed that PM7 was predicting unrealistically close contacts between nonbonding atoms in regions where the local geometry is dominated by very weak noncovalent interactions. The origin of this fault was traced to an underestimation of the core-core repulsion between atoms at distances smaller than the equilibrium distance. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published By Wiley Periodicals, Inc.

  10. Validation-driven protein-structure improvement

    NARCIS (Netherlands)

    Touw, W.G.

    2016-01-01

    High-quality protein structure models are essential for many Life Science applications, such as protein engineering, molecular dynamics, drug design, and homology modelling. The WHAT_CHECK model validation project and the PDB_REDO model optimisation project have shown that many structure models in

  11. Structural model for the interaction of a designed Ankyrin Repeat Protein with the human epidermal growth factor receptor 2.

    Directory of Open Access Journals (Sweden)

    V Chandana Epa

    Full Text Available Designed Ankyrin Repeat Proteins are a class of novel binding proteins that can be selected and evolved to bind to targets with high affinity and specificity. We are interested in the DARPin H10-2-G3, which has been evolved to bind with very high affinity to the human epidermal growth factor receptor 2 (HER2). HER2 is found to be over-expressed in 30% of breast cancers, and is the target for the FDA-approved therapeutic monoclonal antibodies trastuzumab and pertuzumab and small molecule tyrosine kinase inhibitors. Here, we use computational macromolecular docking, coupled with several interface metrics such as shape complementarity, interaction energy, and electrostatic complementarity, to model the structure of the complex between the DARPin H10-2-G3 and HER2. We analyzed the interface between the two proteins and then validated the structural model by showing that selected HER2 point mutations at the putative interface with H10-2-G3 reduce the affinity of binding up to 100-fold without affecting the binding of trastuzumab. Comparisons made with a subsequently solved X-ray crystal structure of the complex yielded a backbone atom root mean square deviation of 0.84-1.14 Ångstroms. The study presented here demonstrates the capability of the computational techniques of structural bioinformatics in generating useful structural models of protein-protein interactions.
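
    The backbone root mean square deviation quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes the two structures are already superposed and that atoms are paired by index; the function name rmsd and the coordinates are hypothetical and are not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already superposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each paired atom.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up backbone coordinates (in Å), for illustration only.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
crystal = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(model, crystal))
```

    In practice an optimal superposition (for example by the Kabsch algorithm) is usually applied before the deviation is measured; that step is omitted here.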

  12. Integrating Model-Based Learning and Animations for Enhancing Students' Understanding of Proteins Structure and Function

    Science.gov (United States)

    Barak, Miri; Hussein-Farraj, Rania

    2013-01-01

    This paper describes a study conducted in the context of chemistry education reforms in Israel. The study examined a new biochemistry learning unit that was developed to promote in-depth understanding of 3D structures and functions of proteins and nucleic acids. Our goal was to examine whether, and to what extent teaching and learning via…

  13. Modeling the structure of SARS 3a transmembrane protein using a ...

    Indian Academy of Sciences (India)

    three α-helices has been subjected to MD simulations to examine its quality. The TM bundle was ... of the structure of the channel, however, are yet to be elucidated. ... interactions between the proteins and the lipid bilayer has been studied ...

  14. Investigating energy-based pool structure selection in the structure ensemble modeling with experimental distance constraints: The example from a multidomain protein Pub1.

    Science.gov (United States)

    Zhu, Guanhua; Liu, Wei; Bao, Chenglong; Tong, Dudu; Ji, Hui; Shen, Zuowei; Yang, Daiwen; Lu, Lanyuan

    2018-05-01

    The structural variations of multidomain proteins with flexible parts mediate many biological processes, and a structure ensemble can be determined by selecting a weighted combination of representative structures from a simulated structure pool, producing the best fit to experimental constraints such as interatomic distance. In this study, a hybrid structure-based and physics-based atomistic force field with an efficient sampling strategy is adopted to simulate a model di-domain protein against experimental paramagnetic relaxation enhancement (PRE) data that correspond to distance constraints. The molecular dynamics simulations produce a wide range of conformations depicted on a protein energy landscape. Subsequently, a conformational ensemble recovered with low-energy structures and the minimum-size restraint is identified in good agreement with experimental PRE rates, and the result is also supported by chemical shift perturbations and small-angle X-ray scattering data. It is illustrated that the regularizations of energy and ensemble-size prevent an arbitrary interpretation of protein conformations. Moreover, energy is found to serve as a critical control to refine the structure pool and prevent data overfitting, because the absence of energy regularization exposes ensemble construction to the noise from high-energy structures and causes a more ambiguous representation of protein conformations. Finally, we perform structure-ensemble optimizations with a topology-based structure pool, to enhance the understanding on the ensemble results from different sources of pool candidates. © 2018 Wiley Periodicals, Inc.

  15. The Puf family of RNA-binding proteins in plants: phylogeny, structural modeling, activity and subcellular localization

    Directory of Open Access Journals (Sweden)

    Tam Michael WC

    2010-03-01

    Full Text Available Background: Puf proteins have important roles in controlling gene expression at the post-transcriptional level by promoting RNA decay and repressing translation. The Pumilio homology domain (PUM-HD) is a conserved region within Puf proteins that binds to RNA with sequence specificity. Although Puf proteins have been well characterized in animal and fungal systems, little is known about the structural and functional characteristics of Puf-like proteins in plants. Results: The Arabidopsis and rice genomes code for 26 and 19 Puf-like proteins, respectively, each possessing eight or fewer Puf repeats in their PUM-HD. Key amino acids in the PUM-HD of several of these proteins are conserved with those of animal and fungal homologs, whereas other plant Puf proteins demonstrate extensive variability in these amino acids. Three-dimensional modeling revealed that the predicted structure of this domain in plant Puf proteins provides a suitable surface for binding RNA. Electrophoretic gel mobility shift experiments showed that the Arabidopsis AtPum2 PUM-HD binds with high affinity to BoxB of the Drosophila Nanos Response Element I (NRE1) RNA, whereas a point mutation in the core of the NRE1 resulted in a significant reduction in binding affinity. Transient expression of several of the Arabidopsis Puf proteins as fluorescent protein fusions revealed a dynamic, punctate cytoplasmic pattern of localization for most of these proteins. The presence of predicted nuclear export signals and accumulation of AtPuf proteins in the nucleus after treatment of cells with leptomycin B demonstrated that shuttling of these proteins between the cytosol and nucleus is common among these proteins. In addition to the cytoplasmically enriched AtPum proteins, two AtPum proteins showed nuclear targeting with enrichment in the nucleolus. Conclusions: The Puf family of RNA-binding proteins in plants consists of a greater number of members than any other model species studied to

  16. Global Structural Flexibility of Metalloproteins Regulates Reactivity of Transition Metal Ion in the Protein Core: An Experimental Study Using Thiol-subtilisin as a Model Protein.

    Science.gov (United States)

    Matsuo, Takashi; Kono, Takamasa; Shobu, Isamu; Ishida, Masaya; Gonda, Katsuya; Hirota, Shun

    2018-02-21

    The functions of metal-containing proteins (metalloproteins) are determined by the reactivities of transition metal ions at their active sites. Because protein macromolecular structures have several molecular degrees of freedom, global structural flexibility may also regulate the properties of metalloproteins. However, the influence of this factor has not been fully delineated in mechanistic studies of metalloproteins. Accordingly, we have investigated the relationship between global protein flexibility and the characteristics of a transition metal ion in the protein core using thiol-subtilisin (tSTL) with a Cys-coordinated Cu2+ ion as a model system. Although tSTL has two Ca2+-binding sites, the Ca2+-binding status hardly affects its secondary structure. Nevertheless, guanidinium-induced denaturation and amide H/D exchange indicated the increase in the structural flexibility of tSTL by the removal of bound Ca2+ ions. Electron paramagnetic resonance and absorption spectral changes have revealed that the protein flexibility determines the characteristics of a Cu2+ ion in tSTL. Therefore, global protein flexibility should be recognized as an important factor that regulates the properties of metalloproteins. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Exploring structural variability in X-ray crystallographic models using protein local optimization by torsion-angle sampling

    International Nuclear Information System (INIS)

    Knight, Jennifer L.; Zhou, Zhiyong; Gallicchio, Emilio; Himmel, Daniel M.; Friesner, Richard A.; Arnold, Eddy; Levy, Ronald M.

    2008-01-01

    Torsion-angle sampling, as implemented in the Protein Local Optimization Program (PLOP), is used to generate multiple structurally variable single-conformer models which are in good agreement with X-ray data. An ensemble-refinement approach to differentiate between positional uncertainty and conformational heterogeneity is proposed. Modeling structural variability is critical for understanding protein function and for modeling reliable targets for in silico docking experiments. Because of the time-intensive nature of manual X-ray crystallographic refinement, automated refinement methods that thoroughly explore conformational space are essential for the systematic construction of structurally variable models. Using five proteins spanning resolutions of 1.0–2.8 Å, it is demonstrated how torsion-angle sampling of backbone and side-chain libraries with filtering against both the chemical energy, using a modern effective potential, and the electron density, coupled with minimization of a reciprocal-space X-ray target function, can generate multiple structurally variable models which fit the X-ray data well. Torsion-angle sampling as implemented in the Protein Local Optimization Program (PLOP) has been used in this work. Models with the lowest Rfree values are obtained when electrostatic and implicit solvation terms are included in the effective potential. HIV-1 protease, calmodulin and SUMO-conjugating enzyme illustrate how variability in the ensemble of structures captures structural variability that is observed across multiple crystal structures and is linked to functional flexibility at hinge regions and binding interfaces. An ensemble-refinement procedure is proposed to differentiate between variability that is a consequence of physical conformational heterogeneity and that which reflects uncertainty in the atomic coordinates.

  18. SDSL-ESR-based protein structure characterization.

    Science.gov (United States)

    Strancar, Janez; Kavalenka, Aleh; Urbancic, Iztok; Ljubetic, Ajasja; Hemminga, Marcus A

    2010-03-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be seen in the number of protein structures published in the Protein Data Bank. This is especially the case for less-ordered, more hydrophobic and more flexible protein systems. The lack of efficient methods for structure determination calls for urgent development of a new class of biophysical techniques. This work attempts to address this problem with a novel combination of site-directed spin labelling electron spin resonance spectroscopy (SDSL-ESR) and protein structure modelling, which is coupled by restriction of the conformational spaces of the amino acid side chains. Comparison of the application to four different protein systems enables us to generalize the new method and to establish a general procedure for determination of protein structure.

  19. Structure of human Rad51 protein filament from molecular modeling and site-specific linear dichroism spectroscopy

    KAUST Repository

    Reymer, A.

    2009-07-08

    To get mechanistic insight into the DNA strand-exchange reaction of homologous recombination, we solved a filament structure of a human Rad51 protein, combining molecular modeling with experimental data. We build our structure on reported structures for central and N-terminal parts of pure (uncomplexed) Rad51 protein with the aid of linear dichroism spectroscopy, providing angular orientations of substituted tyrosine residues of Rad51-dsDNA filaments in solution. The structure, validated by comparison with an electron microscopy density map and results from mutation analysis, is proposed to represent an active solution structure of the nucleo-protein complex. An inhomogeneously stretched double-stranded DNA fitted into the filament emphasizes the strategic positioning of 2 putative DNA-binding loops in a way that allows us to speculate about their possibly distinct roles in nucleo-protein filament assembly and DNA strand-exchange reaction. The model suggests that the extension of a single-stranded DNA molecule upon binding of Rad51 is ensured by intercalation of Tyr-232 of the L1 loop, which might act as a docking tool, aligning protein monomers along the DNA strand upon filament assembly. Arg-235, also sitting on L1, is in the right position to make electrostatic contact with the phosphate backbone of the other DNA strand. The L2 loop position and its more ordered compact conformation lead us to propose that this loop has another role, as a binding site for an incoming double-stranded DNA. Our filament structure and spectroscopic approach open the possibility of analyzing details along the multistep path of the strand-exchange reaction.

  20. Phylogenetic analysis and protein structure modelling identifies distinct Ca(2+)/Cation antiporters and conservation of gene family structure within Arabidopsis and rice species.

    Science.gov (United States)

    Pittman, Jon K; Hirschi, Kendal D

    2016-12-01

    The Ca(2+)/Cation Antiporter (CaCA) superfamily is an ancient and widespread family of ion-coupled cation transporters found in nearly all kingdoms of life. In animals, K(+)-dependent and K(+)-independent Na(+)/Ca(2+) exchangers (NCKX and NCX) are important CaCA members. Recently it was proposed that all rice and Arabidopsis CaCA proteins should be classified as NCX proteins. Here we performed phylogenetic analysis of CaCA genes and protein structure homology modelling to further characterise members of this transporter superfamily. Phylogenetic analysis of rice and Arabidopsis CaCAs in comparison with selected CaCA members from non-plant species demonstrated that these genes form clearly distinct families, with the H(+)/Cation exchanger (CAX) and cation/Ca(2+) exchanger (CCX) families dominant in higher plants but the NCKX and NCX families absent. NCX-related Mg(2+)/H(+) exchanger (MHX) and CAX-related Na(+)/Ca(2+) exchanger-like (NCL) proteins are instead present. Analysis of genomes of ten closely-related rice species and four Arabidopsis-related species found that CaCA gene family structures are highly conserved within related plants, apart from minor variation. Protein structures were modelled for OsCAX1a and OsMHX1. Despite exhibiting broad structural conservation, there are clear structural differences observed between the different CaCA types. Members of the CaCA superfamily form clearly distinct families with different phylogenetic, structural and functional characteristics, and therefore should not be simply classified as NCX proteins, which should remain as a separate gene family.

  1. The interface of protein structure, protein biophysics, and molecular evolution

    Science.gov (United States)

    Liberles, David A; Teichmann, Sarah A; Bahar, Ivet; Bastolla, Ugo; Bloom, Jesse; Bornberg-Bauer, Erich; Colwell, Lucy J; de Koning, A P Jason; Dokholyan, Nikolay V; Echave, Julian; Elofsson, Arne; Gerloff, Dietlind L; Goldstein, Richard A; Grahnen, Johan A; Holder, Mark T; Lakner, Clemens; Lartillot, Nicholas; Lovell, Simon C; Naylor, Gavin; Perica, Tina; Pollock, David D; Pupko, Tal; Regan, Lynne; Roger, Andrew; Rubinstein, Nimrod; Shakhnovich, Eugene; Sjölander, Kimmen; Sunyaev, Shamil; Teufel, Ashley I; Thorne, Jeffrey L; Thornton, Joseph W; Weinreich, Daniel M; Whelan, Simon

    2012-01-01

    The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. PMID:22528593

  2. Target specific proteochemometric model development for BACE1 - protein flexibility and structural water are critical in virtual screening.

    Science.gov (United States)

    Manoharan, Prabu; Chennoju, Kiranmai; Ghoshal, Nanda

    2015-07-01

    BACE1 is an attractive target in Alzheimer's disease (AD) treatment. A rational drug design effort for the inhibition of BACE1 is actively pursued by researchers in both academic and pharmaceutical industries. This continued effort led to the steady accumulation of BACE1 crystal structures, co-complexed with different classes of inhibitors. This wealth of information is used in this study to develop target specific proteochemometric models and these models are exploited for predicting the prospective BACE1 inhibitors. The models developed in this study have performed excellently in predicting the computationally generated poses, separately obtained from single and ensemble docking approaches. The simple protein-ligand contact (SPLC) model outperforms other sophisticated high end models, in virtual screening performance, developed during this study. In an attempt to account for BACE1 protein active site flexibility information in predictive models, we included the change in the area of solvent accessible surface and the change in the volume of solvent accessible surface in our models. The ensemble and single receptor docking results obtained from this study indicate that the structural water mediated interactions improve the virtual screening results. Also, these waters are essential for recapitulating bioactive conformation during docking study. The proteochemometric models developed in this study can be used for the prediction of BACE1 inhibitors, during the early stage of AD drug discovery.

  3. Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment.

    Science.gov (United States)

    Day, Ryan; Joo, Hyun; Chavan, Archana C; Lennox, Kristin P; Chen, Y Ann; Dahl, David B; Vannucci, Marina; Tsai, Jerry W

    2013-02-01

    As an alternative to the common template based protein structure prediction methods based on main-chain position, a novel side-chain centric approach has been developed. Together with a Bayesian loop modeling procedure and a combination scoring function, the Stone Soup algorithm was applied to the CASP9 set of template based modeling targets. Although the method did not generate perturbations to the template structures as large as necessary, the analysis of the results gives unique insights into the differences in packing between the target structures and their templates. Considerable variation in packing is found between target and template structures even when the structures are close, and this variation is found to be due to 2- and 3-body packing interactions. Outside the inherent restrictions in packing representation of the PDB, the first steps in correctly defining those regions of variable packing have been mapped primarily to local interactions, as the packing at the secondary and tertiary structure are largely conserved. Of the scoring functions used, a loop scoring function based on water structure exhibited some promise for discrimination. These results present a clear structural path for further development of a side-chain centered approach to template based modeling. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Modeling structure of G protein-coupled receptors in human genome

    KAUST Repository

    Zhang, Yang

    2016-01-01

    G protein-coupled receptors (or GPCRs) are integral transmembrane proteins responsible for various cellular signal transductions. Human GPCR proteins are encoded by 5% of human genes but account for the targets of 40% of the FDA approved drugs. Due

  5. Soliton concepts and protein structure

    Science.gov (United States)

    Krokhotin, Andrei; Niemi, Antti J.; Peng, Xubiao

    2012-03-01

    Structural classification shows that the number of different protein folds is surprisingly small. It also appears that proteins are built in a modular fashion from a relatively small number of components. Here we propose that the modular building blocks are made of the dark soliton solution of a generalized discrete nonlinear Schrödinger equation. We find that practically all protein loops can be obtained simply by scaling the size and by joining together a number of copies of the soliton, one after another. The soliton has only two loop-specific parameters, and we compute their statistical distribution in the Protein Data Bank (PDB). We explicitly construct a collection of 200 sets of parameters, each determining a soliton profile that describes a different short loop. The ensuing profiles cover practically all those proteins in PDB that have a resolution which is better than 2.0 Å, with a precision such that the average root-mean-square distance between the loop and its soliton is less than the experimental B-factor fluctuation distance. We also present two examples that describe how the loop library can be employed both to model and to analyze folded proteins.

  6. Protein Structure Refinement by Optimization

    DEFF Research Database (Denmark)

    Carlsen, Martin

    on whether the three-dimensional structure of a homologous sequence is known. Whether or not a protein model can be used for industrial purposes depends on the quality of the predicted structure. A model can be used to design a drug when the quality is high. The overall goal of this project is to assess...... that correlates maximally to a native-decoy distance. The main contribution of this thesis is methods developed for analyzing the performance of metrically trained knowledge-based potentials and for optimizing their performance while making them less dependent on the decoy set used to define them. We focus...... being at least a local minimum of the potential. To address how far the current functional form of the potential is from an ideal potential we present two methods for finding the optimal metrically trained potential that simultaneously has a number of native structures as a local minimum. Our results...

  7. Mapping monomeric threading to protein-protein structure prediction.

    Science.gov (United States)

    Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

    2013-03-25

    The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
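
    The TM score threshold mentioned above refers to a length-normalized structural similarity measure. The Python sketch below evaluates the standard single-chain TM-score expression for one fixed superposition, given the distances between aligned residue pairs. The function names tm_d0 and tm_score are hypothetical, the 0.5 Å floor on the distance scale is an assumption for short chains, and the search over superpositions that a full implementation performs is omitted. It is not SPRING's own scoring code.

```python
def tm_d0(target_length):
    """Distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8, in Å.

    Assumption: d0 is floored at 0.5 Å so that short chains, where the
    expression is undefined or too small, still give a usable value.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    return max(d0, 0.5)


def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: distances (Å) between aligned residue pairs.
    target_length: number of residues in the target, used for normalization.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect alignment covering half of a 100-residue target scores 0.5.
print(tm_score([0.0] * 50, 100))
```

    Because the sum is divided by the target length rather than the number of aligned residues, unaligned regions lower the score, which is why a value above 0.5 is commonly read as indicating the same overall fold.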

  8. NAPS: Network Analysis of Protein Structures

    Science.gov (United States)

    Chakrabarty, Broto; Parekh, Nita

    2016-01-01

    Traditionally, protein structures have been analysed by the secondary structure architecture and fold arrangement. An alternative approach that has shown promise is modelling proteins as a network of non-covalent interactions between amino acid residues. The network representation of proteins provides a systems approach to topological analysis of complex three-dimensional structures irrespective of secondary structure and fold type and provides insights into structure-function relationship. We have developed a web server for network based analysis of protein structures, NAPS, that facilitates quantitative and qualitative (visual) analysis of residue–residue interactions in: single chains, protein complex, modelled protein structures and trajectories (e.g. from molecular dynamics simulations). The user can specify atom type for network construction, distance range (in Å) and minimal amino acid separation along the sequence. NAPS lets users select node(s) and their neighbourhood based on centrality measures, physicochemical properties of amino acids or cluster of well-connected residues (k-cliques) for further analysis. Visual analysis of interacting domains and protein chains, and shortest path lengths between pairs of residues are additional features that aid in functional analysis. NAPS supports various analyses and visualization views for identifying functional residues, and provides insight into mechanisms of protein folding, domain-domain and protein–protein interactions for understanding communication within and between proteins. URL: http://bioinf.iiit.ac.in/NAPS/. PMID:27151201
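
    The network construction described above (one representative atom per residue, a distance range, and a minimal sequence separation) can be sketched in a few lines of Python. This is an illustrative sketch only, not the NAPS implementation: the function names, the 7 Å default upper bound and the coordinates are assumptions.

```python
import math


def build_contact_network(coords, lower=0.0, upper=7.0, min_separation=1):
    """Build a residue contact network as an adjacency dictionary.

    coords: one (x, y, z) point per residue, e.g. the C-alpha atom.
    lower, upper: distance range in Å (the 7.0 default is an assumption).
    min_separation: minimal separation along the sequence for an edge.
    """
    n = len(coords)
    step = max(1, min_separation)  # never connect a residue to itself
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + step, n):
            if lower <= math.dist(coords[i], coords[j]) <= upper:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other residues that each residue is in contact with."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


# Four made-up residues on a straight line, 3.8 Å apart.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
network = build_contact_network(chain)
print(network)
print(degree_centrality(network))
```

    Raising min_separation removes contacts between sequence neighbours, which leaves only the longer-range contacts that reflect the fold rather than the chain connectivity.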

  9. Discovery of Novel Inhibitors for Nek6 Protein through Homology Model Assisted Structure Based Virtual Screening and Molecular Docking Approaches

    Directory of Open Access Journals (Sweden)

    P. Srinivasan

    2014-01-01

    Full Text Available Nek6 is a member of the NIMA (never in mitosis, gene A)-related serine/threonine kinase family that plays an important role in the initiation of mitotic cell cycle progression. This work is an attempt to emphasize the structural and functional relationship of Nek6 protein based on homology modeling and binding pocket analysis. The three-dimensional structure of Nek6 was constructed by molecular modeling studies and the best model was further assessed by PROCHECK, ProSA, and ERRAT plot in order to analyze the quality and consistency of the generated model. The overall quality of the computed model showed 87.4% of amino acid residues in the favored region. A 3 ns molecular dynamics simulation confirmed that the structure was reliable and stable. Two lead compounds (Binding database ID: 15666, 18602) were retrieved through structure-based virtual screening and induced fit docking approaches as novel Nek6 inhibitors. Hence, we concluded that the potential compounds may act as new leads for the design of Nek6 inhibitors.

  10. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-01-01

    Abstract Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel

  11. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-03-23

    Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel resistance

  12. Biochemical characterization and structural modeling of human cathepsin E variant 2 in comparison to the wild-type protein

    Science.gov (United States)

    Puizdar, Vida; Zajc, Tajana; Žerovnik, Eva; Renko, Miha; Pieper, Ursula; Eswar, Narayanan; Šali, Andrej; Dolenc, Iztok; Turk, Vito

    2014-01-01

    Cathepsin E splice variant 2 appears in a number of gastric carcinomas. Here, we report detecting this variant in HeLa cells using polyclonal antibodies and biotinylated inhibitor pepstatin A. An overexpression of GFP fusion proteins of cathepsin E and its splice variant within HEK-293T cells was performed to show their localization. Their distribution under a fluorescence microscope showed that they are colocalized. We also expressed variant 1 and variant 2 of cathepsin E, with propeptide and without it, in Escherichia coli. After refolding from the inclusion bodies, the enzymatic activity and circular dichroism spectra of the splice variant 2 were compared to those of the wild-type mature active cathepsin E. While full-length cathepsin E variant 1 is activated at acid pH, the splice variant remains inactive. In contrast to the active cathepsin E, the splice variant 2 predominantly assumes β-sheet structure, prone to oligomerization, at least under in vitro conditions, as shown by Atomic Force Microscopy as shallow disk-like particles. A comparative structure model of splice variant 2 was computed based on its alignment to the known structure of cathepsin E intermediate (Protein Data Bank code 1TZS), and used to rationalize its conformational properties and loss of activity. PMID:22718633

  13. Combining structural modeling with ensemble machine learning to accurately predict protein fold stability and binding affinity effects upon mutation.

    Directory of Open Access Journals (Sweden)

    Niklas Berliner

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases.

  14. Protein Molecular Structures, Protein SubFractions, and Protein Availability Affected by Heat Processing: A Review

    International Nuclear Information System (INIS)

    Yu, P.

    2007-01-01

    The utilization and availability of protein depended on the types of protein and their specific susceptibility to enzymatic hydrolysis (inhibitory activities) in the gastrointestinal tract and was highly associated with protein molecular structures. Studying internal protein structure and protein subfraction profiles led to an understanding of the components that make up a whole protein. An understanding of the molecular structure of the whole protein was often vital to understanding its digestive behavior and nutritive value in animals. In this review, recently obtained information on protein molecular structural effects of heat processing was reviewed, in relation to protein characteristics affecting digestive behavior and nutrient utilization and availability. The emphasis of this review was on (1) using the newly advanced synchrotron technology (S-FTIR) as a novel approach to reveal protein molecular chemistry affected by heat processing within intact plant tissues; (2) revealing the effects of heat processing on the profile changes of protein subfractions associated with digestive behaviors and kinetics manipulated by heat processing; (3) prediction of the changes of protein availability and supply after heat processing, using the advanced DVE/OEB and NRC-2001 models, and (4) obtaining information on optimal processing conditions of protein as intestinal protein source to achieve target values for potential high net absorbable protein in the small intestine. The information described in this article may give better insight into the mechanisms involved and the intrinsic protein molecular structural changes occurring upon processing.

  15. Oligomeric protein structure networks: insights into protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Brinda KV

    2005-12-01

    Background Protein-protein association is essential for a variety of cellular processes and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes, with edges based on the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues) with special emphasis on protein interfaces. Results A variety of interactions, such as hydrogen bonds, salt bridges, and aromatic and hydrophobic interactions, which occur at the interfaces are identified in a consolidated manner as amino acid clusters at the interface in this study. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated based on the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hotspots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb). A few predictions of interface hot
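
    The notion of a hub in such a structure graph can be sketched in a few lines. This is an illustrative assumption rather than the authors' procedure: the residue labels, the interaction strengths and both thresholds below are hypothetical, and the abstract does not state how strength is computed. Edges weaker than a chosen strength are discarded, and residues that keep at least a chosen number of contacts are reported as hubs.

```python
from collections import defaultdict


def find_hubs(weighted_edges, min_strength, min_degree):
    """Return residues with at least min_degree contacts of sufficient strength.

    weighted_edges: iterable of (residue_a, residue_b, strength) tuples.
    min_strength and min_degree are user-chosen, hypothetical thresholds.
    """
    degree = defaultdict(int)
    for res_a, res_b, strength in weighted_edges:
        if strength >= min_strength:
            degree[res_a] += 1
            degree[res_b] += 1
    return sorted(res for res, count in degree.items() if count >= min_degree)


# Hypothetical edges, labelled "chain:residue"; strengths are made-up numbers.
edges = [
    ("A:R10", "B:E45", 6.0),
    ("A:R10", "B:D48", 5.0),
    ("A:R10", "A:L12", 2.0),  # too weak, dropped by the threshold below
    ("A:R10", "A:Y30", 4.5),
    ("A:R10", "B:F50", 4.2),
    ("B:E45", "B:K60", 1.0),
]
print(find_hubs(edges, min_strength=4.0, min_degree=4))
```

    Raising the strength threshold thins the graph, so the set of hubs depends on both cutoffs; any comparison between interface and non-interface hubs should use the same pair of values throughout.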

  16. Amino acid code of protein secondary structure.

    Science.gov (United States)

    Shestopalov, B V

    2003-01-01

    The calculation of protein three-dimensional structure from the amino acid sequence is a fundamental problem to be solved. This paper presents principles of the code theory of protein secondary structure, and their consequence--the amino acid code of protein secondary structure. The doublet code model of protein secondary structure, developed earlier by the author (Shestopalov, 1990), is part of this theory. The bases of the theory are: 1) the name secondary structure is assigned to the conformation, stabilized only by the nearest (intraresidual) and middle-range (at a distance no more than that between residues i and i + 5) interactions; 2) the secondary structure consists of regular (alpha-helical and beta-structural) and irregular (coil) segments; 3) the alpha-helices, beta-strands and coil segments are encoded, respectively, by residue pairs (i, i + 4), (i, i + 2), (i, i + 1), according to the numbers of residues per period, 3.6, 2, 1; 4) all such pairs in the amino acid sequence are codons for elementary structural elements, or structurons; 5) the codons are divided into 21 types depending on their strength, i.e. their encoding capability; 6) overlappings of structurons of one and the same structure generate the longer segments of this structure; 7) overlapping of structurons of different structures is forbidden, and therefore selection of codons is required, the codon selection is hierarchic; 8) the code theory of protein secondary structure generates six variants of the amino acid code of protein secondary structure. There are two possible kinds of model construction based on the theory: the physical one using physical properties of amino acid residues, and the statistical one using results of statistical analysis of a great body of structural data. Some evident consequences of the theory are: a) the theory can be used for calculating the secondary structure from the amino acid sequence as a partial solution of the problem of calculation of protein three
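
    Point 3 of this abstract can be made concrete with a short sketch that lists the candidate residue pairs for each structure type, reading the coil pair as (i, i + 1). Only the enumeration step is shown; the function name and the example sequence are assumptions, and the codon strengths and hierarchic selection rules of the theory are not given in the abstract, so they are not modelled here.

```python
# Sequence offsets for the residue pairs named in the abstract:
# helix (i, i + 4), strand (i, i + 2), coil (i, i + 1).
PAIR_OFFSETS = {"helix": 4, "strand": 2, "coil": 1}


def candidate_codons(sequence):
    """List every (position, residue_i, residue_i_plus_offset) pair per type.

    Positions are 0-based indices into the sequence string.
    """
    codons = {}
    for structure, offset in PAIR_OFFSETS.items():
        codons[structure] = [
            (i, sequence[i], sequence[i + offset])
            for i in range(len(sequence) - offset)
        ]
    return codons


# Arbitrary six-residue example sequence (one-letter amino acid codes).
for structure, pairs in candidate_codons("ACDEFG").items():
    print(structure, pairs)
```

    For a sequence of length n this yields n - 4 helix pairs, n - 2 strand pairs and n - 1 coil pairs, which is the pool from which the theory's codon selection would then choose.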

  17. Dynameomics: data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction.

    Science.gov (United States)

    Rysavy, Steven J; Beck, David A C; Daggett, Valerie

    2014-11-01

    Protein function is intimately linked to protein structure and dynamics yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25-75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. © 2014 The Protein Society.
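
    The evaluation metric named in this abstract, the main chain root-mean-square deviation, can be sketched as follows. This is a simplified illustration and not the authors' pipeline: it assumes that the candidate fragment and the reference loop are already in the same coordinate frame (no superposition is performed), and the fragment names and coordinates are made up.

```python
import math


def rmsd(candidate, reference):
    """Root-mean-square deviation between two equally long atom lists.

    Each list holds (x, y, z) tuples for matching main-chain atoms. No
    superposition is done, so both must share one coordinate frame.
    """
    if not candidate or len(candidate) != len(reference):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(c, r) ** 2 for c, r in zip(candidate, reference))
    return math.sqrt(squared / len(candidate))


def best_fragment(fragments, reference):
    """Return the name of the candidate fragment with the lowest RMSD."""
    return min(fragments, key=lambda name: rmsd(fragments[name], reference))


# Hypothetical two-atom reference and two candidate fragments.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
fragments = {
    "fragment_a": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 angstrom
    "fragment_b": [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)],  # shifted by 0.5 angstrom
}
print(best_fragment(fragments, reference))

# Benchmark size stated in the abstract: 30 loops for each length from 4 to 20.
print(len(range(4, 21)) * 30)
```

    Loop lengths 4 to 20 give 17 distinct lengths, and 17 × 30 = 510, which matches the size of the loop set quoted above.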

  18. Structure of human Rad51 protein filament from molecular modeling and site-specific linear dichroism spectroscopy

    KAUST Repository

    Reymer, A.; Frykholm, K.; Morimatsu, K.; Takahashi, M.; Norden, B.

    2009-01-01

    for central and N-terminal parts of pure (uncomplexed) Rad51 protein by aid of linear dichroism spectroscopy, providing angular orientations of substituted tyrosine residues of Rad51-dsDNA filaments in solution. The structure, validated by comparison

  19. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody; Coll, Francesc; McNerney, Ruth; Ascher, David; Pires, Douglas; Furnham, Nick; Coeck, Nele; Hill-Cawthorne, Grant; Nair, Mridul; Mallard, Kim; Ramsay, Andrew; Campino, Susana; Hibberd, Martin; Pain, Arnab; Rigouts, Leen; Clark, Taane

    2016-01-01

    Abstract Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure

  20. Structural zinc(II) thiolate complexes relevant to the modeling of Ada repair protein: Application toward alkylation reactions

    Directory of Open Access Journals (Sweden)

    Mohamed M. Ibrahim

    2014-11-01

    The TtZn(II)-bound perchlorate complex [TtZn–OClO3] 1 (Ttxyly = hydrotris[N-xylyl-thioimidazolyl]borate) was used for the synthesis of zinc(II)-bound ethanthiothiol complex [TtZn–SCH2CH3] 2 and its hydrogen-bond containing analog Tt–ZnSCH2CH2–NH(COOC(CH3)3) 3. These thiolate complexes were examined as structural models for the active sites of Ada repair protein toward methylation reactions. The Zn[S3O] coordination sphere in complex 1 includes three thione donors from the ligand Ttixyl and one oxygen donor from the perchlorate coligand in ideally tetrahedral arrangement around the zinc center. The average Zn(1)–S(thione) bond length is 2.344 Å, and the Zn(1)–O(1) bond length is 1.917 Å.

  1. Dynameomics: Data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction

    Science.gov (United States)

    Rysavy, Steven J; Beck, David AC; Daggett, Valerie

    2014-01-01

    Protein function is intimately linked to protein structure and dynamics yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. PMID:25142412

  2. Probing the Energetics of Dynactin Filament Assembly and the Binding of Cargo Adaptor Proteins Using Molecular Dynamics Simulation and Electrostatics-Based Structural Modeling.

    Science.gov (United States)

    Zheng, Wenjun

    2017-01-10

    Dynactin, a large multiprotein complex, binds with the cytoplasmic dynein-1 motor and various adaptor proteins to allow recruitment and transportation of cellular cargoes toward the minus end of microtubules. The structure of the dynactin complex is built around an actin-like minifilament with a defined length, which has been visualized in a high-resolution structure of the dynactin filament determined by cryo-electron microscopy (cryo-EM). To understand the energetic basis of dynactin filament assembly, we used molecular dynamics simulation to probe the intersubunit interactions among the actin-like proteins, various capping proteins, and four extended regions of the dynactin shoulder. Our simulations revealed stronger intersubunit interactions at the barbed and pointed ends of the filament and involving the extended regions (compared with the interactions within the filament), which may energetically drive filament termination by the capping proteins and recruitment of the actin-like proteins by the extended regions, two key features of the dynactin filament assembly process. Next, we modeled the unknown binding configuration among dynactin, dynein tails, and a number of coiled-coil adaptor proteins (including several Bicaudal-D and related proteins and three HOOK proteins), and predicted a key set of charged residues involved in their electrostatic interactions. Our modeling is consistent with previous findings of conserved regions, functional sites, and disease mutations in the adaptor proteins and will provide a structural framework for future functional and mutational studies of these adaptor proteins. In sum, this study yielded rich structural and energetic information about dynactin and associated adaptor proteins that cannot be directly obtained from the cryo-EM structures with limited resolutions.

  3. Early cytoskeletal protein modifications precede overt structural degeneration in the DBA/2J mouse model of glaucoma

    Directory of Open Access Journals (Sweden)

    Gina Nicole Wilson

    2016-11-01

    Axonal transport deficits precede structural loss in glaucoma and other neurodegenerations. Impairments in structural support, including modified cytoskeletal proteins and microtubule-destabilizing elements, could be initiating factors in glaucoma pathogenesis. We investigated the time course of changes in protein levels and post-translational modifications in the DBA/2J mouse model of glaucoma. Using anterograde tract tracing of the retinal projection, we assessed major cytoskeletal and transported elements as a function of transport integrity in different stages of pathological progression. Using capillary-based electrophoresis, single- and multiplex immunosorbent assays, and immunofluorescence, we quantified hyperphosphorylated neurofilament-heavy chain, phosphorylated tau (ptau), calpain-mediated spectrin breakdown product (145/150 kDa), β-tubulin, and amyloid-β42 proteins based on age and transport outcome to the superior colliculus (SC), the main retinal target in mice. Phosphorylated neurofilament-heavy chain (pNF-H) was elevated within the optic nerve (ON) and SC of 8-10 month-old DBA/2J mice, but was not evident in the retina until 12-15 months, suggesting that cytoskeletal modifications first appear in the distal retinal projection. As expected, higher pNF-H levels in the SC and retina were correlated with axonal transport deficits. Elevations in hyperphosphorylated tau (ptau) occurred in ON and SC between 3-8 months of age while retinal ptau accumulations occurred at 12-15 months in DBA/2J mice. In vitro co-immunoprecipitation experiments suggested increased affinity of ptau for the retrograde motor complex protein, dynactin. We observed a transport-related decrease of β-tubulin in ON of 10-12 month-old DBA/2J mice, suggesting a destabilized microtubule array. Elevations in calpain-mediated spectrin breakdown product were seen in ON and SC at the earliest age examined, well before axonal transport loss is evident. Finally, transport

  4. Molecular structure, dynamics and hydration studies of soybean storage proteins and model systems by nuclear magnetic resonance

    International Nuclear Information System (INIS)

    Kakalis, L.T.

    1989-01-01

    The potential of high-resolution 13C NMR for the characterization of soybean storage proteins was explored. The spectra of a commercial soy protein isolate as well as those of alkali-denatured 7S and 11S soybean globulins were well resolved and tentatively assigned. Relaxation measurements indicated fast motion for several side chains and the protein backbone. Protein fractions (11S and 7S) were also investigated at various states of molecular association. The large size of the multisubunit soybean storage proteins adversely affected both the resolution and the sensitivity of their 13C NMR spectra. A comparison of 17O and 2H NMR relaxation rates of water in solutions of lysozyme (a model system) as a function of concentration, pH and magnetic field suggested that only 17O directly monitors the hydration of lysozyme. Analysis of 17O NMR lysozyme hydration data in terms of a two-state, fast-exchange, anisotropic model resulted in hydration parameters which are consistent with the protein's physico-chemical properties. The same model was applied to the calculation of the amount and mobility of bound water in soy protein dispersions by means of 17O NMR relaxation measurements as a function of protein concentration. The protein concentration dependences of 1H transverse NMR relaxation measurements at various pH and ionic strength values were fitted by a virial expansion. The interpretation of the data was based on the effects of protein aggregation, salt binding and protein group ionization on the NMR measurements. In all cases, relaxation rates showed a linear dependence on protein activity.

  5. Protein structure database search and evolutionary classification.

    Science.gov (United States)

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  6. Protein interfacial structure and nanotoxicology

    International Nuclear Information System (INIS)

    White, John W.; Perriman, Adam W.; McGillivray, Duncan J.; Lin, J.-M.

    2009-01-01

    Here we briefly recapitulate the use of X-ray and neutron reflectometry at the air-water interface to find protein structures and thermodynamics at interfaces and test a possibility for understanding those interactions between nanoparticles and proteins which lead to nanoparticle toxicology through entry into living cells. Stable monomolecular protein films have been made at the air-water interface and, with a specially designed vessel, the substrate changed from that which the air-water interfacial film was deposited. This procedure allows interactions, both chemical and physical, between introduced species and the monomolecular film to be studied by reflectometry. The method is briefly illustrated here with some new results on protein-protein interaction between β-casein and κ-casein at the air-water interface using X-rays. These two proteins are an essential component of the structure of milk. In the experiments reported, specific and directional interactions appear to cause different interfacial structures if first, a β-casein monolayer is attacked by a κ-casein solution compared to the reverse. The additional contrast associated with neutrons will be an advantage here. We then show the first results of experiments on the interaction of a β-casein monolayer with a nanoparticle titanium oxide sol, foreshadowing the study of the nanoparticle 'corona' thought to be important for nanoparticle-cell wall penetration.

  7. Protein interfacial structure and nanotoxicology

    Energy Technology Data Exchange (ETDEWEB)

    White, John W. [Research School of Chemistry, Australian National University, Canberra (Australia)], E-mail: jww@rsc.anu.edu.au; Perriman, Adam W.; McGillivray, Duncan J.; Lin, J.-M. [Research School of Chemistry, Australian National University, Canberra (Australia)

    2009-02-21

    Here we briefly recapitulate the use of X-ray and neutron reflectometry at the air-water interface to find protein structures and thermodynamics at interfaces and test a possibility for understanding those interactions between nanoparticles and proteins which lead to nanoparticle toxicology through entry into living cells. Stable monomolecular protein films have been made at the air-water interface and, with a specially designed vessel, the substrate changed from that which the air-water interfacial film was deposited. This procedure allows interactions, both chemical and physical, between introduced species and the monomolecular film to be studied by reflectometry. The method is briefly illustrated here with some new results on protein-protein interaction between β-casein and κ-casein at the air-water interface using X-rays. These two proteins are an essential component of the structure of milk. In the experiments reported, specific and directional interactions appear to cause different interfacial structures if first, a β-casein monolayer is attacked by a κ-casein solution compared to the reverse. The additional contrast associated with neutrons will be an advantage here. We then show the first results of experiments on the interaction of a β-casein monolayer with a nanoparticle titanium oxide sol, foreshadowing the study of the nanoparticle 'corona' thought to be important for nanoparticle-cell wall penetration.

  8. Structure and non-structure of centrosomal proteins.

    Science.gov (United States)

    Dos Santos, Helena G; Abia, David; Janowski, Robert; Mortuza, Gulnahar; Bertero, Michela G; Boutin, Maïlys; Guarín, Nayibe; Méndez-Giraldez, Raúl; Nuñez, Alfonso; Pedrero, Juan G; Redondo, Pilar; Sanz, María; Speroni, Silvia; Teichert, Florian; Bruix, Marta; Carazo, José M; Gonzalez, Cayetano; Reina, José; Valpuesta, José M; Vernos, Isabelle; Zabala, Juan C; Montoya, Guillermo; Coll, Miquel; Bastolla, Ugo; Serrano, Luis

    2013-01-01

    Here we perform a large-scale study of the structural properties and the expression of proteins that constitute the human centrosome. Centrosomal proteins tend to be larger than generic human proteins (control set), since their genes contain on average more exons (20.3 versus 14.6). They are rich in predicted disordered regions, which cover 57% of their length, compared to 39% in the general human proteome. They also contain several regions that are dually predicted to be disordered and coiled-coil at the same time: 55 proteins (15%) contain disordered and coiled-coil fragments that cover more than 20% of their length. Helices prevail over strands in regions homologous to known structures (47% predicted helical residues against 17% predicted as strands), and even more in the whole centrosomal proteome (52% against 7%), while for control human proteins 34.5% of the residues are predicted as helical and 12.8% are predicted as strands. This difference is mainly due to residues predicted as disordered and helical (30% in centrosomal and 9.4% in control proteins), which may correspond to alpha-helix forming molecular recognition features (α-MoRFs). We performed expression assays for 120 full-length centrosomal proteins and 72 domain constructs that we have predicted to be globular. These full-length proteins are often insoluble: only 39 out of 120 expressed proteins (32%) and 19 out of 72 domains (26%) were soluble. We built or retrieved structural models for 277 out of 361 human proteins whose centrosomal localization has been experimentally verified. We could not find any suitable structural template with more than 20% sequence identity for 84 centrosomal proteins (23%), for which around 74% of the residues are predicted to be disordered or coiled-coils. The three-dimensional models that we built are available at http://ub.cbm.uam.es/centrosome/models/index.php.

  9. A Mesoscopic Model for Protein-Protein Interactions in Solution

    OpenAIRE

    Lund, Mikael; Jönsson, Bo

    2003-01-01

    Protein self-association may be detrimental in biological systems, but can be utilized in a controlled fashion for protein crystallization. It is hence of considerable interest to understand how factors like solution conditions prevent or promote aggregation. Here we present a computational model describing interactions between protein molecules in solution. The calculations are based on a molecular description capturing the detailed structure of the protein molecule using x-ray or nuclear ma...

  10. Structural entanglements in protein complexes

    Science.gov (United States)

    Zhao, Yani; Chwastyk, Mateusz; Cieplak, Marek

    2017-06-01

    We consider multi-chain protein native structures and propose a criterion that determines whether two chains in the system are entangled or not. The criterion is based on the behavior observed by pulling at both termini of each chain simultaneously in the two chains. We have identified about 900 entangled systems in the Protein Data Bank and provided a more detailed analysis for several of them. We argue that entanglement enhances the thermodynamic stability of the system but it may have other functions: burying the hydrophobic residues at the interface and increasing the DNA or RNA binding area. We also study the folding and stretching properties of the knotted dimeric proteins MJ0366, YibK, and bacteriophytochrome. These proteins have been studied theoretically in their monomeric versions so far. The dimers are seen to separate on stretching through the tensile mechanism and the characteristic unraveling force depends on the pulling direction.

  11. Influence of myelin proteins on the structure and dynamics of a model membrane with emphasis on the low temperature regime

    Energy Technology Data Exchange (ETDEWEB)

    Knoll, W. [University Joseph Fourier, UFR PhiTEM, Grenoble (France); Institut Laue–Langevin, Grenoble (France); Peters, J. [University Joseph Fourier, UFR PhiTEM, Grenoble (France); Institut Laue–Langevin, Grenoble (France); Institut de Biologie Structurale, Grenoble (France); Kursula, P. [University of Oulu, Oulu (Finland); CSSB–HZI, DESY, Hamburg (Germany); Gerelli, Y. [Institut Laue–Langevin, Grenoble (France); Natali, F., E-mail: natali@ill.fr [Institut Laue–Langevin, Grenoble (France); CNR–IOM–OGG, c/o Institut Laue–Langevin, Grenoble (France)

    2014-11-28

    Myelin is an insulating, multi-lamellar membrane structure wrapped around selected nerve axons. Increasing the speed of nerve impulses, it is crucial for the proper functioning of the vertebrate nervous system. Human neurodegenerative diseases, such as multiple sclerosis, are linked to damage to the myelin sheath through demyelination. Myelin exhibits a well defined subset of myelin-specific proteins, whose influence on membrane dynamics, i.e., myelin flexibility and stability, has not yet been explored in detail. In a first paper [W. Knoll, J. Peters, P. Kursula, Y. Gerelli, J. Ollivier, B. Demé, M. Telling, E. Kemner, and F. Natali, Soft Matter 10, 519 (2014)] we were able to spotlight, through neutron scattering experiments, the role of peripheral nervous system myelin proteins on membrane stability at room temperature. In particular, the myelin basic protein and peripheral myelin protein 2 were found to synergistically influence the membrane structure while keeping almost unchanged the membrane mobility. Further insight is provided by this work, in which we particularly address the investigation of the membrane flexibility in the low temperature regime. We evidence a different behavior suggesting that the proton dynamics is reduced by the addition of the myelin basic protein accompanied by negligible membrane structural changes. Moreover, we address the importance of correct sample preparation and characterization for the success of the experiment and for the reliability of the obtained results.

  12. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    International Nuclear Information System (INIS)

    Lehtivarjo, Juuso; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino; Peräkylä, Mikael

    2012-01-01

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6–17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.
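    The RMS error quoted in this abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that a prediction for each nucleus is obtained by averaging per-conformer predictions and is then compared with the observed shift. The function names and the toy numbers are hypothetical and are not taken from 4DSPOT.

```python
import math


def ensemble_average(per_conformer_shifts):
    """Average predicted shifts (ppm) over conformers, nucleus by nucleus."""
    n_conformers = len(per_conformer_shifts)
    n_nuclei = len(per_conformer_shifts[0])
    return [
        sum(conformer[i] for conformer in per_conformer_shifts) / n_conformers
        for i in range(n_nuclei)
    ]


def rms_error(predicted, observed):
    """Root-mean-square error (ppm) between predicted and observed shifts."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data (hypothetical): two conformers, three nuclei, values in ppm.
conformer_predictions = [
    [4.10, 4.50, 3.90],
    [4.30, 4.70, 4.10],
]
observed_shifts = [4.30, 4.60, 3.90]

averaged = ensemble_average(conformer_predictions)
error = rms_error(averaged, observed_shifts)
print(f"RMS error: {error:.3f} ppm")
```

    With real data the lists would hold one entry per assigned nucleus of a given type (for example all ¹Hα), so that one RMS value is reported per nucleus type, as in the abstract.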

  13. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    Energy Technology Data Exchange (ETDEWEB)

    Lehtivarjo, Juuso, E-mail: juuso.lehtivarjo@uef.fi; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino [University of Eastern Finland, School of Pharmacy (Finland); Peraekylae, Mikael [University of Eastern Finland, Institute of Biomedicine (Finland)

    2012-03-15

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6-17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.

  14. Protein Structure Recognition: From Eigenvector Analysis to Structural Threading Method

    Energy Technology Data Exchange (ETDEWEB)

    Cao, Haibo [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    In this work, they try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. They found a strong correlation between amino acid sequences and the corresponding native structure of the protein. Applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, they give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, they try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in the eigenvector study of the protein contact matrix. They believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, they discuss a threading method based on the correlation between amino acid sequences and the dominant eigenvector of the structure contact-matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, they list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.

  15. Protein structure recognition: From eigenvector analysis to structural threading method

    Science.gov (United States)

    Cao, Haibo

    In this work, we try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. We found a strong correlation between amino acid sequence and the corresponding native structure of the protein. Applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, we give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, we try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in our eigenvector study of the protein contact matrix. We believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, we discuss a threading method based on the correlation between amino acid sequence and the dominant eigenvector of the structure contact-matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, we list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
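    The central object of this dissertation, the dominant eigenvector of a protein contact matrix, can be sketched as follows. This is a minimal illustration under stated assumptions: contacts are defined between Cα atoms closer than a 7 Å cutoff (the cutoff, the function names and the toy coordinates are hypothetical and are not taken from the dissertation), and the eigenvector is obtained with NumPy's symmetric eigensolver.

```python
import numpy as np


def contact_matrix(coords, cutoff=7.0):
    """Binary residue-residue contact matrix from C-alpha coordinates.

    The cutoff in angstroms is an assumed value for illustration.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contacts = (dist < cutoff).astype(float)
    np.fill_diagonal(contacts, 0.0)  # a residue is not in contact with itself
    return contacts


def dominant_eigenvector(matrix):
    """Largest eigenvalue and its eigenvector of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)  # ascending order
    vector = eigenvectors[:, -1]
    if vector.sum() < 0:  # fix the arbitrary sign so components are positive
        vector = -vector
    return eigenvalues[-1], vector


# Toy chain (hypothetical): four residues on a line, 3.8 angstroms apart,
# so only sequence neighbours are in contact.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_contacts = contact_matrix(toy_coords)
top_value, top_vector = dominant_eigenvector(toy_contacts)
print(top_value, top_vector)
```

    In this toy chain the two inner residues have more contacts than the termini and therefore carry the larger eigenvector components, which is the kind of per-residue profile that a sequence-derived profile could be compared against.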

  16. Protein Structure Recognition: From Eigenvector Analysis to Structural Threading Method

    International Nuclear Information System (INIS)

    Haibo Cao

    2003-01-01

    In this work, they try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. They found a strong correlation between amino acid sequences and the corresponding native structure of the protein. Applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, they give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, they try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in the eigenvector study of the protein contact matrix. They believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, they discuss a threading method based on the correlation between amino acid sequences and the dominant eigenvector of the structure contact-matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, they list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.

  17. Hidden Structural Codes in Protein Intrinsic Disorder.

    Science.gov (United States)

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovers the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  18. A high-grain diet alters the omasal epithelial structure and expression of tight junction proteins in a goat model.

    Science.gov (United States)

    Liu, Jun-Hua; Xu, Ting-Ting; Zhu, Wei-Yun; Mao, Sheng-Yong

    2014-07-01

    The omasal epithelial barrier plays important roles in maintaining nutrient absorption and immune homeostasis in ruminants. However, little information is currently available about the changes in omasal epithelial barrier function at the structural and molecular levels during feeding of a high-grain (HG) diet. Ten male goats were randomly assigned to two groups, fed either a hay diet (0% grain; n = 5) or HG diet (65% grain; n = 5). Changes in omasal epithelial structure and expression of tight junction (TJ) proteins were determined via electron microscopy and Western blot analysis. After 7 weeks on each diet, omasal contents in the HG group showed significantly lower pH (P diet showed profound alterations in omasal epithelial structure and TJ proteins, corresponding to depression of thickness of total epithelia, stratum granulosum, and the sum of the stratum spinosum and stratum basale, marked epithelial cellular damage, erosion of intercellular junctions and down-regulation in expression of the TJ proteins, claudin-4 and occludin. The study demonstrates that feeding a HG diet is associated with omasal epithelial cellular damage and changes in expression of TJ proteins. These research findings provide an insight into the possible significance of diet on the omasal epithelial barrier in ruminants. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. Modeling Mercury in Proteins

    Energy Technology Data Exchange (ETDEWEB)

    Smith, Jeremy C [ORNL; Parks, Jerry M [ORNL

    2016-01-01

    Mercury (Hg) is a naturally occurring element that is released into the biosphere both by natural processes and anthropogenic activities. Although its reduced, elemental form Hg(0) is relatively non-toxic, other forms such as Hg2+ and, in particular, its methylated form, methylmercury, are toxic, with deleterious effects on both ecosystems and humans. Microorganisms play important roles in the transformation of mercury in the environment. Inorganic Hg2+ can be methylated by certain bacteria and archaea to form methylmercury. Conversely, bacteria also demethylate methylmercury and reduce Hg2+ to relatively inert Hg(0). Transformations and toxicity occur as a result of mercury interacting with various proteins. Clearly, then, understanding the toxic effects of mercury and its cycling in the environment requires characterization of these interactions. Computational approaches are ideally suited to studies of mercury in proteins because they can provide a detailed picture and circumvent issues associated with toxicity. Here we describe computational methods for investigating and characterizing how mercury binds to proteins, how inter- and intra-protein transfer of mercury is orchestrated in biological systems, and how chemical reactions in proteins transform the metal. We describe quantum chemical analyses of aqueous Hg(II), which reveal critical factors that determine ligand binding propensities. We then provide a perspective on how we used chemical reasoning to discover how microorganisms methylate mercury. We also highlight our combined computational and experimental studies of the proteins and enzymes of the mer operon, a suite of genes that confers mercury resistance in many bacteria. Lastly, we place work on mercury in proteins in the context of what is needed for a comprehensive multi-scale model of environmental mercury cycling.

  20. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-01-01

    operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching

  1. Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment.

    Science.gov (United States)

    Capriles, Priscila V S Z; Guimarães, Ana C R; Otto, Thomas D; Miranda, Antonio B; Dardenne, Laurent E; Degrave, Wim M

    2010-10-29

    Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets.

  2. Algorithms for Protein Structure Prediction

    DEFF Research Database (Denmark)

    Paluszewski, Martin

    -trace. Here we present three different approaches for reconstruction of C-traces from predictable measures. In our first approach [63, 62], the C-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE......) is more robust than standard Monte Carlo search. In the second approach for reconstruction of C-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower...... bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm...
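    Two of the predictable measures named in this abstract can be computed from a trace of residue coordinates as in the sketch below. It is a minimal illustration, assuming that CN stands for a contact number (the count of other residues within a fixed radius) and using the standard unweighted radius of gyration; the function names, the radius and the toy coordinates are hypothetical and do not reproduce the author's energy function.

```python
import numpy as np


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


def contact_numbers(coords, radius):
    """For each point, count the other points closer than `radius`."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist < radius).sum(axis=1) - 1  # subtract the point itself


# Toy trace (hypothetical): four points on the corners of a 2 x 2 square.
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
rg = radius_of_gyration(square)
cn = contact_numbers(square, radius=2.5)
print(rg, cn)
```

    A search over candidate traces could score each one by how far these computed values fall from their predicted targets, which is one simple way such measures can enter an energy function.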

  3. INTEGRATING GENETIC AND STRUCTURAL DATA ON HUMAN PROTEIN KINOME IN NETWORK-BASED MODELING OF KINASE SENSITIVITIES AND RESISTANCE TO TARGETED AND PERSONALIZED ANTICANCER DRUGS.

    Science.gov (United States)

    Verkhivker, Gennady M

    2016-01-01

    The human protein kinome presents one of the largest protein families that orchestrate functional processes in complex cellular networks, and when perturbed, can cause various cancers. The abundance and diversity of genetic, structural, and biochemical data underlies the complexity of mechanisms by which targeted and personalized drugs can combat mutational profiles in protein kinases. Coupled with the evolution of systems biology approaches, genomic and proteomic technologies are rapidly identifying and characterizing novel resistance mechanisms with the goal to inform rational design of personalized kinase drugs. Integration of experimental and computational approaches can help to bring these data into a unified conceptual framework and develop robust models for predicting the clinical drug resistance. In the current study, we employ a battery of synergistic computational approaches that integrate genetic, evolutionary, biochemical, and structural data to characterize the effect of cancer mutations in protein kinases. We provide a detailed structural classification and analysis of genetic signatures associated with oncogenic mutations. By integrating genetic and structural data, we employ network modeling to dissect mechanisms of kinase drug sensitivities to oncogenic EGFR mutations. Using biophysical simulations and analysis of protein structure networks, we show that conformational-specific drug binding of Lapatinib may elicit resistant mutations in the EGFR kinase that are linked with the ligand-mediated changes in the residue interaction networks and global network properties of key residues that are responsible for structural stability of specific functional states. A strong network dependency on high centrality residues in the conformation-specific Lapatinib-EGFR complex may explain vulnerability of drug binding to a broad spectrum of mutations and the emergence of drug resistance. Our study offers a systems-based perspective on drug design by unravelling

  4. Multiple functional roles of the accessory I-domain of bacteriophage P22 coat protein revealed by NMR structure and CryoEM modeling.

    Science.gov (United States)

    Rizzo, Alessandro A; Suhanovsky, Margaret M; Baker, Matthew L; Fraser, LaTasha C R; Jones, Lisa M; Rempel, Don L; Gross, Michael L; Chiu, Wah; Alexandrescu, Andrei T; Teschke, Carolyn M

    2014-06-10

    Some capsid proteins built on the ubiquitous HK97-fold have accessory domains imparting specific functions. Bacteriophage P22 coat protein has a unique insertion domain (I-domain). Two prior I-domain models from subnanometer cryoelectron microscopy (cryoEM) reconstructions differed substantially. Therefore, the I-domain's nuclear magnetic resonance structure was determined and also used to improve cryoEM models of coat protein. The I-domain has an antiparallel six-stranded β-barrel fold, not previously observed in HK97-fold accessory domains. The D-loop, which is dynamic in the isolated I-domain and intact monomeric coat protein, forms stabilizing salt bridges between adjacent capsomers in procapsids. The S-loop is important for capsid size determination, likely through intrasubunit interactions. Ten of 18 coat protein temperature-sensitive-folding substitutions are in the I-domain, indicating its importance in folding and stability. Several are found on a positively charged face of the β-barrel that anchors the I-domain to a negatively charged surface of the coat protein HK97-core. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Incorporating Modeling and Simulations in Undergraduate Biophysical Chemistry Course to Promote Understanding of Structure-Dynamics-Function Relationships in Proteins

    Science.gov (United States)

    Hati, Sanchita; Bhattacharyya, Sudeep

    2016-01-01

    A project-based biophysical chemistry laboratory course, which is offered to the biochemistry and molecular biology majors in their senior year, is described. In this course, the classroom study of the structure-function of biomolecules is integrated with the discovery-guided laboratory study of these molecules using computer modeling and…

  6. Modularity in protein structures: study on all-alpha proteins.

    Science.gov (United States)

    Khan, Taushif; Ghosh, Indira

    2015-01-01

    Modularity is known as one of the most important features of a protein's robust and efficient design. The architecture and topology of proteins play a vital role by providing the robust scaffolds necessary to support an organism's growth and survival under constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method, using a protein's secondary structures and their arrangements (i.e. patterns) in order to investigate its structural space. Our results on all-alpha proteins show that the known structural space is highly populated with a limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. higher secondary structure content). From structural descriptor analysis, observed patterns are found to be within similar deviation; however, frequent patterns are found to occur distinctly in diverse functions, e.g. in enzymatic classes and reactions. In this study, we introduce a simple approach to explore protein structural space using combinatorial- and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, the analysis indicates that protein function seems to be the driving force that shapes the known structure space.

  7. Protein enriched pasta: structure and digestibility of its protein network.

    Science.gov (United States)

    Laleg, Karima; Barron, Cécile; Santé-Lhoutellier, Véronique; Walrand, Stéphane; Micard, Valérie

    2016-02-01

    Wheat (W) pasta was enriched in 6% gluten (G), 35% faba (F) or 5% egg (E) to increase its protein content (13% to 17%). The impact of the enrichment on the multiscale structure of the pasta and on in vitro protein digestibility was studied. Increasing the protein content (W- vs. G-pasta) strengthened pasta structure at molecular and macroscopic scales but reduced its protein digestibility by 3% by forming a higher covalently linked protein network. Greater changes in the macroscopic and molecular structure of the pasta were obtained by varying the nature of protein used for enrichment. Proteins in G- and E-pasta were highly covalently linked (28-32%) resulting in a strong pasta structure. Conversely, F-protein (98% SDS-soluble) altered the pasta structure by diluting gluten and formed a weak protein network (18% covalent link). As a result, protein digestibility in F-pasta was significantly higher (46%) than in E- (44%) and G-pasta (39%). The effect of low (55 °C, LT) vs. very high temperature (90 °C, VHT) drying on the protein network structure and digestibility was shown to cause greater molecular changes than pasta formulation. Whatever the pasta, a general strengthening of its structure, a 33% to 47% increase in covalently linked proteins and a higher β-sheet structure were observed. However, these structural differences were evened out after the pasta was cooked, resulting in identical protein digestibility in LT and VHT pasta. Even after VHT drying, F-pasta had the best amino acid profile with the highest protein digestibility, proof of its nutritional interest.

  8. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  9. Structure-based barcoding of proteins.

    Science.gov (United States)

    Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin

    2014-01-01

    A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure in a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.

  10. GPCR-I-TASSER: A Hybrid Approach to G Protein-Coupled Receptor Structure Modeling and the Application to the Human Genome.

    Science.gov (United States)

    Zhang, Jian; Yang, Jianyi; Jang, Richard; Zhang, Yang

    2015-08-04

    Experimental structure determination remains difficult for G protein-coupled receptors (GPCRs). We propose a new hybrid protocol to construct GPCR structure models that integrates experimental mutagenesis data with ab initio transmembrane (TM) helix assembly simulations. The method was tested on 24 known GPCRs where the ab initio TM-helix assembly procedure constructed the correct fold for 20 cases. When combined with weak homology and sparse mutagenesis restraints, the method generated correct folds for all the tested cases with an average Cα root-mean-square deviation 2.4 Å in the TM regions. The new hybrid protocol was applied to model all 1,026 GPCRs in the human genome, where 923 have a high confidence score and are expected to have correct folds; these contain many pharmaceutically important families with no previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin, and Neuropeptide Y receptors. The results demonstrate new progress on genome-wide structure modeling of TM proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Threading structural model of the manganese-stabilizing protein PsbO reveals presence of two possible beta-sandwich domains.

    Science.gov (United States)

    Pazos, F; Heredia, P; Valencia, A; de las Rivas, J

    2001-12-01

    The manganese-stabilizing protein (PsbO) is an essential component of photosystem II (PSII) and is present in all oxyphotosynthetic organisms. PsbO allows correct water splitting and oxygen evolution by stabilizing the reactions driven by the manganese cluster. Despite its important role, its structure and detailed functional mechanism are still unknown. In this article we propose a structural model based on fold recognition and molecular modeling. This model has additional support from a study of the distribution of characteristics of the PsbO sequence family, such as the distribution of conserved, apolar, tree-determinants, and correlated positions. Our threading results consistently showed PsbO as an all-beta (beta) protein, with two homologous beta domains of approximately 120 amino acids linked by a flexible Proline-Glycine-Glycine (PGG) motif. These features are compatible with a general elongated and flexible architecture, in which the two domains form a sandwich-type structure with Greek key topology. The first domain is predicted to include 8 to 9 beta-strands, the second domain 6 to 7 beta-strands. An Ig-like beta-sandwich structure was selected as a template to build the 3-D model. The second domain has, between the strands, long loops rich in Pro and Gly that are difficult to model. One of these long loops includes a highly conserved region (between P148 and P174) and a short alpha-helix (between E181 and N188). These regions are characteristic parts of PsbO and show that the second domain is not so similar to the template. Overall, the model was able to account for much of the experimental data reported by several authors, and it would allow the detection of key residues and regions that are proposed in this article as essential for the structure and function of PsbO. Copyright 2001 Wiley-Liss, Inc.

  12. Markov State Models Reveal a Two-Step Mechanism of miRNA Loading into the Human Argonaute Protein: Selective Binding followed by Structural Re-arrangement

    KAUST Repository

    Jiang, Hanlun

    2015-07-16

    Argonaute (Ago) proteins and microRNAs (miRNAs) are central components in RNA interference, which is a key cellular mechanism for sequence-specific gene silencing. Despite intensive studies, molecular mechanisms of how Ago recognizes miRNA remain largely elusive. In this study, we propose a two-step mechanism for this molecular recognition: selective binding followed by structural re-arrangement. Our model is based on the results of a combination of Markov State Models (MSMs), large-scale protein-RNA docking, and molecular dynamics (MD) simulations. Using MSMs, we identify an open state of apo human Ago-2 in fast equilibrium with partially open and closed states. Conformations in this open state are distinguished by their largely exposed binding grooves that can geometrically accommodate miRNA as indicated in our protein-RNA docking studies. miRNA may then selectively bind to these open conformations. Upon the initial binding, the complex may perform further structural re-arrangement as shown in our MD simulations and eventually reach the stable binary complex structure. Our results provide novel insights in Ago-miRNA recognition mechanisms and our methodology holds great potential to be widely applied in the studies of other important molecular recognition systems.
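
As a rough illustration of the Markov State Model idea mentioned in this abstract, the sketch below estimates a row-normalized transition matrix from a discretized trajectory by counting transitions at a fixed lag time. The function name `estimate_transition_matrix`, the state labels and the toy trajectory are assumptions made for illustration only; they are not taken from the study, which would also involve conformational clustering and model validation that are omitted here.

```python
from collections import Counter


def estimate_transition_matrix(traj, n_states, lag=1):
    """Row-normalized transition counts between discrete states at a given lag."""
    # Pair each frame with the frame `lag` steps later and count the transitions.
    counts = Counter(zip(traj[:-lag], traj[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # No outgoing transitions were observed for state i: keep a zero row.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix


# Hypothetical trajectory over three states: 0 = open, 1 = partially open, 2 = closed.
toy_traj = [0, 0, 1, 0, 1, 2, 2, 1, 0, 0]
T = estimate_transition_matrix(toy_traj, n_states=3)
for row in T:
    print(row)
```

Each row of `T` gives the estimated probabilities of moving from one state to every other state after one lag interval. A real analysis would typically use far longer trajectories and check how the estimates change with the lag.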

  13. Markov State Models Reveal a Two-Step Mechanism of miRNA Loading into the Human Argonaute Protein: Selective Binding followed by Structural Re-arrangement

    KAUST Repository

    Jiang, Hanlun; Sheong, Fu Kit; Zhu, Lizhe; Gao, Xin; Bernauer, Julie; Huang, Xuhui

    2015-01-01

    Argonaute (Ago) proteins and microRNAs (miRNAs) are central components in RNA interference, which is a key cellular mechanism for sequence-specific gene silencing. Despite intensive studies, molecular mechanisms of how Ago recognizes miRNA remain largely elusive. In this study, we propose a two-step mechanism for this molecular recognition: selective binding followed by structural re-arrangement. Our model is based on the results of a combination of Markov State Models (MSMs), large-scale protein-RNA docking, and molecular dynamics (MD) simulations. Using MSMs, we identify an open state of apo human Ago-2 in fast equilibrium with partially open and closed states. Conformations in this open state are distinguished by their largely exposed binding grooves that can geometrically accommodate miRNA as indicated in our protein-RNA docking studies. miRNA may then selectively bind to these open conformations. Upon the initial binding, the complex may perform further structural re-arrangement as shown in our MD simulations and eventually reach the stable binary complex structure. Our results provide novel insights in Ago-miRNA recognition mechanisms and our methodology holds great potential to be widely applied in the studies of other important molecular recognition systems.

  14. Bayesian comparison of protein structures using partial Procrustes distance.

    Science.gov (United States)

    Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi

    2017-09-26

    An important topic in bioinformatics is protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on the global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures, by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.
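
The partial Procrustes distance named in this abstract can be illustrated with a short sketch. It assumes two equally sized, already matched point sets and computes the smallest Frobenius distance achievable by translating and rotating one onto the other, without scaling. The function name and the coordinates are hypothetical, NumPy is assumed to be available, and the Bayesian alignment model itself is not reproduced.

```python
import numpy as np


def partial_procrustes_distance(x, y):
    """Minimum Frobenius distance between matched point sets over rotation and translation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Remove translation by centering both point sets on their centroids.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    u, s, vt = np.linalg.svd(xc.T @ yc)
    # Negate the smallest singular value when the best orthogonal map would be
    # a reflection, so that only proper rotations are considered.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    d2 = (xc ** 2).sum() + (yc ** 2).sum() - 2.0 * s.sum()
    return float(np.sqrt(max(d2, 0.0)))


# Hypothetical substructure of four atoms, and a rotated and shifted copy of it.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
rot_z_90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = pts @ rot_z_90.T + np.array([5.0, -1.0, 2.0])
print(partial_procrustes_distance(pts, moved))
```

Because `moved` is a rigid-body copy of `pts`, the printed distance should be zero up to floating-point error; substructures that differ in shape give a strictly positive value.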

  15. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
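
The MapReduce pattern described in this abstract can be sketched in a few lines of plain Python. This is an in-memory illustration of the map and reduce phases only: it does not use Hadoop, and the scoring function `toy_score`, the structure identifiers and the secondary-structure strings are placeholders invented for the example, not the alignment or refinement algorithms of the paper.

```python
from collections import defaultdict
from itertools import combinations


def toy_score(a, b):
    """Placeholder similarity: fraction of positions at which two strings agree."""
    return sum(p == q for p, q in zip(a, b)) / max(len(a), len(b))


def map_phase(pair, structures):
    """Emit one (key, value) record per direction for a pair of structure ids."""
    i, j = pair
    score = toy_score(structures[i], structures[j])
    return [(i, (j, score)), (j, (i, score))]


def reduce_phase(records):
    """Group records by key and keep the best-scoring partner for each structure."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return {key: max(values, key=lambda v: v[1]) for key, values in grouped.items()}


# Hypothetical secondary-structure strings standing in for real PDB entries.
structures = {"A": "HHHEEC", "B": "HHHEEE", "C": "CCCEEC"}
records = [rec
           for pair in combinations(sorted(structures), 2)
           for rec in map_phase(pair, structures)]
best = reduce_phase(records)
print(best)
```

Each call to `map_phase` is independent of the others, which is the property that lets a framework distribute the pairwise comparisons across many workers before the grouped reduction.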

  16. SDSL-ESR-based protein structure characterization

    NARCIS (Netherlands)

    Strancar, J.; Kavalenka, A.A.; Urbancic, I.; Ljubetic, A.; Hemminga, M.A.

    2010-01-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be

  17. Structural model of the hUbA1-UbcH10 quaternary complex: in silico and experimental analysis of the protein-protein interactions between E1, E2 and ubiquitin.

    Directory of Open Access Journals (Sweden)

    Stefania Correale

    UbcH10 is a component of the Ubiquitin Conjugation Enzymes (Ubc; E2) involved in the ubiquitination cascade controlling the cell cycle progression, whereby ubiquitin, activated by E1, is transferred through E2 to the target protein with the involvement of E3 enzymes. In this work we propose the first three dimensional model of the tetrameric complex formed by the human UbA1 (E1), two ubiquitin molecules and UbcH10 (E2), leading to the transthiolation reaction. The 3D model was built up by using an experimentally guided incremental docking strategy that combined homology modeling, protein-protein docking and refinement by means of molecular dynamics simulations. The structural features of the in silico model allowed us to identify the regions that mediate the recognition between the interacting proteins, revealing the active role of the ubiquitin crosslinked to E1 in the complex formation. Finally, the role of these regions involved in the E1-E2 binding was validated by designing short peptides that specifically interfere with the binding of UbcH10, thus supporting the reliability of the proposed model and representing valuable scaffolds for the design of peptidomimetic compounds that can bind selectively to Ubcs and inhibit the ubiquitylation process in pathological disorders.

  18. Overcoming barriers to membrane protein structure determination.

    Science.gov (United States)

    Bill, Roslyn M; Henderson, Peter J F; Iwata, So; Kunji, Edmund R S; Michel, Hartmut; Neutze, Richard; Newstead, Simon; Poolman, Bert; Tate, Christopher G; Vogel, Horst

    2011-04-01

    After decades of slow progress, the pace of research on membrane protein structures is beginning to quicken thanks to various improvements in technology, including protein engineering and microfocus X-ray diffraction. Here we review these developments and, where possible, highlight generic new approaches to solving membrane protein structures based on recent technological advances. Rational approaches to overcoming the bottlenecks in the field are urgently required as membrane proteins, which typically comprise ~30% of the proteomes of organisms, are dramatically under-represented in the structural database of the Protein Data Bank.

  19. Structural characterization of respiratory syncytial virus fusion inhibitor escape mutants: homology model of the F protein and a syncytium formation assay

    International Nuclear Information System (INIS)

    Morton, Craig J.; Cameron, Rachel; Lawrence, Lynne J.; Lin Bo; Lowe, Melinda; Luttick, Angela; Mason, Anthony; McKimm-Breschkin, Jenny; Parker, Michael W.; Ryan, Jane; Smout, Michael; Sullivan, Jayne; Tucker, Simon P.; Young, Paul R.

    2003-01-01

    Respiratory syncytial virus (RSV) is a ubiquitous human pathogen and the leading cause of lower respiratory tract infections in infants. Infection of cells and subsequent formation of syncytia occur through membrane fusion mediated by the RSV fusion protein (RSV-F). A novel in vitro assay of recombinant RSV-F function has been devised and used to characterize a number of escape mutants for three known inhibitors of RSV-F that have been isolated. Homology modeling of the RSV-F structure has been carried out on the basis of a chimera derived from the crystal structures of the RSV-F core and a fragment from the orthologous fusion protein from Newcastle disease virus (NDV). The structure correlates well with the appearance of RSV-F in electron micrographs, and the residues identified as contributing to specific binding sites for several monoclonal antibodies are arranged in appropriate solvent-accessible clusters. The positions of the characterized resistance mutants in the model structure identify two promising regions for the design of fusion inhibitors

  20. The PMDB Protein Model Database

    Science.gov (United States)

    Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

    2006-01-01

    The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873

  1. De novo protein structure determination using sparse NMR data

    International Nuclear Information System (INIS)

    Bowers, Peter M.; Strauss, Charlie E.M.; Baker, David

    2000-01-01

    We describe a method for generating moderate to high-resolution protein structures using limited NMR data combined with the ab initio protein structure prediction method Rosetta. Peptide fragments are selected from proteins of known structure based on sequence similarity and consistency with chemical shift and NOE data. Models are built from these fragments by minimizing an energy function that favors hydrophobic burial, strand pairing, and satisfaction of NOE constraints. Models generated using this procedure with ∼1 NOE constraint per residue are in some cases closer to the corresponding X-ray structures than the published NMR solution structures. The method requires only the sparse constraints available during initial stages of NMR structure determination, and thus holds promise for increasing the speed with which protein solution structures can be determined

  2. PSAIA – Protein Structure and Interaction Analyzer

    Directory of Open Access Journals (Sweden)

    Vlahoviček Kristian

    2008-04-01

    Background: PSAIA (Protein Structure and Interaction Analyzer) was developed to compute geometric parameters for large sets of protein structures in order to predict and investigate protein-protein interaction sites. Results: In addition to most relevant established algorithms, PSAIA offers a new method PIADA (Protein Interaction Atom Distance Algorithm) for the determination of residue interaction pairs. We found that PIADA produced more satisfactory results than comparable algorithms implemented in PSAIA. Particular advantages of PSAIA include its capacity to combine different methods to detect the locations and types of interactions between residues and its ability, without any further automation steps, to handle large numbers of protein structures and complexes. Generally, the integration of a variety of methods enables PSAIA to offer easier automation of analysis and greater reliability of results. PSAIA can be used either via a graphical user interface or from the command-line. Results are generated in either tabular or XML format. Conclusion: In a straightforward fashion and for large sets of protein structures, PSAIA enables the calculation of protein geometric parameters and the determination of location and type for protein-protein interaction sites. XML formatted output enables easy conversion of results to various formats suitable for statistical analysis. Results from smaller data sets demonstrated the influence of geometry on protein interaction sites. Comprehensive analysis of properties of large data sets led to new information useful in the prediction of protein-protein interaction sites.
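
To make the notion of residue interaction pairs concrete, the sketch below applies a plain distance cutoff between the atoms of two chains. This is a generic criterion chosen for illustration and is not the PIADA method, whose details are not given in the abstract. The function name, the 4.5 Å default cutoff, the residue labels and the coordinates are all assumptions.

```python
from math import dist


def interface_residue_pairs(chain_a, chain_b, cutoff=4.5):
    """Return residue pairs with at least one inter-chain atom pair closer than cutoff (in Å)."""
    pairs = []
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # A single atom pair under the cutoff is enough to report the residue pair.
            if any(dist(p, q) < cutoff for p in atoms_a for q in atoms_b):
                pairs.append((res_a, res_b))
    return pairs


# Hypothetical coordinates in Å: residue label -> list of atom positions.
chain_a = {"ALA10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
           "GLY11": [(10.0, 0.0, 0.0)]}
chain_b = {"SER55": [(4.0, 0.0, 0.0)],
           "LYS56": [(20.0, 0.0, 0.0)]}
contacts = interface_residue_pairs(chain_a, chain_b)
print(contacts)
```

The nested loops compare every atom pair, which is adequate for a small example; large structure sets would normally use a spatial index to avoid the quadratic cost.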

  3. Prediction of protein–protein interactions: unifying evolution and structure at protein interfaces

    International Nuclear Information System (INIS)

    Tuncbag, Nurcan; Gursoy, Attila; Keskin, Ozlem

    2011-01-01

    The vast majority of the chores in the living cell involve protein–protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein–protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations

  4. Solution NMR structure determination of proteins revisited

    International Nuclear Information System (INIS)

    Billeter, Martin; Wagner, Gerhard; Wuethrich, Kurt

    2008-01-01

    This 'Perspective' bears on the present state of protein structure determination by NMR in solution. The focus is on a comparison of the infrastructure available for NMR structure determination when compared to protein crystal structure determination by X-ray diffraction. The main conclusion emerges that the unique potential of NMR to generate high resolution data also on dynamics, interactions and conformational equilibria has contributed to a lack of standard procedures for structure determination which would be readily amenable to improved efficiency by automation. To spark renewed discussion on the topic of NMR structure determination of proteins, procedural steps with high potential for improvement are identified

  5. Extracting knowledge from protein structure geometry

    DEFF Research Database (Denmark)

    Røgen, Peter; Koehl, Patrice

    2013-01-01

    potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins...

  6. Simulation of Protein Structure, Dynamics and Function in Organic Media

    National Research Council Canada - National Science Library

    Daggett, Valerie

    1998-01-01

    The overall goal of our ONR-sponsored research is to pursue realistic molecular modeling studies pertinent to the related properties of protein stability, dynamics, structure, function, and folding in aqueous solution...

  7. PSPP: a protein structure prediction pipeline for computing clusters.

    Directory of Open Access Journals (Sweden)

    Michael S Lee

    2009-07-01

    Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform

  8. Molecular Characterization, Structural Modeling, and Evaluation of Antimicrobial Activity of Basrai Thaumatin-Like Protein against Fungal Infection

    Directory of Open Access Journals (Sweden)

    Nusrat Yasmin

    2017-01-01

    A thaumatin-like protein gene from Basrai banana was cloned and expressed in Escherichia coli. Amplified gene product was cloned into pTZ57R/T vector and subcloned into expression vector pET22b(+) and resulting pET22b-basrai TLP construct was introduced into E. coli BL21. Maximum protein expression was obtained at 0.7 mM IPTG concentration after 6 hours at 37°C. Western blot analysis showed the presence of approximately 20 kDa protein in induced cells. Basrai antifungal TLP was tried as pharmacological agent against fungal disease. Independently Basrai antifungal protein and amphotericin B exhibited their antifungal activity against A. fumigatus; however combined effect of both agents maximized activity against the pathogen. Docking studies were performed to evaluate the antimicrobial potential of TLP against A. fumigatus by probing binding pattern of antifungal protein with plasma membrane ergosterol of targeted fungal strain. Ice crystallization primarily damages frozen food items; however addition of antifreeze proteins limits the growth of ice crystal in frozen foods. The potential of Basrai TLP protein, as an antifreezing agent, in controlling the ice crystal formation in frozen yogurt was also studied. The scope of this study ranges from cost effective production of pharmaceutics to antifreezing and food preserving agent as well as other real life applications.

  9. Heterochiral Knottin Protein: Folding and Solution Structure.

    Science.gov (United States)

    Mong, Surin K; Cochran, Frank V; Yu, Hongtao; Graziano, Zachary; Lin, Yu-Shan; Cochran, Jennifer R; Pentelute, Bradley L

    2017-10-31

    Homochirality is a general feature of biological macromolecules, and Nature includes few examples of heterochiral proteins. Herein, we report on the design, chemical synthesis, and structural characterization of heterochiral proteins possessing loops of amino acids of chirality opposite to that of the rest of a protein scaffold. Using the protein Ecballium elaterium trypsin inhibitor II, we discover that selective β-alanine substitution favors the efficient folding of our heterochiral constructs. Solution nuclear magnetic resonance spectroscopy of one such heterochiral protein reveals a homogeneous global fold. Additionally, steered molecular dynamics simulations indicate that β-alanine reduces the free energy required to fold the protein. We also find these heterochiral proteins to be more resistant to proteolysis than homochiral l-proteins. This work informs the design of heterochiral protein architectures containing stretches of both d- and l-amino acids.

  10. Alpha complexes in protein structure prediction

    DEFF Research Database (Denmark)

    Winter, Pawel; Fonseca, Rasmus

    2015-01-01

    Reducing the computational effort and increasing the accuracy of potential energy functions is of utmost importance in modeling biological systems, for instance in protein structure prediction, docking or design. Evaluating interactions between nonbonded atoms is the bottleneck of such computations......-complexes from scratch for every configuration encountered during the search for the native structure would make this approach hopelessly slow. However, it is argued that kinetic α-complexes can be used to reduce the computational effort of determining the potential energy when "moving" from one configuration...... to a neighboring one. As a consequence, relatively expensive (initial) construction of an α-complex is expected to be compensated by subsequent fast kinetic updates during the search process. Computational results presented in this paper are limited. However, they suggest that the applicability of a...

  11. Structure of synaptophysin: a hexameric MARVEL-domain channel protein.

    Science.gov (United States)

    Arthur, Christopher P; Stowell, Michael H B

    2007-06-01

    Synaptophysin I (SypI) is an archetypal member of the MARVEL-domain family of integral membrane proteins and one of the first synaptic vesicle proteins to be identified and cloned. Almost all MARVEL-domain proteins are involved in membrane apposition and vesicle-trafficking events, but their precise role in these processes is unclear. We have purified mammalian SypI and determined its three-dimensional (3D) structure by using electron microscopy and single-particle 3D reconstruction. The hexameric structure resembles an open basket with a large pore and tenuous interactions within the cytosolic domain. The structure suggests a model for Synaptophysin's role in fusion and recycling that is regulated by known interactions with the SNARE machinery. This 3D structure of a MARVEL-domain protein provides a structural foundation for understanding the role of these important proteins in a variety of biological processes.

  12. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng; Gao, Xin

    2016-01-01

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated

  13. Automated protein structure calculation from NMR data

    International Nuclear Information System (INIS)

    Williamson, Mike P.; Craven, C. Jeremy

    2009-01-01

    Current software is almost at the stage to permit completely automatic structure determination of small proteins of <15 kDa, from NMR spectra to structure validation with minimal user interaction. This goal is welcome, as it makes structure calculation more objective and therefore more easily validated, without any loss in the quality of the structures generated. Moreover, it releases expert spectroscopists to carry out research that cannot be automated. It should not take much further effort to extend automation to ca 20 kDa. However, there are technological barriers to further automation, of which the biggest are identified as: routines for peak picking; adoption and sharing of a common framework for structure calculation, including the assembly of an automated and trusted package for structure validation; and sample preparation, particularly for larger proteins. These barriers should be the main target for development of methodology for protein structure determination, particularly by structural genomics consortia

  14. Structural anatomy of telomere OB proteins.

    Science.gov (United States)

    Horvath, Martin P

    2011-10-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA.

  15. Understanding Protein-Protein Interactions Using Local Structural Features

    DEFF Research Database (Denmark)

    Planas-Iglesias, Joan; Bonet, Jaume; García-García, Javier

    2013-01-01

    Protein-protein interactions (PPIs) play a relevant role among the different functions of a cell. Identifying the PPI network of a given organism (interactome) is useful to shed light on the key molecular mechanisms within a biological system. In this work, we show the role of structural features...... interacting and non-interacting protein pairs to classify the structural features that sustain the binding (or non-binding) behavior. Our study indicates that not only the interacting region but also the rest of the protein surface are important for the interaction fate. The interpretation...... to score the likelihood of the interaction between two proteins and to develop a method for the prediction of PPIs. We have tested our method on several sets with unbalanced ratios of interactions and non-interactions to simulate real conditions, obtaining accuracies higher than 25% in the most unfavorable...

  16. Modelling of proteins in membranes

    DEFF Research Database (Denmark)

    Sperotto, Maria Maddalena; May, S.; Baumgaertner, A.

    2006-01-01

    This review describes some recent theories and simulations of mesoscopic and microscopic models of lipid membranes with embedded or attached proteins. We summarize results supporting our understanding of phenomena for which the activities of proteins in membranes are expected to be significantly ...

  17. Coarse-grain modelling of protein-protein interactions

    NARCIS (Netherlands)

    Baaden, Marc; Marrink, Siewert J.

    2013-01-01

    Here, we review recent advances towards the modelling of protein-protein interactions (PPI) at the coarse-grained (CG) level, a technique that is now widely used to understand protein affinity, aggregation and self-assembly behaviour. PPI models of soluble proteins and membrane proteins are

  18. Structural symmetry and protein function.

    Science.gov (United States)

    Goodsell, D S; Olson, A J

    2000-01-01

    The majority of soluble and membrane-bound proteins in modern cells are symmetrical oligomeric complexes with two or more subunits. The evolutionary selection of symmetrical oligomeric complexes is driven by functional, genetic, and physicochemical needs. Large proteins are selected for specific morphological functions, such as formation of rings, containers, and filaments, and for cooperative functions, such as allosteric regulation and multivalent binding. Large proteins are also more stable against denaturation and have a reduced surface area exposed to solvent when compared with many individual, smaller proteins. Large proteins are constructed as oligomers for reasons of error control in synthesis, coding efficiency, and regulation of assembly. Symmetrical oligomers are favored because of stability and finite control of assembly. Several functions limit symmetry, such as interaction with DNA or membranes, and directional motion. Symmetry is broken or modified in many forms: quasisymmetry, in which identical subunits adopt similar but different conformations; pleomorphism, in which identical subunits form different complexes; pseudosymmetry, in which different molecules form approximately symmetrical complexes; and symmetry mismatch, in which oligomers of different symmetries interact along their respective symmetry axes. Asymmetry is also observed at several levels. Nearly all complexes show local asymmetry at the level of side chain conformation. Several complexes have reciprocating mechanisms in which the complex is asymmetric, but, over time, all subunits cycle through the same set of conformations. Global asymmetry is only rarely observed. Evolution of oligomeric complexes may favor the formation of dimers over complexes with higher cyclic symmetry, through a mechanism of prepositioned pairs of interacting residues. However, examples have been found for all of the crystallographic point groups, demonstrating that functional need can drive the evolution of

  19. Efficient protein structure search using indexing methods.

    Science.gov (United States)

    Kim, Sungchul; Sael, Lee; Yu, Hwanjo

    2013-01-01

    Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, so finding structurally similar proteins accurately and efficiently in a large set of proteins is crucial. A protein structure can be represented as a vector by the 3D-Zernike Descriptor (3DZD), which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the search. However, computing the similarity of two protein structures is still computationally expensive, so it is hard to process many simultaneous requests for structurally similar proteins efficiently. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of 3DZDs of protein structures. To retrieve top-k similar structures, top-10 × k similar structures are first found using the reduced index, and top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points within a distance θ of the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced by 69.6%, 77%, 77.4% and 87.9% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. In θ-based nearest neighbor search, the searching time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
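
    The two-stage retrieval described in this abstract (find top-10 × k candidates on a reduced representation, then select the top-k among them) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the `prefix_len` parameter, the use of Euclidean distance, and the linear scan that stands in for a real iDistance or iKernel index. It is not the authors' implementation.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_filter_refine(query, descriptors, k, prefix_len=4, factor=10):
    """Return indices of the k nearest descriptors via a two-stage search.

    Stage 1 ranks every entry by distance over only the first `prefix_len`
    attributes (a plain scan stands in for a real index, by assumption) and
    keeps `factor * k` candidates. Stage 2 re-ranks those candidates with
    the full-length distance and keeps the best k.
    """
    q_prefix = query[:prefix_len]
    candidates = heapq.nsmallest(
        factor * k,
        range(len(descriptors)),
        key=lambda i: euclidean(q_prefix, descriptors[i][:prefix_len]),
    )
    return heapq.nsmallest(
        k, candidates, key=lambda i: euclidean(query, descriptors[i])
    )


if __name__ == "__main__":
    # Toy 5-dimensional "descriptors"; real 3DZD vectors are much longer.
    toy = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 2, 2, 2], [5, 5, 5, 5, 5]]
    print(top_k_filter_refine([2, 2, 2, 2, 2.1], toy, k=2))
```

    For Euclidean distance, the distance over a prefix of the attributes never exceeds the full distance, which is why a prefix can serve as a cheap filter. Keeping only `factor * k` candidates is still a heuristic, so the result may differ from an exact top-k search on larger data sets.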

  20. Protein structure: geometry, topology and classification

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, William R.; May, Alex C.W.; Brown, Nigel P.; Aszodi, Andras [Division of Mathematical Biology, National Institute for Medical Research, London (United Kingdom)]

    2001-04-01

    The structural principles of proteins are reviewed and analysed from a geometric perspective with a view to revealing the underlying regularities in their construction. Computer methods for the automatic comparison and classification of these structures are then reviewed with an analysis of the statistical significance of comparing different shapes. Following an analysis of the current state of the classification of proteins, more abstract geometric and topological representations are explored, including the occurrence of knotted topologies. The review concludes with a consideration of the origin of higher-level symmetries in protein structure. (author)

  1. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes.

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude

    2008-11-15

    Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures, defined using Voronoï tessellations, reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters for decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

  2. Simultaneous determination of protein structure and dynamics

    DEFF Research Database (Denmark)

    Lindorff-Larsen, Kresten; Best, Robert B.; DePristo, M. A.

    2005-01-01

    We present a protocol for the experimental determination of ensembles of protein conformations that represent simultaneously the native structure and its associated dynamics. The procedure combines the strengths of nuclear magnetic resonance spectroscopy, for obtaining experimental information at the atomic level about the structural and dynamical features of proteins, with the ability of molecular dynamics simulations to explore a wide range of protein conformations. We illustrate the method for human ubiquitin in solution and find that there is considerable conformational heterogeneity throughout the protein structure. The interior atoms of the protein are tightly packed in each individual conformation that contributes to the ensemble but their overall behaviour can be described as having a significant degree of liquid-like character. The protocol is completely general and should lead to significant...

  3. Structural and Function Prediction of Musa acuminata subsp. Malaccensis Protein

    Directory of Open Access Journals (Sweden)

    Anum Munir

    2016-03-01

    Hypothetical proteins (HPs) are proteins whose existence has been predicted but whose in vivo function has not been established. Elucidating the structural and functional characteristics of these HPs may also lead to a better understanding of protein-protein interactions or networks in diverse forms of life. Bananas (Musa acuminata spp.), including sweet and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include oats. Bananas are crucial for food security in many tropical and subtropical countries and are the most popular fruit in industrialized countries. In the present study, a hypothetical protein of M. acuminata (banana) was chosen for analysis and modeling using various bioinformatics tools and databases. According to primary and secondary structure analysis, XP_009393594.1 is a stable hydrophobic protein containing a significant proportion of α-helices. Homology modeling was carried out using the SWISS-MODEL server, where the template identity with the XP_009393594.1 protein was low, which indicated the novelty of the protein. An ab initio strategy was used to produce its 3D structure. Several quality assessment and validation parameters indicated that the generated protein model is stable and of reasonably good quality. Functional analysis, carried out with ProtFun 2.2 and KEGG (KAAS), suggested that the hypothetical protein is a transcription factor with a zinc-finger cytoplasmic domain. The protein was found to be vital for the translation process and involved in metabolism, signaling and cellular processes, genetic information processing and zinc ion binding. It is suggested that further experimental validation would help to predict the structures and functions of other uncharacterized proteins of different plants and living organisms.

  4. Human cancer protein-protein interaction network: a structural perspective.

    Directory of Open Access Journals (Sweden)

    Gozde Kar

    2009-12-01

    Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer-related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with accuracies of 71%, 67% and 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). We illustrate the interface related affinity properties of two cancer-related hub

  5. Protein Structure and the Sequential Structure of mRNA

    DEFF Research Database (Denmark)

    Brunak, Søren; Engelbrecht, Jacob

    1996-01-01

    A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from...

  6. Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.

    Science.gov (United States)

    Ashworth, Justin; Plaisier, Christopher L; Lo, Fang Yin; Reiss, David J; Baliga, Nitin S

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.

  7. Dynamic term structure models

    DEFF Research Database (Denmark)

    Andreasen, Martin Møller; Meldrum, Andrew

    This paper studies whether dynamic term structure models for US nominal bond yields should enforce the zero lower bound by a quadratic policy rate or a shadow rate specification. We address the question by estimating quadratic term structure models (QTSMs) and shadow rate models with at most four...

  8. Proteins with Novel Structure, Function and Dynamics

    Science.gov (United States)

    Pohorille, Andrew

    2014-01-01

    Recently, a small enzyme that ligates two RNA fragments at a rate 10^6 above background was evolved in vitro (Seelig and Szostak, Nature 448:828-831, 2007). This enzyme does not resemble any contemporary protein (Chao et al., Nature Chem. Biol. 9:81-83, 2013). It consists of a dynamic, catalytic loop, a small, rigid core containing two zinc ions coordinated by neighboring amino acids, and two highly flexible tails that might be unimportant for protein function. In contrast to other proteins, this enzyme does not contain ordered secondary structure elements, such as alpha-helix or beta-sheet. The loop is kept together by just two interactions of a charged residue and a histidine with a zinc ion, which they coordinate on the opposite side of the loop. Such a structure appears to be very fragile. Surprisingly, computer simulations indicate otherwise. When the coordinating, charged residue is mutated to alanine, another, nearby charged residue takes its place, thus keeping the structure nearly intact. If this residue is also substituted by alanine, a salt bridge involving two other charged residues on the opposite sides of the loop keeps the loop in place. These adjustments are facilitated by the high flexibility of the protein. Computational predictions have been confirmed experimentally, as both mutants retain full activity and overall structure. These results challenge our notions about what is required for protein activity and about the relationship between protein dynamics, stability and robustness. We hypothesize that small, highly dynamic proteins could be both active and fault tolerant in ways that many other proteins are not, i.e. they can adjust to retain their structure and activity even if subjected to mutations in structurally critical regions. This opens the door to designing proteins with novel functions, structures and dynamics that have not yet been considered.

  9. Overcoming barriers to membrane protein structure determination

    NARCIS (Netherlands)

    Bill, Roslyn M.; Henderson, Peter J. F.; Iwata, So; Kunji, Edmund R. S.; Michel, Hartmut; Neutze, Richard; Newstead, Simon; Poolman, Bert; Tate, Christopher G.; Vogel, Horst

    After decades of slow progress, the pace of research on membrane protein structures is beginning to quicken thanks to various improvements in technology, including protein engineering and microfocus X-ray diffraction. Here we review these developments and, where possible, highlight generic new

  10. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Background: Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
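
    The idea of reducing a three-dimensional structure to a text string through the Ramachandran map can be sketched in a few lines of Python. This is a deliberately simplified illustration and not SARST itself: it assumes a uniform 60-degree grid over (phi, psi) space with one arbitrary letter per cell, whereas the abstract describes a map organized by nearest-neighbor clustering. The function names and the alphabet are hypothetical.

```python
import string

# 36 symbols: enough for a 6 x 6 grid of 60-degree cells (an assumption).
ALPHABET = string.ascii_uppercase + string.digits


def ramachandran_code(phi, psi, cell=60.0):
    """Map one (phi, psi) pair in degrees to a single grid letter."""
    per_axis = int(360 // cell)
    if per_axis * per_axis > len(ALPHABET):
        raise ValueError("grid has more cells than available letters")
    # Shift angles from [-180, 180) to [0, 360) before binning.
    col = int(((phi + 180.0) % 360.0) // cell)
    row = int(((psi + 180.0) % 360.0) // cell)
    return ALPHABET[row * per_axis + col]


def encode_backbone(dihedrals):
    """Encode a list of (phi, psi) pairs as a one-dimensional string."""
    return "".join(ramachandran_code(phi, psi) for phi, psi in dihedrals)


if __name__ == "__main__":
    # Two helix-like residues followed by one strand-like residue.
    print(encode_backbone([(-60.0, -45.0), (-60.0, -45.0), (-120.0, 130.0)]))
```

    Once structures are strings, ordinary sequence comparison tools can operate on them; a realistic scheme would also need a substitution matrix that reflects how similar two cells of the map are, which this sketch omits.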

  11. A 'periodic table' for protein structures.

    Science.gov (United States)

    Taylor, William R

    2002-04-11

    Current structural genomics programs aim systematically to determine the structures of all proteins coded in both human and other genomes, providing a complete picture of the number and variety of protein structures that exist. In the past, estimates have been made on the basis of the incomplete sample of structures currently known. These estimates have varied greatly (between 1,000 and 10,000; see for example refs 1 and 2), partly because of limited sample size but also owing to the difficulties of distinguishing one structure from another. This distinction is usually topological, based on the fold of the protein; however, in strict topological terms (neglecting to consider intra-chain cross-links), protein chains are open strings and hence are all identical. To avoid this trivial result, topologies are determined by considering secondary links in the form of intra-chain hydrogen bonds (secondary structure) and tertiary links formed by the packing of secondary structures. However, small additions to or loss of structure can make large changes to these perceived topologies and such subjective solutions are neither robust nor amenable to automation. Here I formalize both secondary and tertiary links to allow the rigorous and automatic definition of protein topology.

  12. Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins

    Directory of Open Access Journals (Sweden)

    Bradley Michael E

    2006-02-01

    Background: When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. Results: The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include (1) multiple sequence alignments, (2) mapping of alignment sites to crystal structure sites, (3) phylogenetic trees, (4) inferred ancestral sequences at internal tree nodes, and (5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. Conclusion: We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural
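
    The query for "amino acid replacements requiring three nucleotide substitutions" rests on a simple calculation: the smallest number of base changes that turns any codon for one amino acid into any codon for another. The Python sketch below shows that calculation under the standard genetic code. The function names are hypothetical, and nothing here is taken from the Magnum software.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_nucleotide_substitutions(aa_from, aa_to):
    """Fewest base changes converting a codon of aa_from into one of aa_to."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )


if __name__ == "__main__":
    for pair in [("F", "L"), ("M", "W"), ("M", "Y")]:
        print(pair, min_nucleotide_substitutions(*pair))
```

    Replacements that score 3 are the rarest class, since every position of the codon has to change; methionine to tyrosine is one such pair.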

  13. Structural studies of human glioma pathogenesis-related protein 1

    Energy Technology Data Exchange (ETDEWEB)

    Asojo, Oluwatoyin A., E-mail: oasojo@unmc.edu [College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States); Koski, Raymond A.; Bonafé, Nathalie [L2 Diagnostics LLC, 300 George Street, New Haven, CT 06511 (United States); College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States)

    2011-10-01

    Structural analysis of a truncated soluble domain of human glioma pathogenesis-related protein 1, a membrane protein implicated in the proliferation of aggressive brain cancer, is presented. Human glioma pathogenesis-related protein 1 (GLIPR1) is a membrane protein that is highly upregulated in brain cancers but is barely detectable in normal brain tissue. GLIPR1 is composed of a signal peptide that directs its secretion, a conserved cysteine-rich CAP (cysteine-rich secretory proteins, antigen 5 and pathogenesis-related 1 proteins) domain and a transmembrane domain. GLIPR1 is currently being investigated as a candidate for prostate cancer gene therapy and for glioblastoma targeted therapy. Crystal structures of a truncated soluble domain of the human GLIPR1 protein (sGLIPR1) solved by molecular replacement using a truncated polyalanine search model of the CAP domain of stecrisp, a snake-venom cysteine-rich secretory protein (CRISP), are presented. The correct molecular-replacement solution could only be obtained by removing all loops from the search model. The native structure was refined to 1.85 Å resolution and that of a Zn²⁺ complex was refined to 2.2 Å resolution. The latter structure revealed that the putative binding cavity coordinates Zn²⁺ similarly to snake-venom CRISPs, which are involved in Zn²⁺-dependent mechanisms of inflammatory modulation. Both sGLIPR1 structures have extensive flexible loop/turn regions and unique charge distributions that were not observed in any of the previously reported CAP protein structures. A model is also proposed for the structure of full-length membrane-bound GLIPR1.

  14. Structural studies of potential new-generation antibiotic targets and the use of one of them, TonB as a model protein in protein engineering

    OpenAIRE

    Ciragan, Annika

    2017-01-01

    At the present time resistance to every main class of antibiotic has been observed. Therefore, the continuous development of new-generation antibiotics is crucial to combat the rise of antibiotic resistant strains. Identification of potential antibiotic targets and investigation of their structure and function represent a rational approach to developing a better understanding of the essential processes in which they are involved, and may lead to finding a mechanism to inhibit these processes....

  15. Structural analysis of recombinant human protein QM

    International Nuclear Information System (INIS)

    Gualberto, D.C.H.; Fernandes, J.L.; Silva, F.S.; Saraiva, K.W.; Affonso, R.; Pereira, L.M.; Silva, I.D.C.G.

    2012-01-01

    The ribosomal protein QM belongs to a family of ribosomal proteins which is highly conserved from yeast to humans. The presence of the QM protein is necessary for joining the 60S and 40S subunits in a late step of the initiation of mRNA translation. Although the exact extra-ribosomal functions of QM are not yet fully understood, it has been identified as a putative tumor suppressor. This protein was reported to interact with the transcription factor c-Jun and thereby prevent c-Jun from activating genes involved in cellular growth. In this study, the human QM protein was expressed in a bacterial system in soluble form, and its structure was analyzed by circular dichroism and fluorescence. The circular dichroism results showed that this protein has less alpha helix than beta sheet, as described in the literature. The QM protein does not contain a leucine zipper region; however, zinc ions are necessary for binding of QM to c-Jun. We therefore analyzed the relationship between the removal of zinc ions and the folding of the protein. Preliminary results obtained by fluorescence showed a gradual increase in fluorescence with the addition of increasing concentrations of EDTA. This suggests that zinc is important for the tertiary structure of the protein. Further studies are under way to better understand these results. (author)

  16. Constraint Logic Programming approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Fogolari Federico

    2004-11-01

    Background: The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results: Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. Conclusions: The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying simplified protein models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.

  17. Constraint Logic Programming approach to protein structure prediction.

    Science.gov (United States)

    Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

    2004-11-30

    The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying simplified protein models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
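
    To make "structure prediction as an optimization problem on the face-centered cubic lattice" concrete, here is a small Python sketch. It is not Constraint Logic Programming and not the authors' model: it uses plain exhaustive depth-first search, and it assumes a toy hydrophobic/polar (HP) energy in which every pair of non-consecutive H residues on neighboring lattice sites scores -1. The only constraint enforced is self-avoidance; all names are hypothetical.

```python
from itertools import permutations, product

# The 12 nearest-neighbor steps of the FCC lattice: permutations of (+-1, +-1, 0).
FCC_STEPS = sorted(
    {p for a, b in product((1, -1), repeat=2) for p in permutations((a, b, 0))}
)
FCC_STEP_SET = set(FCC_STEPS)


def hp_energy(coords, sequence):
    """Toy energy: -1 per non-consecutive H-H pair on neighboring sites."""
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                delta = tuple(a - b for a, b in zip(coords[i], coords[j]))
                if delta in FCC_STEP_SET:
                    energy -= 1
    return energy


def best_fold(sequence):
    """Enumerate all self-avoiding walks and return (energy, coordinates)."""
    best = {"energy": None, "coords": None}

    def extend(coords):
        if len(coords) == len(sequence):
            energy = hp_energy(coords, sequence)
            if best["energy"] is None or energy < best["energy"]:
                best["energy"], best["coords"] = energy, list(coords)
            return
        x, y, z = coords[-1]
        for dx, dy, dz in FCC_STEPS:
            nxt = (x + dx, y + dy, z + dz)
            if nxt not in coords:  # self-avoidance constraint
                coords.append(nxt)
                extend(coords)
                coords.pop()

    extend([(0, 0, 0)])
    return best["energy"], best["coords"]


if __name__ == "__main__":
    print(best_fold("HHHH"))
```

    The search space grows roughly as 12 to the power of the chain length, so this brute-force version is only usable for a handful of residues. That growth is the motivation for the pruning by constraint propagation mentioned in the abstract.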

  18. Topological properties of complex networks in protein structures

    Science.gov (United States)

    Kim, Kyungsik; Jung, Jae-Won; Min, Seungsik

    2014-03-01

    We study topological properties of networks in structural classification of proteins. We model the native-state protein structure as a network made of its constituent amino-acids and their interactions. We treat four structural classes of proteins composed predominantly of α helices and β sheets and consider several proteins from each of these classes whose sizes range from amino acids of the Protein Data Bank. Particularly, we simulate and analyze the network metrics such as the mean degree, the probability distribution of degree, the clustering coefficient, the characteristic path length, the local efficiency, and the cost. This work was supported by the KMAR and DP under Grant WISE project (153-3100-3133-302-350).
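
    Two of the network metrics listed in this abstract, the mean degree and the clustering coefficient, can be sketched for a small residue-interaction network as follows. The adjacency-dictionary representation, the function names, and the toy network are assumptions for illustration only; the abstract does not say how the authors define an interaction or compute the metrics.

```python
def mean_degree(adj):
    """Average number of neighbours per node in an undirected network."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = sorted(nbrs)
        # Count edges that connect two neighbours of the current node.
        links = sum(1 for i, u in enumerate(nbr_list)
                    for v in nbr_list[i + 1:] if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy network: residues 0, 1, 2 form a triangle and
# residue 3 is attached to residue 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

if __name__ == "__main__":
    print(mean_degree(toy), clustering_coefficient(toy))
```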

  19. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    Science.gov (United States)

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  20. Protein Structure Determination Using Chemical Shifts

    DEFF Research Database (Denmark)

    Christensen, Anders Steen

    is determined using only chemical shifts recorded and assigned through automated processes. The CA-RMSD to the experimental X-ray structure is 1.1 Å. Additionally, the method is combined with very sparse NOE-restraints and evolutionary distance restraints and tested on several protein structures >100...

  1. On characterization of anisotropic plant protein structures

    NARCIS (Netherlands)

    Krintiras, G.A.; Göbel, J.; Bouwman, W.G.; Goot, van der A.J.; Stefanidis, G.D.

    2014-01-01

    In this paper, a set of complementary techniques was used to characterize surface and bulk structures of an anisotropic Soy Protein Isolate (SPI)–vital wheat gluten blend after it was subjected to heat and simple shear flow in a Couette Cell. The structured biopolymer blend can form a basis for a

  2. Effect of solvent on the structure of a protein (H3.1) with a coarse-grained model with knowledge-based interactions

    Science.gov (United States)

    Pandey, Ras; Farmer, Barry

    2013-03-01

    Quality of solvent plays a critical role in modulating the structure of a protein along with the temperature. Using a coarse-grained Monte Carlo simulation based on three knowledge-based contact potentials (MJ, BT, BFKV) we examine the structure and dynamics of a histone (H3.1). The empty lattice sites constitute the effective solvent medium in which the protein is embedded. Residue-solvent characteristic interaction is based on the hydropathy index while the residue-residue interaction is taken from the knowledge-based contact matrices derived from ensembles of protein structures in the Protein Data Bank. Large scale simulations are performed to analyze the structure of the protein for a range of residue-solvent interaction strengths, a measure of the solvent quality with each potential. Unlike the monotonic thermal response, the radius of gyration of the protein exhibits a non-monotonic dependence on the solvent strength. Quantitative comparison of the structure and dynamics emerging from three knowledge-based potentials will be presented in this talk. This work is supported by Air Force Research Laboratory.
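
    The radius of gyration used in this abstract as the measure of protein size can be computed from residue coordinates as in the sketch below. Equal masses for all residues and the function name radius_of_gyration are assumptions of this example; the abstract does not specify the authors' definition.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid.

    Assumes every residue carries the same mass.
    """
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(sum((p[k] - center[k]) ** 2 for k in range(3))
                  for p in coords) / n
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Two points two units apart: each lies one unit from the centroid.
    print(radius_of_gyration([(0, 0, 0), (2, 0, 0)]))
```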

  3. Structural and Functional Annotation of Hypothetical Proteins of O139

    Directory of Open Access Journals (Sweden)

    Md. Saiful Islam

    2015-06-01

    Full Text Available In developing countries the threat of cholera is a significant health concern whenever water purification and sewage disposal systems are inadequate. Vibrio cholerae is one of the bacteria responsible for cholera. The complete genome sequence of V. cholerae reveals the presence of various genes and hypothetical proteins whose functions are not yet understood. Hence analyzing and annotating the structure and function of hypothetical proteins is important for understanding V. cholerae. V. cholerae O139 is the most common and pathogenic bacterial strain among various V. cholerae strains. In this study the sequences of six hypothetical proteins of V. cholerae O139 have been annotated from NCBI. Various computational tools and databases have been used to determine domain family, protein-protein interaction, solubility of protein, ligand binding sites etc. The three-dimensional structures of two proteins were modeled and their ligand binding sites were identified. We have found domains and families of only one protein. The analysis revealed that these proteins might have antibiotic resistance activity, DNA breaking-rejoining activity, integrase enzyme activity, restriction endonuclease activity, etc. Structural prediction of these proteins and detection of binding sites from this study would indicate a potential target aiding docking studies for therapeutic design against cholera.

  4. Structure characterization of the central repetitive domain of high molecular weight gluten proteins .1. Model studies using cyclic and linear peptides

    NARCIS (Netherlands)

    VanDijk, AA; VanWijk, LL; VanVliet, A; Haris, P; VanSwieten, E; Tesser, GI; Robillard, GT

    The high molecular weight (HMW) proteins from wheat contain a repetitive domain that forms 60-80% of their sequence. The consensus peptides PGQGQQ and GYYPTSPQQ form more than 90% of the domain; both are predicted to adopt beta-turn structure. This paper describes the structural characterization of

  5. On the turn-inducing properties of asparagine: the structuring role of the amide side chain, from isolated model peptides to crystallized proteins.

    Science.gov (United States)

    Habka, S; Sohn, W Y; Vaquero-Vara, V; Géléoc, M; Tardivel, B; Brenner, V; Gloaguen, E; Mons, M

    2018-01-31

    Asparagine (Asn) is a powerful turn-inducer residue, with a large propensity to occupy the second position in the central region of β-turns of proteins. The present work aims at investigating the role of a local anchoring between the Asn side chain and the main chain in this remarkable property. For this purpose, the H-bonding patterns of an asparagine residue in an isolated protein chain fragment forming a γ- or a β-turn have been determined using IR/UV double resonance gas phase spectroscopy on laser-desorbed, jet-cooled short models in conjunction with relevant quantum chemistry calculations. These gas phase data provide evidence for an original double anchoring linking the Asn primary amide side chain (SC), which adopts a gauche+ rotameric form, to its main chain (MC) local environment. From both IR spectroscopic evidence (H-bond induced red shifts) and quantum chemistry, Asn SC is found to behave as a stronger H-bond acceptor than donor, resulting in stronger MC→SC H-bonds than SC→MC ones. These gas phase structural data, relevant to a hydrophobic environment, have been used as a reference to assess the anchoring taking place in high resolution crystallized proteins of the Protein Data Bank. This approach reveals that, when the SC adopts a gauche+ orientation, the stronger MC→SC bonds are preserved in many cases whereas the SC→MC bonds are always disrupted, in qualitative agreement with the gas phase ranking of these interactions. Most interestingly, when Asn occupies the second position of central part of a β-turn (i.e., the very turn-inducer position), the MC→SC H-bonds are also disrupted and replaced by a water-mediated SC to MC anchoring. Owing to the specific features of the hydrated Asn side chain, we propose that it could be a turn precursor structure, able to facilitate turn formation in the early events of the folding process.

  6. SA-Search: a web tool for protein structure mining based on a Structural Alphabet

    OpenAIRE

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-01-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of f...

  7. Structural deformation upon protein-protein interaction: a structural alphabet approach.

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-02-28

    In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Our study provides qualitative information about induced fit. These results could be of help for flexible docking.
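
    The quantity reported in this abstract, the fraction of structural letters modified upon binding, and the raw counts behind a substitution matrix can be sketched as below. The sketch assumes the unbound and bound forms are already encoded as aligned strings of structural letters of equal length; the function names and the example strings are hypothetical and do not come from the paper.

```python
from collections import Counter


def fraction_modified(unbound, bound):
    """Fraction of aligned positions whose structural letter differs."""
    if len(unbound) != len(bound):
        raise ValueError("sequences must be aligned to equal length")
    if not unbound:
        raise ValueError("sequences must be non-empty")
    changed = sum(1 for a, b in zip(unbound, bound) if a != b)
    return changed / len(unbound)


def substitution_counts(unbound, bound):
    """Count (unbound_letter, bound_letter) pairs over aligned positions."""
    if len(unbound) != len(bound):
        raise ValueError("sequences must be aligned to equal length")
    return Counter(zip(unbound, bound))


if __name__ == "__main__":
    # Hypothetical five-letter encodings that differ at one position.
    print(fraction_modified("AABCD", "AABDD"))
    print(substitution_counts("AABCD", "AABDD"))
```

    Turning such counts into a substitution matrix would additionally require normalization against background letter frequencies, which is not shown here.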

  8. Structural deformation upon protein-protein interaction: A structural alphabet approach

    Directory of Open Access Journals (Sweden)

    Lecornet Hélène

    2008-02-01

    Full Text Available Abstract Background In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion Our study provides qualitative information about induced fit. These results could be of help for flexible docking.

  9. Modeling disordered regions in proteins using Rosetta.

    Directory of Open Access Journals (Sweden)

    Ray Yu-Ruei Wang

    Full Text Available Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions either from sequence alone using for example the DISOPRED server or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.

  10. Beta-structures in fibrous proteins.

    Science.gov (United States)

    Kajava, Andrey V; Squire, John M; Parry, David A D

    2006-01-01

    The beta-form of protein folding, one of the earliest protein structures to be defined, was originally observed in studies of silks. It was then seen in early studies of synthetic polypeptides and, of course, is now known to be present in a variety of guises as an essential component of globular protein structures. However, in the last decade or so it has become clear that the beta-conformation of chains is present not only in many of the amyloid structures associated with, for example, Alzheimer's Disease, but also in the prion structures associated with the spongiform encephalopathies. Furthermore, X-ray crystallography studies have revealed the high incidence of the beta-fibrous proteins among virulence factors of pathogenic bacteria and viruses. Here we describe the basic forms of the beta-fold, summarize the many different new forms of beta-structural fibrous arrangements that have been discovered, and review advances in structural studies of amyloid and prion fibrils. These and other issues are described in detail in later chapters.

  11. Fibrous Protein Structures: Hierarchy, History and Heroes.

    Science.gov (United States)

    Squire, John M; Parry, David A D

    2017-01-01

    During the 1930s and 1940s the technique of X-ray diffraction was applied widely by William Astbury and his colleagues to a number of naturally-occurring fibrous materials. On the basis of the diffraction patterns obtained, he observed that the structure of each of the fibres was dominated by one of a small number of different types of molecular conformation. One group of fibres, known as the k-m-e-f group of proteins (keratin - myosin - epidermin - fibrinogen), gave rise to diffraction characteristics that became known as the α-pattern. Others, such as those from a number of silks, gave rise to a different pattern - the β-pattern, while connective tissues yielded a third unique set of diffraction characteristics. At the time of Astbury's work, the structures of these materials were unknown, though the spacings of the main X-ray reflections gave an idea of the axial repeats and the lateral packing distances. In a breakthrough in the early 1950s, the basic structures of all of these fibrous proteins were determined. It was found that the long protein chains, composed of strings of amino acids, could be folded up in a systematic manner to generate a limited number of structures that were consistent with the X-ray data. The most important of these were known as the α-helix, the β-sheet, and the collagen triple helix. These studies provided information about the basic building blocks of all proteins, both fibrous and globular. They did not, however, provide detailed information about how these molecules packed together in three-dimensions to generate the fibres found in vivo. A number of possible packing arrangements were subsequently deduced from the X-ray diffraction and other data, but it is only in the last few years, through the continued improvements of electron microscopy, that the packing details within some fibrous proteins can now be seen directly. Here we outline briefly some of the milestones in fibrous protein structure determination, the role of the

  12. Illuminating structural proteins in viral "dark matter" with metaproteomics.

    Science.gov (United States)

    Brum, Jennifer R; Ignacio-Espinoza, J Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M; Roux, Simon; VerBerkmoes, Nathan C; Rich, Virginia I; Sullivan, Matthew B

    2016-03-01

    Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.

  13. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field.

    Science.gov (United States)

    Xu, Dong; Zhang, Yang

    2012-07-01

    Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Copyright © 2012 Wiley Periodicals, Inc.
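
    The replica-exchange Monte Carlo simulations mentioned in this abstract rely on occasionally swapping conformations between replicas run at different temperatures. The sketch below shows the standard Metropolis acceptance rule for such a swap, in reduced units with the Boltzmann constant set to 1. It is a generic textbook form under those assumptions, not QUARK's implementation; the function names are hypothetical, and QUARK's force field and move set are not reproduced.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of exchanging replicas i and j.

    Uses min(1, exp[(beta_i - beta_j) * (E_i - E_j)]) with beta = 1 / T.
    """
    beta_i, beta_j = 1.0 / temp_i, 1.0 / temp_j
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for one random draw."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


if __name__ == "__main__":
    # The colder replica holds the higher energy, so the swap is always accepted.
    print(swap_probability(10.0, 1.0, 5.0, 2.0))
    # The colder replica already holds the lower energy: accepted only sometimes.
    print(swap_probability(5.0, 1.0, 10.0, 2.0))
```

    Swaps of this kind let low-energy conformations found at high temperature migrate to the low-temperature replicas, which is the usual motivation for the method.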

  14. Metallic glasses: structural models

    International Nuclear Information System (INIS)

    Nassif, E.

    1984-01-01

    The aim of this work is to give a summary of the attempts made up to the present to describe by structural models the atomic arrangement in metallic glasses, showing also why the structure factors and atomic distribution functions cannot always be experimentally determined with reasonable accuracy. (M.W.O.) [pt

  15. A Kernel for Protein Secondary Structure Prediction

    OpenAIRE

    Guermeur, Yann; Lifchitz, Alain; Vert, Régis

    2004-01-01

    http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10338&mode=toc; Multi-class support vector machines have already proved efficient in protein secondary structure prediction as ensemble methods, to combine the outputs of sets of classifiers based on different principles. In this chapter, their implementation as basic prediction methods, processing the primary structure or the profile of multiple alignments, is investigated. A kernel devoted to the task is in...

  16. A new activity of anti-HIV and anti-tumor protein GAP31: DNA adenosine glycosidase - Structural and modeling insight into its functions

    Energy Technology Data Exchange (ETDEWEB)

    Li, Hui-Guang [Department of Biochemistry, New York University School of Medicine, New York, NY 10016 (United States); Huang, Philip L. [American Biosciences, Boston, MA 02114 (United States); Zhang, Dawei; Sun, Yongtao [Department of Biochemistry, New York University School of Medicine, New York, NY 10016 (United States); Chen, Hao-Chia [Endocrinology and Reproduction Research Branch, National Institute of Child Health and Human Development, NIH, Bethesda, MD 20892 (United States); Zhang, John [Department of Chemistry, New York University, New York, NY 10003 (United States); Huang, Paul L. [Department of Medicine, Harvard Medical School and Massachusetts General Hospital, Boston, MA 02114 (United States); Kong, Xiang-Peng, E-mail: xiangpeng.kong@med.nyu.edu [Department of Biochemistry, New York University School of Medicine, New York, NY 10016 (United States); Lee-Huang, Sylvia, E-mail: sylvia.lee-huang@med.nyu.edu [Department of Biochemistry, New York University School of Medicine, New York, NY 10016 (United States)

    2010-01-01

    We report here the high-resolution atomic structures of GAP31 crystallized in the presence of HIV-LTR DNA oligonucleotides systematically designed to examine the adenosine glycosidase activity of this anti-HIV and anti-tumor plant protein. Structural analysis and molecular modeling lead to several novel findings. First, adenine is bound at the active site in the crystal structures of GAP31 to HIV-LTR duplex DNA with 5' overhanging adenosine ends, such as the 3'-processed HIV-LTR DNA but not to DNA duplex with blunt ends. Second, the active site pocket of GAP31 is ideally suited to accommodate the 5' overhanging adenosine of the 3'-processed HIV-LTR DNA and the active site residues are positioned to perform the adenosine glycosidase activity. Third, GAP31 also removes the 5'-end adenine from single-stranded HIV-LTR DNA oligonucleotide as well as any exposed adenosine, including that of single nucleotide dAMP but not from AMP. Fourth, GAP31 does not de-purinate guanosine from di-nucleotide GT. These results suggest that GAP31 has DNA adenosine glycosidase activity against accessible adenosine. This activity is distinct from the generally known RNA N-glycosidase activity toward the 28S rRNA. It may be an alternative function that contributes to the antiviral and anti-tumor activities of GAP31. These results provide molecular insights consistent with the anti-HIV mechanisms of GAP31 in its inhibition on the integration of viral DNA into the host genome by HIV-integrase as well as irreversible topological relaxation of the supercoiled viral DNA.

  17. Improved protein structure reconstruction using secondary structures, contacts at higher distance thresholds, and non-contacts.

    Science.gov (United States)

    Adhikari, Badri; Cheng, Jianlin

    2017-08-29

    Residue-residue contacts are key features for accurate de novo protein structure prediction. For the optimal utilization of these predicted contacts in folding proteins accurately, it is important to study the challenges of reconstructing protein structures using true contacts. Because the contact-guided protein modeling approach is valuable for predicting the folds of proteins that do not have structural templates, it is necessary for reconstruction studies to focus on hard-to-predict protein structures. Using a data set consisting of 496 structural domains released in recent CASP experiments and a data set of 150 representative protein structures, in this work, we discuss three techniques to improve the reconstruction accuracy using true contacts - adding secondary structures, increasing contact distance thresholds, and adding non-contacts. We find that reconstruction using secondary structures and contacts can deliver accuracy higher than using full contact maps. Similarly, we demonstrate that non-contacts can improve reconstruction accuracy not only when the non-contacts used are true but also when they are predicted. On the data set consisting of 150 proteins, we find that simply using low-ranked predicted contacts as non-contacts and adding them as additional restraints can increase the reconstruction accuracy by 5% when the reconstructed models are evaluated using TM-score. Our findings suggest that secondary structures are invaluable companions of contacts for accurate reconstruction. Confirming some earlier findings, we also find that larger distance thresholds are useful for folding many protein structures which cannot be folded using the standard definition of contacts. Our findings also suggest that for more accurate reconstruction using predicted contacts it is useful to predict contacts at higher distance thresholds (beyond 8 Å) and predict non-contacts.
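
    To make the notion of contacts at different distance thresholds concrete, the sketch below lists residue pairs closer than a given cutoff, with 8 Å as the default mentioned in the abstract. Using one representative point per residue, the function name contact_pairs, and the toy coordinates are assumptions for illustration; the abstract does not state which atoms the authors measure between.

```python
import math


def contact_pairs(coords, threshold=8.0, min_separation=1):
    """Return index pairs (i, j), i < j, closer than `threshold` angstroms.

    `coords` holds one representative 3D point per residue (an assumption).
    `min_separation` skips pairs that are too close along the sequence.
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < threshold:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical residues placed on a line at 0, 5 and 12 angstroms.
    toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
    print(contact_pairs(toy))                  # standard 8 angstrom cutoff
    print(contact_pairs(toy, threshold=13.0))  # larger cutoff adds pair (0, 2)
```

    Pairs absent from the list at a chosen threshold are the non-contacts in the sense used above; raising the threshold converts some of them into contacts.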

  18. 3D bioprinting of structural proteins.

    Science.gov (United States)

    Włodarczyk-Biegun, Małgorzata K; Del Campo, Aránzazu

    2017-07-01

    3D bioprinting is a booming method to obtain scaffolds of different materials with predesigned and customized morphologies and geometries. In this review we focus on the experimental strategies and recent achievements in the bioprinting of major structural proteins (collagen, silk, fibrin), as a particularly interesting technology to reconstruct the biochemical and biophysical composition and hierarchical morphology of natural scaffolds. The flexibility in molecular design offered by structural proteins, combined with the flexibility in mixing, deposition, and mechanical processing inherent to bioprinting technologies, enables the fabrication of highly functional scaffolds and tissue mimics with a degree of complexity and organization which has only just started to be explored. Here we describe the printing parameters and physical (mechanical) properties of bioinks based on structural proteins, including the biological function of the printed scaffolds. We describe applied printing techniques and cross-linking methods, highlighting the modifications implemented to improve scaffold properties. The used cell types, cell viability, and possible construct applications are also reported. We envision that the application of printing technologies to structural proteins will enable unprecedented control over their supramolecular organization, conferring printed scaffolds biological properties and functions close to natural systems. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. PROGRAM SYSTEM AND INFORMATION METADATA BANK OF TERTIARY PROTEIN STRUCTURES

    Directory of Open Access Journals (Sweden)

    T. A. Nikitin

    2013-01-01

    Full Text Available The article deals with the architecture of a metadata storage model for the results of checks on three-dimensional protein structures. A conceptual database model was built. The service and procedure for database updates, as well as data transformation algorithms for protein structures and their quality, are presented. The most important information about entries and their submission forms for storage, access, and delivery to users is highlighted. A software suite was developed to implement the functional tasks using the Java programming language in the NetBeans v.7.0 environment and JQL to query and interact with the JavaDB database. The service was tested, and the results showed that the system is effective at filtering protein structures.

  20. Structural Equation Model Trees

    Science.gov (United States)

    Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman

    2013-01-01

    In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree…

  1. Mechanical Modeling and Computer Simulation of Protein Folding

    Science.gov (United States)

    Prigozhin, Maxim B.; Scott, Gregory E.; Denos, Sharlene

    2014-01-01

    In this activity, science education and modern technology are bridged to teach students at the high school and undergraduate levels about protein folding and to strengthen their model building skills. Students are guided from a textbook picture of a protein as a rigid crystal structure to a more realistic view: proteins are highly dynamic…

  2. Functions and structures of eukaryotic recombination proteins

    International Nuclear Information System (INIS)

    Ogawa, Tomoko

    1994-01-01

    We have found that Rad51 and RecA proteins form strikingly similar structures together with dsDNA and ATP. Their right-handed helical nucleoprotein filaments extend the B-form DNA double helix to 1.5 times its length and wind the helix. The similarity and uniqueness of their structures must reflect functional homologies between these proteins. Therefore, it is highly probable that similar recombination proteins are present in various organisms at different evolutionary stages. We have succeeded in cloning RAD51 genes from human, mouse, chicken and fission yeast, and found that the homologues are widely distributed in eukaryotes. The HsRad51 and MmRad51 or ChRad51 proteins consist of 339 amino acids differing only by 4 or 12 amino acids, respectively, and are highly homologous to both yeast proteins, but less so to Dmc1. All of these proteins are homologous to the region from residues 33 to 240 of RecA, which was named the "homologous core". The homologous core is likely to be responsible for functions common to all of them, such as the formation of the helical nucleoprotein filament that is considered to be involved in homologous pairing in the recombination reaction. The mouse gene is transcribed at a high level in thymus, spleen, testis, and ovary, at a lower level in brain, and at a still lower level in some other tissues. It is transcribed efficiently in recombination-active tissues. A clear functional difference of Rad51 homologues from RecA was suggested by the failure of heterologous genes to complement the deficiency of Scrad51 mutants. This failure seems to reflect the absence of a compatible partner, such as ScRad52 protein in the case of ScRad51 protein, between different species. Thus, these discoveries serve as a starting point for understanding the fundamentals of gene targeting in mammalian cells and in gene therapy. (J.P.N.)

  3. Blind Test of Physics-Based Prediction of Protein Structures

    Science.gov (United States)

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130
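
    The replica-exchange step mentioned above can be illustrated with the textbook Metropolis swap criterion between two temperature replicas. This is a minimal sketch and not the authors' protocol: the function name, the temperatures, and the energies are hypothetical, and energies are assumed to be in kcal/mol.

```python
import math

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption of this sketch.
K_B = 0.0019872041


def swap_probability(e_i, t_i, e_j, t_j):
    """Acceptance probability for exchanging the configurations of two replicas.

    Replica i runs at temperature t_i with potential energy e_i, replica j
    at t_j with e_j. Standard criterion:
        p = min(1, exp((beta_i - beta_j) * (e_i - e_j))),  beta = 1 / (k_B * T)
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


# The colder replica holds the higher-energy configuration: always swap.
print(swap_probability(-100.0, 300.0, -110.0, 320.0))
# The colder replica already holds the lower-energy configuration: swap only sometimes.
print(swap_probability(-110.0, 300.0, -100.0, 320.0))
```

    Swaps let low-energy configurations found at high temperature migrate to the cold replicas, which is why the method helps conformational sampling.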

  4. Models of crk adaptor proteins in cancer.

    Science.gov (United States)

    Bell, Emily S; Park, Morag

    2012-05-01

    The Crk family of adaptor proteins (CrkI, CrkII, and CrkL), originally discovered as the oncogene fusion product, v-Crk, of the CT10 chicken retrovirus, lacks catalytic activity but engages with multiple signaling pathways through their SH2 and SH3 domains. Crk proteins link upstream tyrosine kinase and integrin-dependent signals to downstream effectors, acting as adaptors in diverse signaling pathways and cellular processes. Crk proteins are now recognized to play a role in the malignancy of many human cancers, stimulating renewed interest in their mechanism of action in cancer progression. The contribution of Crk signaling to malignancy has been predominantly studied in fibroblasts and in hematopoietic models and more recently in epithelial models. The mechanism by which Crk proteins act in cancer progression in vivo is still poorly understood, in part due to the highly pleiotropic nature of Crk signaling. Recent advances in the structural organization of Crk domains, new roles in kinase regulation, and increased knowledge of the mechanisms and frequency of Crk overexpression in human cancers have provided an incentive for further study in in vivo models. An understanding of the mechanisms through which Crk proteins act as oncogenic drivers could have important implications in therapeutic targeting.

  5. Integrative structure modeling with the Integrative Modeling Platform.

    Science.gov (United States)

    Webb, Benjamin; Viswanath, Shruthi; Bonomi, Massimiliano; Pellarin, Riccardo; Greenberg, Charles H; Saltzberg, Daniel; Sali, Andrej

    2018-01-01

    Building models of a biological system that are consistent with the myriad data available is one of the key challenges in biology. Modeling the structure and dynamics of macromolecular assemblies, for example, can give insights into how biological systems work, evolved, might be controlled, and even designed. Integrative structure modeling casts the building of structural models as a computational optimization problem, for which information about the assembly is encoded into a scoring function that evaluates candidate models. Here, we describe our open source software suite for integrative structure modeling, Integrative Modeling Platform (https://integrativemodeling.org), and demonstrate its use. © 2017 The Protein Society.

  6. In silico modeling of the yeast protein and protein family interaction network

    Science.gov (United States)

    Goh, K.-I.; Kahng, B.; Kim, D.

    2004-03-01

    Understanding how the protein interaction networks of living organisms have evolved or are organized can be the first stepping stone in unveiling how life works at a fundamental level. Here we introduce an in silico "coevolutionary" model for the protein interaction network and the protein family network. The essential ingredients of the model include the protein family identity and its robustness under evolution, as well as the three previously proposed: gene duplication, divergence, and mutation. This model produces a prototypical feature of complex networks in a wide range of parameter space, following the generalized Pareto distribution in connectivity. Moreover, we investigate other structural properties of our model in detail with some specific values of parameters relevant to the yeast Saccharomyces cerevisiae, showing excellent agreement with the empirical data. Our model indicates that the physical constraints encoded via the domain structure of proteins play a crucial role in protein interactions.

  7. Protein structure based prediction of catalytic residues.

    Science.gov (United States)

    Fajardo, J Eduardo; Fiser, Andras

    2013-02-22

    Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that, due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
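
    The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the four counts given above; the function and variable names are illustrative, and "enrichment" is taken here to mean precision divided by the background rate of catalytic residues, which is an assumption about the authors' definition.

```python
def enrichment_stats(n_predicted, n_total, n_true_hits, n_true_total):
    """Sensitivity, precision and fold enrichment of a residue prediction.

    n_predicted  -- residues returned by the method
    n_total      -- residues in the whole input set
    n_true_hits  -- annotated catalytic residues among the returned ones
    n_true_total -- annotated catalytic residues in the whole input set
    """
    sensitivity = n_true_hits / n_true_total
    precision = n_true_hits / n_predicted
    background = n_true_total / n_total      # chance of a random residue being catalytic
    enrichment = precision / background
    return sensitivity, precision, enrichment


sens, prec, enr = enrichment_stats(411, 9262, 70, 111)
print(f"sensitivity={sens:.0%} precision={prec:.0%} enrichment={enr:.1f}x")
# prints: sensitivity=63% precision=17% enrichment=14.2x
```

    Under that definition the counts reproduce the reported 63% sensitivity, 17% precision and roughly 14-fold enrichment.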

  8. Computational modeling of allosteric regulation in the hsp90 chaperones: a statistical ensemble analysis of protein structure networks and allosteric communications.

    Directory of Open Access Journals (Sweden)

    Kristin Blacklock

    2014-06-01

    A fundamental role of the Hsp90 chaperone in regulating functional activity of diverse protein clients is essential for the integrity of signaling networks. In this work we have combined biophysical simulations of the Hsp90 crystal structures with the protein structure network analysis to characterize the statistical ensemble of allosteric interaction networks and communication pathways in the Hsp90 chaperones. We have found that principal structurally stable communities could be preserved during dynamic changes in the conformational ensemble. The dominant contribution of the inter-domain rigidity to the interaction networks has emerged as a common factor responsible for the thermodynamic stability of the active chaperone form during the ATPase cycle. Structural stability analysis using force constant profiling of the inter-residue fluctuation distances has identified a network of conserved structurally rigid residues that could serve as global mediating sites of allosteric communication. Mapping of the conformational landscape with the network centrality parameters has demonstrated that stable communities and mediating residues may act concertedly with the shifts in the conformational equilibrium and could describe the majority of functionally significant chaperone residues. The network analysis has revealed a relationship between structural stability, global centrality and functional significance of hotspot residues involved in chaperone regulation. We have found that allosteric interactions in the Hsp90 chaperone may be mediated by modules of structurally stable residues that display high betweenness in the global interaction network. The results of this study have suggested that allosteric interactions in the Hsp90 chaperone may operate via a mechanism that combines rapid and efficient communication by a single optimal pathway of structurally rigid residues and more robust signal transmission using an ensemble of suboptimal multiple

  9. Discrete Haar transform and protein structure.

    Science.gov (United States)

    Morosetti, S

    1997-12-01

    The discrete Haar transform of the sequence of the backbone dihedral angles (phi and psi) was performed over a set of high-resolution X-ray protein structures from the Brookhaven Protein Data Bank. Afterwards, new dihedral angles were calculated by the inverse transform, using a growing number of Haar functions, from the lower to the higher degree. New structures were obtained using these dihedral angles, with standard values for bond lengths and angles, and with omega = 0 degree. The reconstructed structures were compared with the experimental ones, and analyzed by visual inspection and statistical analysis. When half of the Haar coefficients were used, the reconstructed structures had not yet collapsed into a tertiary fold, but most of the secondary motifs were already formed. These results indicate a substantial separation of structural information in the space of the Haar transform, with the secondary structural information mainly present in the Haar coefficients of lower degrees, and the tertiary one present in the higher-degree coefficients. Because of this separation, the representation of the folded structures in the space of the Haar transform seems a promising candidate to address the problem of premature convergence in genetic algorithms.
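
    The reconstruction from a growing number of Haar coefficients can be sketched on a toy angle sequence. The code below is an illustration under stated assumptions, not the author's implementation: it uses the orthonormal Haar normalization, requires a power-of-two length, treats angles as plain real numbers (ignoring periodicity), and the phi values are made up.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Orthonormal discrete Haar transform; len(signal) must be a power of two.

    Output order: overall term first, then detail coefficients from the
    coarsest (lowest degree) to the finest (highest degree).
    """
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        sums = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = sums + diffs
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for s, d in zip(out[:n], out[n:2 * n]):
            merged.append((s + d) / SQRT2)
            merged.append((s - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out


def reconstruct(signal, keep):
    """Rebuild the signal from only the first `keep` (lowest-degree) coefficients."""
    coeffs = haar_forward(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return haar_inverse(truncated)


# Hypothetical phi angles: a helix-like stretch followed by a strand-like stretch.
phi = [-60.0, -65.0, -58.0, -62.0, -120.0, -130.0, -125.0, -115.0]
print(reconstruct(phi, keep=4))   # half of the coefficients
print(reconstruct(phi, keep=8))   # all coefficients
```

    With half of the coefficients the reconstruction reduces to averages over adjacent pairs of residues, so the coarse helix/strand pattern survives while residue-level detail is lost.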

  10. Recognition of functional sites in protein structures.

    Science.gov (United States)

    Shulman-Peleg, Alexandra; Nussinov, Ruth; Wolfson, Haim J

    2004-06-04

    Recognition of regions on the surface of one protein, that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Base and make functional predictions.

  11. Protein 3D structure computed from evolutionary sequence variation.

    Directory of Open Access Journals (Sweden)

    Debora S Marks

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of

  12. Protein Folding: Search for Basic Physical Models

    Directory of Open Access Journals (Sweden)

    Ivan Y. Torshin

    2003-01-01

    How a unique three-dimensional structure is rapidly formed from the linear sequence of a polypeptide is one of the important questions in contemporary science. Apart from the biological context of in vivo protein folding (which has been studied only for a few proteins), the roles of the fundamental physical forces in in vitro folding remain largely unstudied. Despite a degree of success in using descriptions based on statistical and/or thermodynamic approaches, few of the current models explicitly include more basic physical forces (such as electrostatics and van der Waals forces). Moreover, the present-day models rarely take into account that protein folding is, essentially, a rapid process that produces a highly specific architecture. This review considers several physical models that may provide more direct links between sequence and tertiary structure in terms of the physical forces. In particular, elaboration of such simple models is likely to produce extremely effective computational techniques with value for modern genomics.

  13. Structural Elements Regulating AAA+ Protein Quality Control Machines.

    Science.gov (United States)

    Chang, Chiung-Wen; Lee, Sukyeong; Tsai, Francis T F

    2017-01-01

    Members of the ATPases Associated with various cellular Activities (AAA+) superfamily participate in essential and diverse cellular pathways in all kingdoms of life by harnessing the energy of ATP binding and hydrolysis to drive their biological functions. Although most AAA+ proteins share a ring-shaped architecture, AAA+ proteins have evolved distinct structural elements that are fine-tuned to their specific functions. A central question in the field is how ATP binding and hydrolysis are coupled to substrate translocation through the central channel of ring-forming AAA+ proteins. In this mini-review, we will discuss structural elements present in AAA+ proteins involved in protein quality control, drawing similarities to their known role in substrate interaction by AAA+ proteins involved in DNA translocation. Elements to be discussed include the pore loop-1, the Inter-Subunit Signaling (ISS) motif, and the Pre-Sensor I insert (PS-I) motif. Lastly, we will summarize our current understanding of the inter-relationship of those structural elements and propose a model of how ATP binding and hydrolysis might be coupled to polypeptide translocation in protein quality control machines.

  14. MMM: A toolbox for integrative structure modeling.

    Science.gov (United States)

    Jeschke, Gunnar

    2018-01-01

    Structural characterization of proteins and their complexes may require integration of restraints from various experimental techniques. MMM (Multiscale Modeling of Macromolecules) is a Matlab-based open-source modeling toolbox for this purpose with a particular emphasis on distance distribution restraints obtained from electron paramagnetic resonance experiments on spin-labelled proteins and nucleic acids and their combination with atomistic structures of domains or whole protomers, small-angle scattering data, secondary structure information, homology information, and elastic network models. MMM does not only integrate various types of restraints, but also various existing modeling tools by providing a common graphical user interface to them. The types of restraints that can support such modeling and the available model types are illustrated by recent application examples. © 2017 The Protein Society.

  15. Comparative Study of Elastic Network Model and Protein Contact Network for Protein Complexes: The Hemoglobin Case

    Directory of Open Access Journals (Sweden)

    Guang Hu

    2017-01-01

    The overall topology and interfacial interactions play key roles in understanding structural and functional principles of protein complexes. Elastic Network Model (ENM) and Protein Contact Network (PCN) are two widely used methods for high-throughput investigation of structures and interactions within protein complexes. In this work, the comparative analysis of ENM and PCN relative to hemoglobin (Hb) was taken as a case study. We examine four types of structural and dynamical paradigms, namely, conformational change between different states of Hbs, modular analysis, allosteric mechanisms studies, and interface characterization of an Hb. The comparative study shows that ENM has an advantage in studying dynamical properties and protein-protein interfaces, while PCN is better for describing protein structures quantitatively both from local and from global levels. We suggest that the integration of ENM and PCN would give a potential but powerful tool in structural systems biology.
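
    A protein contact network of the kind compared here can be sketched as a graph whose nodes are residues and whose edges join residues lying within a distance cutoff. The sketch below is illustrative only: the 8 Å cutoff on Cα atoms, the exclusion of sequence neighbours, and the toy coordinates are assumptions, not parameters taken from the study.

```python
import math


def contact_network(ca_coords, cutoff=8.0, min_separation=2):
    """Return the set of residue pairs (i, j), i < j, in spatial contact.

    ca_coords      -- list of (x, y, z) C-alpha coordinates in angstroms
    cutoff         -- contact distance threshold (assumed value)
    min_separation -- minimum sequence separation, to skip bonded neighbours
    """
    edges = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


def node_degrees(n_residues, edges):
    """Number of contacts per residue, the simplest local network descriptor."""
    degree = [0] * n_residues
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


# Toy example: five residues on a straight line, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
edges = contact_network(coords)
print(sorted(edges))
print(node_degrees(len(coords), edges))
```

    Global descriptors such as path lengths or betweenness can then be computed on the same edge set with any graph library.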

  16. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  17. A probabilistic fragment-based protein structure prediction algorithm.

    Directory of Open Access Journals (Sweden)

    David Simoncini

    Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].

  18. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
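
    The combination of a distance threshold, ranking by distance, and a regular expression over one-letter amino acid codes can be illustrated as follows. This is a sketch of the idea only and does not reproduce Fragger's interface: the function name, the tuple layout, the fragment identifiers, and the distances are all hypothetical.

```python
import re


def filter_and_rank(hits, sequence_pattern, max_distance):
    """Keep fragments within max_distance whose sequence matches the pattern.

    hits             -- iterable of (fragment_id, sequence, distance_to_query)
    sequence_pattern -- regular expression over one-letter amino acid codes
    max_distance     -- distance threshold (e.g. backbone RMSD in angstroms)
    Returns the surviving hits sorted by increasing distance to the query.
    """
    regex = re.compile(sequence_pattern)
    kept = [
        hit for hit in hits
        if hit[2] <= max_distance and regex.fullmatch(hit[1])
    ]
    return sorted(kept, key=lambda hit: hit[2])


# Hypothetical search results for a five-residue query fragment.
hits = [
    ("frag_a", "GAPLS", 0.8),
    ("frag_b", "GSPLT", 0.5),
    ("frag_c", "AAPLS", 0.3),   # close in space, but sequence does not match
    ("frag_d", "GTPLS", 2.5),   # too far from the query
]
for hit in filter_and_rank(hits, r"G[AS]PL[ST]", max_distance=1.0):
    print(hit)
```

    Using `fullmatch` rather than `search` ensures the pattern constrains the whole fragment sequence, not just a substring of it.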

  19. Automatic protein structure solution from weak X-ray data

    Science.gov (United States)

    Skubák, Pavol; Pannu, Navraj S.

    2013-11-01

    Determining new protein structures from X-ray diffraction data at low resolution or with a weak anomalous signal is a difficult and often an impossible task. Here we propose a multivariate algorithm that simultaneously combines the structure determination steps. In tests on over 140 real data sets from the protein data bank, we show that this combined approach can automatically build models where current algorithms fail, including an anisotropically diffracting 3.88 Å RNA polymerase II data set. The method seamlessly automates the process, is ideal for non-specialists and provides a mathematical framework for successfully combining various sources of information in image processing.

  20. Critical Features of Fragment Libraries for Protein Structure Prediction.

    Science.gov (United States)

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, of different fragment sizes, and of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  1. Contingency Table Browser - prediction of early stage protein structure.

    Science.gov (United States)

    Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

    2015-01-01

    The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table, which expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of the Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity.
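
    The idea of a table that gives, for each tetrapeptide, the likelihood of each structural motif can be sketched with simple counting. The example is a toy under stated assumptions: the motif labels ("helix", "strand", "loop"), the observations, and the function names are hypothetical and do not reflect the structural codes or data of the actual Early Stage model.

```python
from collections import Counter, defaultdict


def build_contingency_table(observations):
    """Count how often each tetrapeptide is seen with each structural motif.

    observations -- iterable of (tetrapeptide, motif_label) pairs
    """
    table = defaultdict(Counter)
    for tetrapeptide, motif in observations:
        table[tetrapeptide][motif] += 1
    return table


def motif_likelihoods(table, tetrapeptide):
    """Relative frequency of each motif for one tetrapeptide (empty if unseen)."""
    counts = table.get(tetrapeptide)
    if not counts:
        return {}
    total = sum(counts.values())
    return {motif: count / total for motif, count in counts.items()}


# Hypothetical observations collected from known structures.
observations = [
    ("AKLV", "helix"), ("AKLV", "helix"), ("AKLV", "strand"),
    ("GPGS", "loop"),
]
table = build_contingency_table(observations)
print(motif_likelihoods(table, "AKLV"))
print(motif_likelihoods(table, "GPGS"))
```

    A tetrapeptide whose likelihood is spread over several motifs, like the first one here, is structurally ambiguous in the sense discussed above; one concentrated on a single motif is not.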

  2. Three-dimensional protein structure prediction: Methods and computational strategies.

    Science.gov (United States)

    Dorn, Márcio; E Silva, Mariel Barbachan; Buriol, Luciana S; Lamb, Luis C

    2014-10-12

    A long standing problem in structural bioinformatics is to determine the three-dimensional (3-D) structure of a protein when only a sequence of amino acid residues is given. Many computational methodologies and algorithms have been proposed as a solution to the 3-D Protein Structure Prediction (3-D-PSP) problem. These methods can be divided in four main classes: (a) first principle methods without database information; (b) first principle methods with database information; (c) fold recognition and threading methods; and (d) comparative modeling methods and sequence alignment strategies. Deterministic computational techniques, optimization techniques, data mining and machine learning approaches are typically used in the construction of computational solutions for the PSP problem. Our main goal with this work is to review the methods and computational strategies that are currently used in 3-D protein prediction. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. PCNA Structure and Interactions with Partner Proteins

    KAUST Repository

    Oke, Muse; Zaher, Manal S.; Hamdan, Samir

    2018-01-01

    Proliferating cell nuclear antigen (PCNA) consists of three identical monomers that topologically encircle double-stranded DNA. PCNA stimulates the processivity of DNA polymerase δ and, to a lesser extent, the intrinsically highly processive DNA polymerase ε. It also functions as a platform that recruits and coordinates the activities of a large number of DNA processing proteins. Emerging structural and biochemical studies suggest that the nature of PCNA-partner protein interactions is complex. A hydrophobic groove at the front side of PCNA serves as a primary docking site for the consensus PIP box motifs present in many PCNA-binding partners. Sequences that immediately flank the PIP box motif or regions that are distant from it could also interact with the hydrophobic groove and other regions of PCNA. Posttranslational modifications on the backside of PCNA could add another dimension to its interaction with partner proteins. An encounter of PCNA with different DNA structures might also be involved in coordinating its interactions. Finally, the ability of PCNA to bind up to three proteins while topologically linked to DNA suggests that it would be a versatile toolbox in many different DNA processing reactions.

  4. PCNA Structure and Interactions with Partner Proteins

    KAUST Repository

    Oke, Muse

    2018-01-29

    Proliferating cell nuclear antigen (PCNA) consists of three identical monomers that topologically encircle double-stranded DNA. PCNA stimulates the processivity of DNA polymerase δ and, to a lesser extent, the intrinsically highly processive DNA polymerase ε. It also functions as a platform that recruits and coordinates the activities of a large number of DNA processing proteins. Emerging structural and biochemical studies suggest that the nature of PCNA-partner protein interactions is complex. A hydrophobic groove at the front side of PCNA serves as a primary docking site for the consensus PIP box motifs present in many PCNA-binding partners. Sequences that immediately flank the PIP box motif or regions that are distant from it could also interact with the hydrophobic groove and other regions of PCNA. Posttranslational modifications on the backside of PCNA could add another dimension to its interaction with partner proteins. An encounter of PCNA with different DNA structures might also be involved in coordinating its interactions. Finally, the ability of PCNA to bind up to three proteins while topologically linked to DNA suggests that it would be a versatile toolbox in many different DNA processing reactions.

  5. Ultrafast protein structure-based virtual screening with Panther

    Science.gov (United States)

    Niinivehmas, Sanna P.; Salokas, Kari; Lätti, Sakari; Raunio, Hannu; Pentikäinen, Olli T.

    2015-10-01

    Molecular docking is by far the most common method used in protein structure-based virtual screening. This paper presents Panther, a novel ultrafast multipurpose docking tool. In Panther, a simple shape-electrostatic model of the ligand-binding area of the protein is created by utilizing the protein crystal structure. The features of the possible ligands are then compared to the model by using a similarity search algorithm. On average, one ligand can be processed in a few minutes by using classical docking methods, whereas with Panther, processing takes [...]. The Panther protocol can be used in several applications, such as speeding up the early phases of drug discovery projects, reducing the number of failures in the clinical phase of the drug development process, and estimating the environmental toxicity of chemicals. The Panther code is available on our web pages (http://www.jyu.fi/panther) free of charge after registration.

  6. Protein secondary structure: category assignment and predictability

    DEFF Research Database (Denmark)

    Andersen, Claus A.; Bohr, Henrik; Brunak, Søren

    2001-01-01

    In the last decade, the prediction of protein secondary structure has been optimized using essentially one and the same assignment scheme known as DSSP. We present here a different scheme, which is more predictable. This scheme predicts directly the hydrogen bonds, which stabilize the secondary......-forward neural network with one hidden layer on a data set identical to the one used in earlier work....

  7. Protein-mediated surface structuring in biomembranes

    Directory of Open Access Journals (Sweden)

    Maggio B.

    2005-01-01

    The lipids and proteins of biomembranes exhibit highly dissimilar conformations, geometrical shapes, amphipathicity, and thermodynamic properties which constrain their two-dimensional molecular packing, electrostatics, and interaction preferences. This causes inevitable development of large local tensions that frequently relax into phase or compositional immiscibility along lateral and transverse planes of the membrane. On the other hand, these effects constitute the very codes that mediate molecular and structural changes determining and controlling the possibilities for enzymatic activity, apposition and recombination in biomembranes. The presence of proteins constitutes a major perturbing factor for the membrane sculpturing both in terms of its surface topography and dynamics. We will focus on some results from our group within this context and summarize some recent evidence for the active involvement of extrinsic (myelin basic protein), integral (Folch-Lees proteolipid protein) and amphitropic (c-Fos and c-Jun) proteins, as well as a membrane-active amphitropic phosphohydrolytic enzyme (neutral sphingomyelinase), in the process of lateral segregation and dynamics of phase domains, sculpturing of the surface topography, and the bi-directional modulation of the membrane biochemical reactivity.

  8. Hydration dynamics near a model protein surface

    International Nuclear Information System (INIS)

    Russo, Daniela; Hura, Greg; Head-Gordon, Teresa

    2003-01-01

    The evolution of water dynamics from dilute to very high concentration solutions of a prototypical hydrophobic amino acid with its polar backbone, N-acetyl-leucine-methylamide (NALMA), is studied by quasi-elastic neutron scattering and molecular dynamics simulation for both the completely deuterated and completely hydrogenated leucine monomer. We observe several unexpected features in the dynamics of these biological solutions under ambient conditions. The NALMA dynamics shows evidence of de Gennes narrowing, an indication of coherent long timescale structural relaxation dynamics. The translational water dynamics are analyzed in a first approximation with a jump diffusion model. At the highest solute concentrations, the hydration water dynamics is significantly suppressed and characterized by a long residential time and a slow diffusion coefficient. The analysis of the more dilute concentration solutions takes into account the results of the 2.0M solution as a model of the first hydration shell. Subtracting the first hydration layer based on the 2.0M spectra, the translational diffusion dynamics is still suppressed, although the rotational relaxation time and residential time are converged to bulk-water values. Molecular dynamics analysis shows spatially heterogeneous dynamics at high concentration that becomes homogeneous at more dilute concentrations. We discuss the hydration dynamics results of this model protein system in the context of glassy systems, protein function, and protein-protein interfaces
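
    For reference, the jump diffusion model mentioned above is commonly written in the random-jump form below. This is the textbook expression, not one quoted from the paper; the symbols $D$ (translational diffusion coefficient), $\tau_0$ (residence time between jumps) and $Q$ (momentum transfer) are the conventional ones and are assumed here.

```latex
\Gamma(Q) = \frac{D Q^{2}}{1 + D Q^{2} \tau_{0}}
```

    Here $\Gamma(Q)$ is the half-width at half-maximum of the quasi-elastic Lorentzian (in angular-frequency units). It tends to $D Q^{2}$ at small $Q$ and saturates at $1/\tau_{0}$ at large $Q$, so a long residence time and a small $D$ both narrow the line.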

  9. EDM-DEDM and protein crystal structure solution.

    Science.gov (United States)

    Caliandro, Rocco; Carrozzini, Benedetta; Cascarano, Giovanni Luca; Giacovazzo, Carmelo; Mazzone, Anna Maria; Siliqi, Dritan

    2009-05-01

    Electron-density modification (EDM) procedures are the classical tool for driving model phases closer to those of the target structure. They are often combined with automated model-building programs to provide a correct protein model. The task is not always performed, mostly because of the large initial phase error. A recently proposed procedure combined EDM with DEDM (difference electron-density modification); the method was applied to the refinement of phases obtained by molecular replacement, ab initio or SAD phasing [Caliandro, Carrozzini, Cascarano, Giacovazzo, Mazzone & Siliqi (2009), Acta Cryst. D65, 249-256] and was more effective in improving phases than EDM alone. In this paper, a novel fully automated protocol for protein structure refinement based on the iterative application of automated model-building programs combined with the additional power derived from the EDM-DEDM algorithm is presented. The cyclic procedure was successfully tested on challenging cases for which all other approaches had failed.

  10. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

    Science.gov (United States)

    Chira, Camelia; Horvath, Dragos; Dumitrescu, D

    2011-07-30

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm, the hill-climbing mechanism and the diversification strategy, are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
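
    The HP lattice setting described above can be illustrated with a minimal sketch. It is not the authors' evolutionary algorithm: it assumes a 2D square lattice, an absolute-move encoding (U, D, L, R), the usual energy of -1 per non-consecutive H-H contact, and a single point-mutation neighbourhood for the steepest hill-climbing step. All function names are hypothetical.

```python
from typing import List, Tuple

# Assumed encoding: each move is an absolute step on a 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves: str) -> List[Tuple[int, int]]:
    """Convert a move string into the lattice coordinates of each residue."""
    x, y = 0, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def is_self_avoiding(coords: List[Tuple[int, int]]) -> bool:
    """A valid conformation never visits the same lattice site twice."""
    return len(set(coords)) == len(coords)


def hp_energy(sequence: str, moves: str) -> int:
    """Energy = -1 for every pair of H residues that are lattice neighbours
    but not consecutive along the chain."""
    coords = fold(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")
    if not is_self_avoiding(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    energy -= 1
    return energy


def steepest_step(sequence: str, moves: str) -> Tuple[str, int]:
    """Try every single-move change and keep the lowest-energy valid one.
    Returns the input unchanged when no neighbour is strictly better."""
    best_moves, best_energy = moves, hp_energy(sequence, moves)
    for pos in range(len(moves)):
        for m in MOVES:
            if m == moves[pos]:
                continue
            candidate = moves[:pos] + m + moves[pos + 1:]
            if not is_self_avoiding(fold(candidate)):
                continue
            e = hp_energy(sequence, candidate)
            if e < best_energy:
                best_moves, best_energy = candidate, e
    return best_moves, best_energy


if __name__ == "__main__":
    print(steepest_step("HPPH", "URR"))
```

    Because energy is minimized, the "ascent" here is a descent in energy. A step that returns its input unchanged signals a local optimum, which is the situation a diversification stage is meant to escape.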

  11. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Chira Camelia

    2011-07-01

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm, the hill-climbing mechanism and the diversification strategy, are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.

  12. Effects of addition of hydrocolloids on the textural and structural properties of high-protein intermediate moisture food model systems containing sodium caseinate.

    Science.gov (United States)

    Li, J; Wu, Y; Ma, Y; Lu, N; Regenstein, J M; Zhou, P

    2017-08-01

    High-protein intermediate moisture food (HPIMF) containing sodium caseinate (NaCN) often gave a harder texture compared with that made from whey proteins or soy proteins, due to the aggregation of protein particles. The objectives of this study were to explore whether the addition of hydrocolloids could soften the texture and to illustrate the possible mechanism. Three kinds of hydrocolloids, xanthan gum, κ-carrageenan, and gum arabic, were chosen, and samples including these three kinds of hydrocolloids were studied through texture analysis using a TPA test and microstructure observation by confocal laser scanning microscopy (CLSM) and scanning electron microscopy (SEM). The texture analysis results showed that xanthan gum was more effective at softening the HPIMF containing NaCN compared to κ-carrageenan and gum arabic. In addition, with the increase of xanthan gum concentration from 0.2 to 2%, the HPIMF matrix became softer, and fractures were observed during the compression for samples with xanthan gum added at low concentrations but not 2%. Microstructure observation suggested that the matrix originally dominated by the network formed through the aggregation of swollen protein particles was inhibited by the addition of xanthan gum, resulting in the softening of the texture and also contributing to the fracture during compression. With the increase of xanthan gum concentration up to 2%, the protein dominating network would be gradually replaced with a matrix dominated by the newly formed network of xanthan gum with protein particles as fillers. Furthermore, this formation of a xanthan gum dominating network structure also resulted in changes in small molecule distribution, as observed using low-field NMR.

  13. (PS)2: protein structure prediction server version 3.0.

    Science.gov (United States)

    Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

    2015-07-01

    Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)2 web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)2 server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)2 server is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Distance matrix-based approach to protein structure prediction.

    Science.gov (United States)

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix C (related to D), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach has been successfully used by us in 2006 in the CASPR structure refinement (http://predictioncenter.org/caspR).
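
    The relationship between the distance matrix D and the contact matrix C mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one representative point (for example, a Cα atom) per residue and a hypothetical 7.0 Å cutoff, which is a commonly used elastic-network value but is not stated in the abstract.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def distance_matrix(coords: List[Point]) -> List[List[float]]:
    """D[i][j] is the Euclidean distance between residues i and j."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_matrix(dist: List[List[float]], cutoff: float = 7.0) -> List[List[int]]:
    """C[i][j] = 1 when two distinct residues lie within the cutoff, else 0.
    The cutoff value is an assumption made for this sketch."""
    n = len(dist)
    return [
        [1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three hypothetical residue positions, in ångströms.
    coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
    D = distance_matrix(coords)
    C = contact_matrix(D)
    for row in C:
        print(row)
```

    In an elastic network model, a matrix like C decides which residue pairs are joined by springs, so the choice of cutoff directly shapes the computed low-frequency modes.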

  15. Improved hybrid optimization algorithm for 3D protein structure prediction.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on a toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve the search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, the protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and gives full play to the advantages of each algorithm. The method is verified on the current universal standard sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed new method outperforms single algorithms on the accuracy of calculating the protein sequence energy value, which indicates that it is an effective way to predict the structure of proteins.

  16. Hydrogen atoms in protein structures: high-resolution X-ray diffraction structure of the DFPase

    Science.gov (United States)

    2013-01-01

    Background: Hydrogen atoms represent about half of the total number of atoms in proteins and are often involved in substrate recognition and catalysis. Unfortunately, X-ray protein crystallography at usual resolution fails to access directly their positioning, mainly because light atoms display weak contributions to diffraction. However, sub-ångström diffraction data, careful modeling and a proper refinement strategy can allow the positioning of a significant part of hydrogen atoms. Results: A comprehensive study on the X-ray structure of the diisopropyl-fluorophosphatase (DFPase) was performed, and the hydrogen atoms were modeled, including those of solvent molecules. This model was compared to the available neutron structure of DFPase, and differences in the protein and the active site solvation were noticed. Conclusions: A further examination of the DFPase X-ray structure provides substantial evidence about the presence of an activated water molecule that may constitute an interesting piece of information with regard to the enzymatic hydrolysis mechanism. PMID:23915572

  17. Optimal neural networks for protein-structure prediction

    International Nuclear Information System (INIS)

    Head-Gordon, T.; Stillinger, F.H.

    1993-01-01

    The successful application of neural-network algorithms for prediction of protein structure is stymied by three problem areas: the sparsity of the database of known protein structures, poorly devised network architectures which make the input-output mapping opaque, and a global optimization problem in the multiple-minima space of the network variables. We present a simplified polypeptide model residing in two dimensions with only two amino-acid types, A and B, which allows the determination of the global energy structure for all possible sequences of pentamer, hexamer, and heptamer lengths. This model simplicity allows us to compile a complete structural database and to devise neural networks that reproduce the tertiary structure of all sequences with absolute accuracy and with the smallest number of network variables. These optimal networks reveal that the three problem areas are convoluted, but that thoughtful network designs can actually deconvolute these detrimental traits to provide network algorithms that genuinely impact on the ability of the network to generalize or learn the desired mappings. Furthermore, the two-dimensional polypeptide model shows sufficient chemical complexity so that transfer of neural-network technology to more realistic three-dimensional proteins is evident

  18. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    Science.gov (United States)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
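
    The Q3 and Q8 figures quoted above are per-residue accuracies over three or eight secondary-structure classes. The sketch below shows how such a score is computed. It is an illustration rather than the DeepCNF evaluation code, and the eight-to-three state reduction used here (H, G, I to H; E, B to E; everything else to C) is one common convention, assumed for the example.

```python
# One common 8-state to 3-state reduction (an assumption for this sketch).
EIGHT_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three(states: str) -> str:
    """Map an 8-state DSSP-style string to helix (H), strand (E) or coil (C)."""
    return "".join(EIGHT_TO_THREE.get(s, "C") for s in states)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted label matches the observed one.
    With 3-state strings this is Q3; with 8-state strings it is Q8."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    observed = reduce_to_three("HHGEEBT-")
    predicted = "HHHEECCC"
    print(observed, per_residue_accuracy(predicted, observed))
```

    Since Q3 weights every residue equally, it is usually reported alongside a segment-based measure such as the SOV score mentioned in the abstract.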

  19. Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure.

    Science.gov (United States)

    Li, Tao; Li, Qian-Zhong

    2012-11-07

    RNA-protein interactions play important roles in various biological processes. The precise detection of RNA-protein interaction sites is very important for understanding essential biological processes and annotating the function of the proteins. In this study, based on various features from amino acid sequence and structure, including evolutionary information, solvent accessible surface area and torsion angles (φ, ψ) in the backbone structure of the polypeptide chain, a computational method for predicting RNA-binding sites in proteins is proposed. When the method is applied to predict RNA-binding sites in three datasets (RBP86 containing 86 protein chains, RBP107 containing 107 protein chains and RBP109 containing 109 protein chains), better sensitivities and specificities are obtained compared to previously published methods in five-fold cross-validation tests. To further examine the efficiency of our method, the RBP107 dataset is used as the training set, and the RBP86 and RBP109 datasets are used as independent test sets. In addition, as examples of our prediction, RNA-binding sites in a few proteins are presented. The annotated results are consistent with the PDB annotation. These results show that our method is useful for annotating RNA binding sites of novel proteins.
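
    The sensitivity and specificity figures referred to above are per-residue classification measures. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); this is an assumption, since some binding-site papers define specificity differently, and the abstract does not say which definition was used. The labels are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted: Sequence[int], actual: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary per-residue labels,
    where 1 marks an RNA-binding residue and 0 a non-binding residue."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative residue")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]  # hypothetical predictions
    actual = [1, 0, 0, 0, 1, 1]     # hypothetical annotations
    print(sensitivity_specificity(predicted, actual))
```

    Binding residues are typically a minority class, so reporting both numbers guards against a predictor that looks accurate only because it labels almost everything as non-binding.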

  20. ECONGAS - model structure

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This report documents a numerical simulation model of the natural gas market in Germany, France, the Netherlands and Belgium. It is a part of a project called "Internationalization and structural change in the gas market" aiming to enhance the understanding of the factors behind the current and upcoming changes in the European gas market, especially the downstream part of the gas chain. The model takes European border prices of gas as given, adds transmission and distribution cost and profit margins as well as gas taxes to calculate gas prices. The model includes demand sub-models for households, chemical industry, other industry, the commercial sector and electricity generation. Demand responses to price changes are assumed to take time, and the long run effects are significantly larger than the short run effects. For the household sector and the electricity sector, the dynamics are modeled by distinguishing between energy use in the old and new capital stock. In addition to prices and the activity level (GDP), the model includes the extension of the gas network as a potentially important variable in explaining the development of gas demand. The properties of numerical simulation models are often described by dynamic multipliers, which describe the behaviour of important variables when key explanatory variables are changed. At the end, the report shows the results of a model experiment where the costs in transmission and distribution were reduced. 6 refs., 9 figs., 1 tab.

  1. ECONGAS - model structure

    International Nuclear Information System (INIS)

    1997-01-01

    This report documents a numerical simulation model of the natural gas market in Germany, France, the Netherlands and Belgium. It is a part of a project called "Internationalization and structural change in the gas market" aiming to enhance the understanding of the factors behind the current and upcoming changes in the European gas market, especially the downstream part of the gas chain. The model takes European border prices of gas as given, adds transmission and distribution cost and profit margins as well as gas taxes to calculate gas prices. The model includes demand sub-models for households, chemical industry, other industry, the commercial sector and electricity generation. Demand responses to price changes are assumed to take time, and the long run effects are significantly larger than the short run effects. For the household sector and the electricity sector, the dynamics are modeled by distinguishing between energy use in the old and new capital stock. In addition to prices and the activity level (GDP), the model includes the extension of the gas network as a potentially important variable in explaining the development of gas demand. The properties of numerical simulation models are often described by dynamic multipliers, which describe the behaviour of important variables when key explanatory variables are changed. At the end, the report shows the results of a model experiment where the costs in transmission and distribution were reduced. 6 refs., 9 figs., 1 tab.

  2. Biophysical and structural considerations for protein sequence evolution

    Directory of Open Access Journals (Sweden)

    Grahnen Johan A

    2011-12-01

    Background: Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results: Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS [...] Conclusions: Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model.

  3. Protein crystal structure analysis using synchrotron radiation at atomic resolution

    International Nuclear Information System (INIS)

    Nonaka, Takamasa

    1999-01-01

    We can now obtain a detailed picture of a protein, allowing the identification of individual atoms, by interpreting the diffraction of X-rays from a protein crystal at atomic resolution, 1.2 Å or better. As of this writing, about 45 unique protein structures beyond 1.2 Å resolution have been deposited in the Protein Data Bank. This review provides a simplified overview of how protein crystallographers use such diffraction data to solve, refine, and validate protein structures. (author)

  4. What determines the structures of native folds of proteins?

    International Nuclear Information System (INIS)

    Trovato, Antonio; Hoang, Trinh X; Banavar, Jayanth R; Maritan, Amos; Seno, Flavio

    2005-01-01

    We review a simple physical model (Hoang et al 2004 Proc. Natl Acad. Sci. USA 101 7960, Banavar et al 2004 Phys. Rev. E at press) which captures the essential physico-chemical ingredients that determine protein structure, such as the inherent anisotropy of a chain molecule, the geometrical and energetic constraints placed by hydrogen bonds, sterics, and hydrophobicity. Within this framework, marginally compact conformations resembling the native state folds of proteins emerge as competing minima in the free energy landscape. Here we demonstrate that a hydrophobic-polar (HP) sequence composed of regularly repeated patterns has as its ground state a β-helical structure remarkably similar to a known architecture in the Protein Data Bank.

  5. GIS: a comprehensive source for protein structure similarities.

    Science.gov (United States)

    Guerler, Aysam; Knapp, Ernst-Walter

    2010-07-01

    A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains yielding about 55 × 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow users to browse the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.

  6. Structure and Pathology of Tau Protein in Alzheimer Disease

    Directory of Open Access Journals (Sweden)

    Michala Kolarova

    2012-01-01

    Alzheimer's disease (AD) is the most common type of dementia. With the global trend toward longer human life and the growing number of elderly people in the population, AD is becoming one of the most serious health and socioeconomic problems of the present. Tau protein promotes the assembly of microtubules and stabilizes them, which contributes to the proper function of neurons. Alterations in the amount or the structure of tau protein can affect its role as a stabilizer of microtubules as well as some of the processes in which it is implicated. The molecular mechanisms governing tau aggregation are mainly represented by several posttranslational modifications that alter its structure and conformational state. Hence, abnormal phosphorylation and truncation of tau protein have gained attention as key mechanisms that turn tau protein into a pathological entity. Evidence of the clinicopathological significance of phosphorylated and truncated tau has been documented during the progression of AD, as has their capacity to exert cytotoxicity when expressed in cell and animal models. This paper describes the normal structure and function of tau protein and its major alterations during its pathological aggregation in AD.

  7. Protein Loop Structure Prediction Using Conformational Space Annealing.

    Science.gov (United States)

    Heo, Seungryong; Lee, Juyong; Joo, Keehyoung; Shin, Hang-Cheol; Lee, Jooyoung

    2017-05-22

    We have developed a protein loop structure prediction method by combining a new energy function, which we call E_PLM (energy for protein loop modeling), with the conformational space annealing (CSA) global optimization algorithm. The energy function includes stereochemistry, dynamic fragment assembly, distance-scaled finite ideal gas reference (DFIRE), and generalized orientation- and distance-dependent terms. For the conformational search of loop structures, we used the CSA algorithm, which has been quite successful in dealing with various hard global optimization problems. We assessed the performance of E_PLM with two widely used loop-decoy sets, Jacobson and RAPPER, and compared the results against the DFIRE potential. The accuracy of model selection from a pool of loop decoys as well as de novo loop modeling starting from randomly generated structures was examined separately. For the selection of a nativelike structure from a decoy set, E_PLM was more accurate than DFIRE in the case of the Jacobson set and had similar accuracy in the case of the RAPPER set. In terms of sampling more nativelike loop structures, E_PLM outperformed E_DFIRE for both decoy sets. This new approach equipped with E_PLM and CSA can serve as the state-of-the-art de novo loop modeling method.

  8. Structure based alignment and clustering of proteins (STRALCP)

    Science.gov (United States)

    Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

    2013-06-18

    Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.

  9. An Integrated Framework Advancing Membrane Protein Modeling and Design.

    Directory of Open Access Journals (Sweden)

    Rebecca F Alford

    2015-09-01

    Membrane proteins are critical functional molecules in the human body, constituting more than 30% of open reading frames in the human genome. Unfortunately, a myriad of difficulties in overexpression and reconstitution into membrane mimetics severely limit our ability to determine their structures. Computational tools are therefore instrumental to membrane protein structure prediction, consequently increasing our understanding of membrane protein function and their role in disease. Here, we describe a general framework facilitating membrane protein modeling and design that combines the scientific principles for membrane protein modeling with the flexible software architecture of Rosetta3. This new framework, called RosettaMP, provides a general membrane representation that interfaces with scoring, conformational sampling, and mutation routines that can be easily combined to create new protocols. To demonstrate the capabilities of this implementation, we developed four proof-of-concept applications for (1) prediction of free energy changes upon mutation; (2) high-resolution structural refinement; (3) protein-protein docking; and (4) assembly of symmetric protein complexes, all in the membrane environment. Preliminary data show that these algorithms can produce meaningful scores and structures. The data also suggest needed improvements to both sampling routines and score functions. Importantly, the applications collectively demonstrate the potential of combining the flexible nature of RosettaMP with the power of Rosetta algorithms to facilitate membrane protein modeling and design.

  10. Course 12: Proteins: Structural, Thermodynamic and Kinetic Aspects

    Science.gov (United States)

    Finkelstein, A. V.

    1 Introduction 2 Overview of protein architectures and discussion of physical background of their natural selection 2.1 Protein structures 2.2 Physical selection of protein structures 3 Thermodynamic aspects of protein folding 3.1 Reversible denaturation of protein structures 3.2 What do denatured proteins look like? 3.3 Why denaturation of a globular protein is the first-order phase transition 3.4 "Gap" in energy spectrum: The main characteristic that distinguishes protein chains from random polymers 4 Kinetic aspects of protein folding 4.1 Protein folding in vivo 4.2 Protein folding in vitro (in the test-tube) 4.3 Theory of protein folding rates and solution of the Levinthal paradox

  11. Correlated mutations in protein sequences: Phylogenetic and structural effects

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.S. [Los Alamos National Lab., NM (United States). Theoretical Div.; Santa Fe Inst., NM (United States)]; Giraud, B.G. [C.E.N. Saclay, Gif/Yvette (France). Service Physique Theorique]; Liu, L.C. [Los Alamos National Lab., NM (United States). Theoretical Div.]; Stormo, G.D. [Univ. of Colorado, Boulder, CO (United States). Dept. of Molecular, Cellular and Developmental Biology]

    1998-12-01

    Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.
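
    As an illustration of the kind of naive covariation measure the abstract criticizes, the sketch below computes the mutual information between two alignment columns. Mutual information is an assumption here: the abstract does not name the measure the authors used, and the function name and toy alignment are hypothetical. The score applies neither of the two corrections the abstract describes (phylogenetic bias and chained, indirect covariation).

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. This is the
    uncorrected, naive measure: no phylogenetic null model, no treatment of
    indirect (chained) covariation.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    i_counts = Counter(seq[i] for seq in alignment)
    j_counts = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical): columns 0 and 1 covary, column 2 is independent.
toy = ["AKL", "AKV", "GEL", "GEV"]
print(column_mutual_information(toy, 0, 1))
print(column_mutual_information(toy, 0, 2))
```

    A high score from this function shows only that two columns vary together; as the abstract notes, that alone does not establish spatial proximity.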

  12. Attenuated Total Reflection Fourier Transform Infrared (ATR FT-IR) Spectroscopy as an Analytical Method to Investigate the Secondary Structure of a Model Protein Embedded in Solid Lipid Matrices.

    Science.gov (United States)

    Zeeshan, Farrukh; Tabbassum, Misbah; Jorgensen, Lene; Medlicott, Natalie J

    2018-02-01

    Protein drugs may encounter conformational perturbations during the formulation processing of lipid-based solid dosage forms. In aqueous protein solutions, attenuated total reflection Fourier transform infrared (ATR FT-IR) spectroscopy can investigate these conformational changes following the subtraction of spectral interference of solvent with protein amide I bands. However, in solid dosage forms, the possible spectral contribution of lipid carriers to the protein amide I band may be an obstacle to determining conformational alterations. The objective of this study was to develop an ATR FT-IR spectroscopic method for the analysis of the secondary structure of proteins embedded in solid lipid matrices. Bovine serum albumin (BSA) was chosen as a model protein, while Precirol ATO5 (glycerol palmitostearate, melting point 58 ℃) was employed as the model lipid matrix. Bovine serum albumin was incorporated into lipid using physical mixing, melting and mixing, or wet granulation mixing methods. Attenuated total reflection FT-IR spectroscopy and size exclusion chromatography (SEC) were performed for the analysis of BSA secondary structure and its dissolution in aqueous media, respectively. The results showed significant interference of Precirol ATO5 with the BSA amide I band, which was subtracted up to 90% w/w lipid content to analyze BSA secondary structure. In addition, ATR FT-IR spectroscopy also detected thermally denatured BSA solid alone and in the presence of lipid matrix, indicating its suitability for the detection of denatured protein solids in lipid matrices. Despite being in the solid state, conformational changes occurred to BSA upon incorporation into solid lipid matrices. However, the extent of these conformational alterations was found to be dependent on the mixing method employed, as indicated by area overlap calculations. For instance, the melting and mixing method imparted negligible effect on BSA secondary structure, whereas the wet granulation mixing method promoted

  13. 3DProIN: Protein-Protein Interaction Networks and Structure Visualization.

    Science.gov (United States)

    Li, Hui; Liu, Chunmei

    2014-06-14

    3DProIN is a computational tool to visualize protein-protein interaction networks in both two dimensional (2D) and three dimensional (3D) views. It models protein-protein interactions in a graph and explores the biologically relevant features of the tertiary structures of each protein in the network. Properties such as color, shape and name of each node (protein) of the network can be edited in either 2D or 3D views. 3DProIN is implemented using 3D Java and C programming languages. An internet crawl technique is also used to parse dynamically retrieved protein interactions from the Protein Data Bank (PDB). It is a Java applet component that is embedded in the web page and it can be used on different platforms including Linux, Mac and Windows using web browsers such as Firefox, Internet Explorer, Chrome and Safari. It was also converted into a Mac app and submitted to the App Store as a free app. Mac users can also download the app from our website. 3DProIN is available for academic research at http://bicompute.appspot.com.

  14. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Aboul-Magd Mohammed O

    2009-07-01

    Background: Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results: Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input

  15. Structural system identification: Structural dynamics model validation

    Energy Technology Data Exchange (ETDEWEB)

    Red-Horse, J.R.

    1997-04-01

    Structural system identification is concerned with the development of systematic procedures and tools for developing predictive analytical models based on a physical structure's dynamic response characteristics. It is a multidisciplinary process that involves the ability (1) to define high fidelity physics-based analysis models, (2) to acquire accurate test-derived information for physical specimens using diagnostic experiments, (3) to validate the numerical simulation model by reconciling differences that inevitably exist between the analysis model and the experimental data, and (4) to quantify uncertainties in the final system models and subsequent numerical simulations. The goal of this project was to develop structural system identification techniques and software suitable for both research and production applications in code and model validation.

  16. Solution structure of the human signaling protein RACK1

    Directory of Open Access Journals (Sweden)

    Papa Priscila F

    2010-06-01

    Background: The adaptor protein RACK1 (receptor of activated kinase 1) was originally identified as an anchoring protein for protein kinase C. RACK1 is a 36 kDa protein, and is composed of seven WD repeats which mediate its protein-protein interactions. RACK1 is ubiquitously expressed and has been implicated in diverse cellular processes, including protein translation regulation, neuropathological processes, cellular stress, and tissue development. Results: In this study we performed a biophysical analysis of human RACK1 with the aim of obtaining low resolution structural information. Small angle X-ray scattering (SAXS) experiments demonstrated that human RACK1 is globular and monomeric in solution and that its low resolution structure is strikingly similar to that of a homology model previously calculated by us and to the crystallographic structure of RACK1 isoform A from Arabidopsis thaliana. Both sedimentation velocity and sedimentation equilibrium analytical ultracentrifugation techniques showed that RACK1 is predominantly a monomer of around 37 kDa in solution, but also presents small amounts of oligomeric species. Moreover, hydrodynamic data suggested that RACK1 has a slightly asymmetric shape. The interaction of RACK1 and Ki-1/57 was tested by sedimentation equilibrium. The results suggested that the association between RACK1 and Ki-1/57(122-413) follows a stoichiometry of 1:1. The binding constant (KB) observed for the RACK1-Ki-1/57(122-413) interaction was around (1.5 ± 0.2) × 10^6 M^-1, corresponding to a dissociation constant (KD) of (0.7 ± 0.1) × 10^-6 M. Moreover, the fluorescence data also suggest that the interaction may occur in a cooperative fashion. Conclusion: Our SAXS and analytical ultracentrifugation experiments indicated that RACK1 is predominantly a monomer in solution. RACK1 and Ki-1/57(122-413) interact strongly under the tested conditions.
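
    The two constants quoted above are consistent with the usual reciprocal relation KD = 1/KB for a 1:1 association. The sketch below reproduces the conversion; the first-order error propagation is an assumption about how the uncertainty might have been carried over, not a statement of the authors' procedure, and the function name is hypothetical.

```python
def dissociation_constant(kb, kb_err):
    """Convert a binding constant KB (M^-1) into KD (M) for 1:1 binding.

    Assumes KD = 1/KB and first-order error propagation, so the relative
    uncertainty of KD equals the relative uncertainty of KB.
    """
    kd = 1.0 / kb
    kd_err = kd * (kb_err / kb)
    return kd, kd_err


# Values from the abstract: KB = (1.5 +/- 0.2) x 10^6 M^-1
kd, kd_err = dissociation_constant(1.5e6, 0.2e6)
print(f"KD = ({kd * 1e6:.1f} +/- {kd_err * 1e6:.1f}) x 10^-6 M")
```

    Rounded to one decimal place, this gives (0.7 ± 0.1) × 10^-6 M, matching the reported KD.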

  17. Structural determination of intact proteins using mass spectrometry

    Science.gov (United States)

    Kruppa, Gary [San Francisco, CA; Schoeniger, Joseph S [Oakland, CA; Young, Malin M [Livermore, CA

    2008-05-06

    The present invention relates to novel methods of determining the sequence and structure of proteins. Specifically, the present invention allows for the analysis of intact proteins within a mass spectrometer. Therefore, preparatory separations need not be performed prior to introducing a protein sample into the mass spectrometer. Also disclosed herein are new instrumental developments for enhancing the signal from the desired modified proteins, methods for producing controlled protein fragments in the mass spectrometer, eliminating complex microseparations, and protein preparatory chemical steps necessary for cross-linking based protein structure determination. Additionally, the preferred method of the present invention involves the determination of protein structures utilizing a top-down analysis of protein structures to search for covalent modifications. In the preferred method, intact proteins are ionized and fragmented within the mass spectrometer.

  18. I-TASSER server for protein 3D structure prediction

    Directory of Open Access Journals (Sweden)

    Zhang Yang

    2008-01-01

    Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state-of-the-art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. Results: An on-line version of I-TASSER is developed at the KU Center for Bioinformatics which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models, with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations. The I-TASSER server is freely available
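
    For readers unfamiliar with the TM-score mentioned above, the sketch below evaluates the standard TM-score sum for a fixed superposition, using the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8. It is a simplified illustration, not I-TASSER code: the real score is maximized over superpositions, which is omitted here, and the 0.5 Å floor on d0 for very short chains is an assumption made to keep the formula well defined. The function names are hypothetical.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Angstrom) of the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor; the formula is undefined or negative here
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score sum for one given superposition (no optimization performed).

    `distances` holds the distance in Angstrom between each aligned residue
    pair; unaligned residues contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# A perfect model of a 100-residue target scores 1.0.
print(tm_score_fixed_superposition([0.0] * 100, 100))
```

    Because the sum is divided by the target length, the score stays in [0, 1] and is less sensitive to protein size than RMSD.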

  19. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
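
    The reported F-measures can be checked from the precision and recall values, assuming the balanced F-measure (the harmonic mean of precision and recall). The abstract does not say which variant was used, so that choice is an assumption.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall values quoted in the abstract.
sequence_only = f_measure(0.84, 0.26)
with_structure = f_measure(0.76, 0.38)
print(f"sequence only:  {sequence_only:.1%}")
print(f"with structure: {with_structure:.1%}")
```

    Both values round to the percentages stated in the abstract (40% and 51%). Adding structural information lowers precision but raises recall enough to improve the combined score.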

  20. Modulating nanoparticle superlattice structure using proteins with tunable bond distributions

    International Nuclear Information System (INIS)

    McMillan, Janet R.; Brodin, Jeffrey D.; Millan, Jaime A.; Lee, Byeongdu; Olvera de la Cruz, Monica; Mirkin, Chad A.

    2017-01-01

    Here, we investigate the use of proteins with tunable DNA modification distributions to modulate nanoparticle superlattice structure. Using Beta-galactosidase (βgal) as a model system, we have employed the orthogonal chemical reactivities of surface amines and thiols to synthesize protein-DNA conjugates with 36 evenly distributed or 8 specifically positioned oligonucleotides. When assembled into crystalline superlattices with AuNPs, we find that the distribution of DNA modifications modulates the favored structure: βgal with uniformly distributed DNA bonding elements results in body-centered cubic crystals, whereas DNA functionalization of cysteines results in AB2 packing. We probe the role of protein oligonucleotide number and conjugate size in this observation, which reveals the importance of oligonucleotide distribution and number in the observed assembly behavior. These results indicate that proteins with defined DNA-modification patterns are powerful tools to control nanoparticle superlattice architecture, and establish the importance of oligonucleotide distribution in the assembly behavior of protein-DNA conjugates.

  1. Protein structure similarity from principle component correlation analysis

    Directory of Open Access Journals (Sweden)

    Chou James

    2006-01-01

    Background: Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the gold standard for measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results: We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion: The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum

  2. Structure and Sequence Search on Aptamer-Protein Docking

    Science.gov (United States)

    Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie

    2015-03-01

    Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in living systems, especially through gene regulation. In addition, short nucleic acid sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer binding, we use GPU-enabled molecular dynamics simulations both to examine the conformational flexibility of the aptamer in the absence of binding to thrombin and to determine our ability to fold an aptamer. This study should help further de novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.

  3. Defining an essence of structure determining residue contacts in proteins.

    Science.gov (United States)

    Sathyapriya, R; Duarte, Jose M; Stehr, Henning; Filippis, Ioannis; Lappe, Michael

    2009-12-01

    The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this "structural essence" has remained elusive so far: no algorithmic strategy has been devised to date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Cα RMSD). Identifying the number and nature of essential native contacts is not only of theoretical interest (i.e., for design of advanced statistical potentials): such a subset of spatial constraints is also very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test-set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, series of random subsets (ranging from 10% to 90% of native contacts) are generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed "cone-peeling", that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Cα RMSD with as little as 8% of the native contacts (Cα-Cα and Cβ-Cβ). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy. This "structural essence" opens new avenues in the

  4. Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive.

    Science.gov (United States)

    Burley, Stephen K; Berman, Helen M; Kleywegt, Gerard J; Markley, John L; Nakamura, Haruki; Velankar, Sameer

    2017-01-01

    The Protein Data Bank (PDB), the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes, was established in 1971, becoming the first open-access digital resource in the biological sciences. The PDB archive currently houses ~130,000 entries (May 2017). It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system that enforces community-agreed data standards and supports data Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually (deposit.wwpdb.org). The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and 3D-atomic level structural models derived from three well-established methods: crystallography, nuclear magnetic resonance spectroscopy (NMR), and electron microscopy (3DEM). wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer or FRET, etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.

  5. Nonlinear deterministic structures and the randomness of protein sequences

    CERN Document Server

    Huang Yan Zhao

    2003-01-01

    To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class, using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

  6. The structure of a cholesterol-trapping protein

    Science.gov (United States)

    Contact: Dan Krotz, dakrotz@lbl.gov. Institute researchers determined the three-dimensional structure of a protein that controls cholesterol level in the bloodstream. Knowing the structure of the protein, a cellular receptor that ensnares

  7. STRUCTURAL FEATURES OF PLANT CHITINASES AND CHITIN-BINDING PROTEINS

    NARCIS (Netherlands)

    BEINTEMA, JJ

    1994-01-01

    Structural features of plant chitinases and chitin-binding proteins are discussed. Many of these proteins consist of multiple domains, of which the chitin-binding hevein domain is a predominant one. X-ray and NMR structures of representatives of the major classes of these proteins are available now,

  8. Molecular modeling of protein materials: case study of elastin

    International Nuclear Information System (INIS)

    Tarakanova, Anna; Buehler, Markus J

    2013-01-01

    Molecular modeling of protein materials is a quickly growing area of research that has produced numerous contributions in fields ranging from structural engineering to medicine and biology. We review here the history and methods commonly employed in molecular modeling of protein materials, emphasizing the advantages for using modeling as a complement to experimental work. We then consider a case study of the protein elastin, a critically important ‘mechanical protein’ to exemplify the approach in an area where molecular modeling has made a significant impact. We outline the progression of computational modeling studies that have considerably enhanced our understanding of this important protein which endows elasticity and recoil to the tissues it is found in, including the skin, lungs, arteries and the heart. A vast collection of literature has been directed at studying the structure and function of this protein for over half a century, the first molecular dynamics study of elastin being reported in the 1980s. We review the pivotal computational works that have considerably enhanced our fundamental understanding of elastin's atomistic structure and its extraordinary qualities—focusing on two in particular: elastin's superb elasticity and the inverse temperature transition—the remarkable ability of elastin to take on a more structured conformation at higher temperatures, suggesting its effectiveness as a biomolecular switch. Our hope is to showcase these methods as both complementary and enriching to experimental approaches that have thus far dominated the study of most protein-based materials. (topical review)

  9. Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2008-07-01

    Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures in which each of them contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.

  10. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    Science.gov (United States)

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found at interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.

  11. Validation of Structures in the Protein Data Bank.

    Science.gov (United States)

    Gore, Swanand; Sanz García, Eduardo; Hendrickx, Pieter M S; Gutmanas, Aleksandras; Westbrook, John D; Yang, Huanwang; Feng, Zukang; Baskaran, Kumaran; Berrisford, John M; Hudson, Brian P; Ikegawa, Yasuyo; Kobayashi, Naohiro; Lawson, Catherine L; Mading, Steve; Mak, Lora; Mukhopadhyay, Abhik; Oldfield, Thomas J; Patwardhan, Ardan; Peisach, Ezra; Sahni, Gaurav; Sekharan, Monica R; Sen, Sanchayita; Shao, Chenghua; Smart, Oliver S; Ulrich, Eldon L; Yamashita, Reiko; Quesada, Martha; Young, Jasmine Y; Nakamura, Haruki; Markley, John L; Berman, Helen M; Burley, Stephen K; Velankar, Sameer; Kleywegt, Gerard J

    2017-12-05

    The Worldwide PDB recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide useful metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  12. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  13. A modeling strategy for G-protein coupled receptors

    Directory of Open Access Journals (Sweden)

    Anna Kahler

    2016-03-01

    Cell responses can be triggered via G-protein coupled receptors (GPCRs) that interact with small molecules, peptides or proteins and transmit the signal over the membrane via structural changes to activate intracellular pathways. GPCRs are characterized by a rather low sequence similarity and exhibit structural differences even for functionally closely related GPCRs. An accurate structure prediction for GPCRs is therefore not straightforward. We propose a computational approach that relies on the generation of several independent models based on different template structures, which are subsequently refined by molecular dynamics simulations. A comparison of their conformational stability and the agreement with GPCR-typical structural features is then used to select a favorable model. This strategy was applied to predict the structure of the herpesviral chemokine receptor US28 by generating three independent models based on the known structures of the chemokine receptors CXCR1, CXCR4, and CCR5. Model refinement and evaluation suggested that the model based on CCR5 exhibits the most favorable structural properties. In particular, the GPCR-typical structural features, such as a conserved water cluster or conserved non-covalent contacts, are present to a larger extent in the model based on CCR5 compared to the other models. A final model validation based on the recently published US28 crystal structure confirms that the CCR5-based model is the most accurate and exhibits 80.8% correctly modeled residues within the transmembrane helices. The structural agreement between the selected model and the crystal structure suggests that our modeling strategy may also be more generally applicable to other GPCRs of unknown structure.

  14. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) are very successful, but have recently suffered from slow updating because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest splitting some large superfamilies into smaller clusters. © 2016 Elsevier B.V.
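
    The idea of linking each item to a nearby item of higher density can be illustrated with a small sketch. This is a generic density-peak illustration and not the published K-nearest uphill algorithm: the density definition (inverse mean distance to the K nearest neighbours), the function name `knn_uphill_clusters`, and the toy one-dimensional data are all assumptions made for this example.

```python
def knn_uphill_clusters(dist, k):
    """Toy density-peak clustering on a symmetric distance matrix (list of lists).

    Each point is linked to the nearest of its k neighbours that has a strictly
    higher density (its "uphill" point). Points with no uphill neighbour are
    treated as density peaks, i.e. cluster centres.
    """
    n = len(dist)
    # k nearest neighbours of every point, closest first, excluding the point itself.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Assumed density: inverse of the mean distance to the k nearest neighbours.
    eps = 1e-12  # avoids division by zero for duplicate points
    density = [
        1.0 / (sum(dist[i][j] for j in neighbours[i]) / k + eps) for i in range(n)
    ]
    # Uphill link: nearest neighbour with strictly higher density, or None for a peak.
    parent = []
    for i in range(n):
        uphill = [j for j in neighbours[i] if density[j] > density[i]]
        parent.append(uphill[0] if uphill else None)

    def peak_of(i):
        # Density strictly increases along uphill links, so this loop terminates.
        while parent[i] is not None:
            i = parent[i]
        return i

    peaks = [peak_of(i) for i in range(n)]
    label_of_peak = {p: c for c, p in enumerate(sorted(set(peaks)))}
    return [label_of_peak[p] for p in peaks]


# Hypothetical data: two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
labels = knn_uphill_clusters(dist, k=2)
print(labels)
```

    In this toy case the middle point of each group has the highest density and becomes the peak, so the two groups receive different labels. Real structure-similarity data would need a distance derived from a structure comparison score, which is outside the scope of this sketch.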

  15. DeepQA: Improving the estimation of single protein model quality with deep belief networks

    OpenAIRE

    Cao, Renzhi; Bhattacharya, Debswapna; Hou, Jie; Cheng, Jianlin

    2016-01-01

    Background Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model pool consisting of mostly low-quality models, is still a largely unsolved problem. Results We introduce a novel single-model quality assessment method DeepQA based on deep belie...

  16. Modelling Protein Dynamics on the Microsecond Time Scale

    DEFF Research Database (Denmark)

    Siuda, Iwona Anna

    Recent years have shown an increase in coarse-grained (CG) molecular dynamics simulations, providing structural and dynamic details of large proteins and enabling studies of self-assembly of biological materials. It is not easy to acquire such data experimentally, and access is also still limited...... in atomistic simulations. During her PhD studies, Iwona Siuda used MARTINI CG models to study the dynamics of different globular and membrane proteins. In several cases, the MARTINI model was sufficient to study conformational changes of small, purely alpha-helical proteins. However, in studies of larger......ELNEDIN was therefore proposed as part of the work. Iwona Siuda’s results from the CG simulations had biological implications that provide insights into possible mechanisms of the periplasmic leucine-binding protein, the sarco(endo)plasmic reticulum calcium pump, and several proteins from the saposin-like proteins...

  17. Validation of protein models by a neural network approach

    Directory of Open Access Journals (Sweden)

    Fantucci Piercarlo

    2008-01-01

    Background: The development and improvement of reliable computational methods designed to evaluate the quality of protein models is relevant in the context of protein structure refinement, which has been recently identified as one of the bottlenecks limiting the quality and usefulness of protein structure prediction. Results: In this contribution, we present a computational method (Artificial Intelligence Decoys Evaluator: AIDE) which is able to consistently discriminate between correct and incorrect protein models. In particular, the method is based on neural networks that use as input 15 structural parameters, which include energy, solvent accessible surface, hydrophobic contacts and secondary structure content. The results obtained with AIDE on a set of decoy structures were evaluated using statistical indicators such as Pearson correlation coefficients, Znat, fraction enrichment, as well as ROC plots. It turned out that AIDE performances are comparable and often complementary to available state-of-the-art learning-based methods. Conclusion: In light of the results obtained with AIDE, as well as its comparison with available learning-based methods, it can be concluded that AIDE can be successfully used to evaluate the quality of protein structures. The use of AIDE in combination with other evaluation tools is expected to further enhance protein refinement efforts.

  18. Use of designed sequences in protein structure recognition.

    Science.gov (United States)

    Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran

    2018-05-09

    Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use the computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold association for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.

  19. Energetically Unfavorable Amide Conformations for N6-Acetyllysine Side Chains in Refined Protein Structures

    Science.gov (United States)

    Genshaft, Alexander; Moser, Joe-Ann S.; D'Antonio, Edward L.; Bowman, Christine M.; Christianson, David W.

    2013-01-01

    The reversible acetylation of lysine to form N6-acetyllysine in the regulation of protein function is a hallmark of epigenetics. Acetylation of the positively charged amino group of the lysine side chain generates a neutral N-alkylacetamide moiety that serves as a molecular “switch” for the modulation of protein function and protein-protein interactions. We now report the analysis of 381 N6-acetyllysine side chain amide conformations as found in 79 protein crystal structures and 11 protein NMR structures deposited in the Protein Data Bank (PDB) of the Research Collaboratory for Structural Bioinformatics. We find that only 74.3% of N6-acetyllysine residues in protein crystal structures and 46.5% in protein NMR structures contain amide groups with energetically preferred trans or generously trans conformations. Surprisingly, 17.6% of N6-acetyllysine residues in protein crystal structures and 5.3% in protein NMR structures contain amide groups with energetically unfavorable cis or generously cis conformations. Even more surprisingly, 8.1% of N6-acetyllysine residues in protein crystal structures and 48.2% in NMR structures contain amide groups with energetically prohibitive twisted conformations that approach the transition state structure for cis-trans isomerization. In contrast, 109 unique N-alkylacetamide groups contained in 84 highly-accurate small molecule crystal structures retrieved from the Cambridge Structural Database exclusively adopt energetically preferred trans conformations. Therefore, we conclude that cis and twisted N6-acetyllysine amides in protein structures deposited in the PDB are erroneously modeled due to their energetically unfavorable or prohibitive conformations. PMID:23401043

  20. Roles of beta-turns in protein folding: from peptide models to protein engineering.

    Science.gov (United States)

    Marcelino, Anna Marie C; Gierasch, Lila M

    2008-05-01

    Reverse turns are a major class of protein secondary structure; they represent sites of chain reversal and thus sites where the globular character of a protein is created. It has been speculated for many years that turns may nucleate the formation of structure in protein folding, as their propensity to occur will favor the approximation of their flanking regions and their general tendency to be hydrophilic will favor their disposition at the solvent-accessible surface. Reverse turns are local features, and it is therefore not surprising that their structural properties have been extensively studied using peptide models. In this article, we review research on peptide models of turns to test the hypothesis that the propensities of turns to form in short peptides will relate to the roles of corresponding sequences in protein folding. Turns with significant stability as isolated entities should actively promote the folding of a protein, and by contrast, turn sequences that merely allow the chain to adopt conformations required for chain reversal are predicted to be passive in the folding mechanism. We discuss results of protein engineering studies of the roles of turn residues in folding mechanisms. Factors that correlate with the importance of turns in folding indeed include their intrinsic stability, as well as their topological context and their participation in hydrophobic networks within the protein's structure.

  1. An Efficient Null Model for Conformational Fluctuations in Proteins

    DEFF Research Database (Denmark)

    Harder, Tim Philipp; Borg, Mikael; Bottaro, Sandro

    2012-01-01

    Protein dynamics play a crucial role in function, catalytic activity, and pathogenesis. Consequently, there is great interest in computational methods that probe the conformational fluctuations of a protein. However, molecular dynamics simulations are computationally costly and therefore are often...... limited to comparatively short timescales. TYPHON is a probabilistic method to explore the conformational space of proteins under the guidance of a sophisticated probabilistic model of local structure and a given set of restraints that represent nonlocal interactions, such as hydrogen bonds or disulfide...... on conformational fluctuations that is in correspondence with experimental measurements. TYPHON provides a flexible, yet computationally efficient, method to explore possible conformational fluctuations in proteins....

  2. Using an alignment of fragment strings for comparing protein structures

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Kolodny, Rachel

    2007-01-01

    RESULTS: Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained...

  3. Rheology and structure of milk protein gels

    NARCIS (Netherlands)

    Vliet, van T.; Lakemond, C.M.M.; Visschers, R.W.

    2004-01-01

    Recent studies on gel formation and rheology of milk gels are reviewed. A distinction is made between gels formed by aggregated casein, gels of 'pure' whey proteins and gels in which both casein and whey proteins contribute to their properties. For casein-whey protein mixtures, it has been shown

  4. PRODUCT STRUCTURE DIGITAL MODEL

    Directory of Open Access Journals (Sweden)

    V.M. Sineglazov

    2005-02-01

    Research results of representation of product structure made by means of the CADDS5 computer-aided design (CAD) system, the Product Data Management (PDM) system Optegra and the Product Life Cycle Management (PLM) system Windchill are examined in this work. Analysis of structure component development and its storage in various systems is carried out. Algorithms of structure transformation required for correct representation of the structure are considered. Management analysis of electronic mockup presentation of the product structure is carried out for the Windchill system.

  5. ORION: a web server for protein fold recognition and structure prediction using evolutionary hybrid profiles.

    Science.gov (United States)

    Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G; Gelly, Jean-Christophe

    2016-06-20

    Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation -with Protein Blocks-, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the 'Hard' category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/.

  6. Structure-Energy Relationships of Halogen Bonds in Proteins.

    Science.gov (United States)

    Scholfield, Matthew R; Ford, Melissa Coates; Carlsson, Anna-Carin C; Butta, Hawera; Mehl, Ryan A; Ho, P Shing

    2017-06-06

    The structures and stabilities of proteins are defined by a series of weak noncovalent electrostatic, van der Waals, and hydrogen bond (HB) interactions. In this study, we have designed and engineered halogen bonds (XBs) site-specifically to study their structure-energy relationship in a model protein, T4 lysozyme. The evidence for XBs is the displacement of the aromatic side chain toward an oxygen acceptor, at distances that are equal to or less than the sums of their respective van der Waals radii, when the hydroxyl substituent of the wild-type tyrosine is replaced by a halogen. In addition, thermal melting studies show that the iodine XB rescues the stabilization energy from an otherwise destabilizing substitution (at an equivalent noninteracting site), indicating that the interaction is also present in solution. Quantum chemical calculations show that the XB complements an HB at this site and that solvent structure must also be considered in trying to design molecular interactions such as XBs into biological systems. A bromine substitution also shows displacement of the side chain, but the distances and geometries do not indicate formation of an XB. Thus, we have dissected the contributions from various noncovalent interactions of halogens introduced into proteins, to drive the application of XBs, particularly in biomolecular design.
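
    The distance criterion mentioned in this abstract (halogen-to-oxygen distance at or below the sum of the van der Waals radii) can be sketched in a few lines. The radii below are the commonly cited Bondi values and are an assumption here, as are the function name and the coordinates; the abstract also notes that geometry matters, which this distance-only check does not capture.

```python
import math

# Bondi van der Waals radii in angstroms (assumed; other radius sets exist).
VDW_RADII = {"O": 1.52, "Cl": 1.75, "Br": 1.85, "I": 1.98}


def within_vdw_contact(halogen, acceptor, halogen_xyz, acceptor_xyz):
    """Return (is_contact, distance, limit) for a halogen/acceptor atom pair.

    is_contact is True when the interatomic distance does not exceed the sum
    of the two van der Waals radii. Angles are deliberately ignored.
    """
    distance = math.dist(halogen_xyz, acceptor_xyz)
    limit = VDW_RADII[halogen] + VDW_RADII[acceptor]
    return distance <= limit, distance, limit


# Hypothetical coordinates in angstroms, not taken from any deposited structure.
print(within_vdw_contact("I", "O", (0.0, 0.0, 0.0), (3.2, 0.0, 0.0)))
print(within_vdw_contact("Br", "O", (0.0, 0.0, 0.0), (3.6, 0.0, 0.0)))
```

    A fuller check would also test the C-X...O angle, since a short distance alone does not establish a halogen bond.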

  7. Protein 8-class secondary structure prediction using conditional neural fields.

    Science.gov (United States)

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
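
    For readers unfamiliar with the metric, Q8 accuracy is usually taken to be the fraction of residues whose predicted 8-state label equals the observed one. The sketch below assumes that definition and uses hypothetical DSSP-style label strings; it is not code from the paper.

```python
def q8_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted 8-state label matches the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 10-residue example using DSSP-style letters (H, G, E, T, S, ...).
observed = "HHHHTTEEEE"
predicted = "HHHGTTEEES"
print(q8_accuracy(predicted, observed))  # 8 of 10 residues agree
```

    The same function gives the 3-class Q3 score if both strings are first reduced to three states.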

  8. Protein Secondary Structures (α-helix and β-sheet) at a Cellular Level and Protein Fractions in Relation to Rumen Degradation Behaviours of Protein: A New Approach

    International Nuclear Information System (INIS)

    Yu, P.

    2007-01-01

    Studying the secondary structure of proteins leads to an understanding of the components that make up a whole protein, and such an understanding of the structure of the whole protein is often vital to understanding its digestive behaviour and nutritive value in animals. The main protein secondary structures are the α-helix and β-sheet. The percentage of these two structures in protein secondary structures influences protein nutritive value, quality and digestive behaviour. A high percentage of β-sheet structure may partly cause a low access to gastrointestinal digestive enzymes, which results in a low protein value. The objectives of the present study were to use advanced synchrotron-based Fourier transform IR (S-FTIR) microspectroscopy as a new approach to reveal the molecular chemistry of the protein secondary structures of feed tissues affected by heat-processing within intact tissue at a cellular level, and to quantify protein secondary structures using multicomponent peak modelling Gaussian and Lorentzian methods, in relation to protein digestive behaviours and nutritive value in the rumen, which was determined using the Cornell Net Carbohydrate Protein System. The synchrotron-based molecular chemistry research experiment was performed at the National Synchrotron Light Source at Brookhaven National Laboratory, US Department of Energy. The results showed that, with S-FTIR microspectroscopy, the molecular chemistry, ultrastructural chemical make-up and nutritive characteristics could be revealed at a high ultraspatial resolution (∼10 μm). S-FTIR microspectroscopy revealed that the secondary structure of protein differed between raw and roasted golden flaxseeds in terms of the percentages and ratio of α-helixes and β-sheets in the mid-IR range at the cellular level. By using multicomponent peak modelling, the results show that the roasting reduced (P <0.05) the percentage of α-helixes (from 47.1% to 36.1%: S-FTIR absorption intensity), increased the

  9. Protein Secondary Structures (alpha-helix and beta-sheet) at a Cellular Level and Protein Fractions in Relation to Rumen Degradation Behaviours of Protein: A New Approach

    Energy Technology Data Exchange (ETDEWEB)

    Yu,P.

    2007-01-01

    Studying the secondary structure of proteins leads to an understanding of the components that make up a whole protein, and such an understanding of the structure of the whole protein is often vital to understanding its digestive behaviour and nutritive value in animals. The main protein secondary structures are the α-helix and β-sheet. The percentage of these two structures in protein secondary structures influences protein nutritive value, quality and digestive behaviour. A high percentage of β-sheet structure may partly cause a low access to gastrointestinal digestive enzymes, which results in a low protein value. The objectives of the present study were to use advanced synchrotron-based Fourier transform IR (S-FTIR) microspectroscopy as a new approach to reveal the molecular chemistry of the protein secondary structures of feed tissues affected by heat-processing within intact tissue at a cellular level, and to quantify protein secondary structures using multicomponent peak modelling Gaussian and Lorentzian methods, in relation to protein digestive behaviours and nutritive value in the rumen, which was determined using the Cornell Net Carbohydrate Protein System. The synchrotron-based molecular chemistry research experiment was performed at the National Synchrotron Light Source at Brookhaven National Laboratory, US Department of Energy. The results showed that, with S-FTIR microspectroscopy, the molecular chemistry, ultrastructural chemical make-up and nutritive characteristics could be revealed at a high ultraspatial resolution (∼10 μm). S-FTIR microspectroscopy revealed that the secondary structure of protein differed between raw and roasted golden flaxseeds in terms of the percentages and ratio of α-helixes and β-sheets in the mid-IR range at the cellular level. By using multicomponent peak modelling, the results show that the roasting reduced (P <0.05) the percentage of α-helixes (from 47.1% to 36.1%: S

  10. Structure of the DNA-bound BRCA1 C-terminal region from human replication factor C p140 and model of the protein-DNA complex

    NARCIS (Netherlands)

    Kobayashi, M.; AB, E.; Bonvin, A.M.J.J.; Siegal, G.

    2010-01-01

    BRCA1 C-terminal domain (BRCT)-containing proteins are found widely throughout the animal and bacteria kingdoms where they are exclusively involved in cell cycle regulation and DNA metabolism. Whereas most BRCT domains are involved in protein-protein interactions, a small subset has bona fide DNA

  11. Automatic structure classification of small proteins using random forest

    Directory of Open Access Journals (Sweden)

    Hirst Jonathan D

    2010-07-01

    Abstract Background Random forest, an ensemble-based supervised machine learning algorithm, is used to predict the SCOP structural classification for a target structure, based on the similarity of its structural descriptors to those of a template structure with an equal number of secondary structure elements (SSEs). An initial assessment of random forest is carried out for domains consisting of three SSEs. The usability of random forest in classifying larger domains is demonstrated by applying it to domains consisting of four, five and six SSEs. Results Random forest, trained on SCOP version 1.69, achieves a predictive accuracy of up to 94% on an independent and non-overlapping test set derived from SCOP version 1.73. For classification to the SCOP Class, Fold, Super-family or Family levels, the predictive quality of the model in terms of Matthews correlation coefficient (MCC) ranged from 0.61 to 0.83. As the number of constituent SSEs increases, the MCC for classification to different structural levels decreases. Conclusions The utility of random forest in classifying domains from the place-holder classes of SCOP to the true Class, Fold, Super-family or Family levels is demonstrated. Issues such as introduction of a new structural level in SCOP and the merger of singleton levels can also be addressed using random forest. A real-world scenario is mimicked by predicting the classification for those protein structures from the PDB, which are yet to be assigned to the SCOP classification hierarchy.
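
    The MCC quoted in this abstract can be illustrated with a minimal sketch for the binary case. The record does not say how the authors computed MCC across the multi-level SCOP labels, so the one-class-versus-rest framing, the function name and the counts below are assumptions made only for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives for one class versus the rest.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is set to 0 when a row or column of the
        # confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(round(matthews_corrcoef(tp=45, tn=40, fp=10, fn=5), 3))
```

    MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why it is often preferred to plain accuracy when class sizes are unbalanced.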

  12. PDB2CD visualises dynamics within protein structures.

    Science.gov (United States)

    Janes, Robert W

    2017-10-01

    Proteins tend to have defined conformations, a key factor in enabling their function. Atomic resolution structures of proteins are predominantly obtained by either solution nuclear magnetic resonance (NMR) or crystal structure methods. However, when considering a protein whose structure has been determined by both these approaches, on many occasions, the resultant conformations are subtly different, as illustrated by the examples in this study. The solution NMR approach invariably results in a cluster of structures whose conformations satisfy the distance boundaries imposed by the data collected; it might be argued that this is evidence of the dynamics of proteins when in solution. In crystal structures, the proteins are often in an energy minimum state which can result in an increase in the extent of regular secondary structure present relative to the solution state depicted by NMR, because the more dynamic ends of alpha helices and beta strands can become ordered at the lower temperatures. This study examines a novel way to display the differences in conformations within an NMR ensemble and between these and a crystal structure of a protein. Circular dichroism (CD) spectroscopy can be used to characterise protein structures in solution. Using the new bioinformatics tool, PDB2CD, which generates CD spectra from atomic resolution protein structures, the differences between, and possible dynamic range of, conformations adopted by a protein can be visualised.

  13. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  14. Anomalous diffusion in neutral evolution of model proteins

    Science.gov (United States)

    Nelson, Erik D.; Grishin, Nick V.

    2015-06-01

    Protein evolution is frequently explored using minimalist polymer models; however, little attention has been given to the problem of structural drift, or diffusion. Here, we study neutral evolution of small protein motifs using an off-lattice heteropolymer model in which individual monomers interact as low-resolution amino acids. In contrast to most earlier models, both the length and folded structure of the polymers are permitted to change. To describe structural change, we compute the mean-square distance (MSD) between monomers in homologous folds separated by n neutral mutations. We find that structural change is episodic, and, averaged over lineages (for example, those extending from a single sequence), exhibits a power-law dependence on n. We show that this exponent depends on the alignment method used, and we analyze the distribution of waiting times between neutral mutations. The latter are more disperse than for models required to maintain a specific fold, but exhibit a similar power-law tail.
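
    A power-law dependence of the form MSD(n) ≈ a·n^α is commonly estimated by a straight-line fit in log-log space. The sketch below shows that generic procedure under stated assumptions: the two structures are taken to be already aligned and superimposed, and the function names and synthetic data are illustrative rather than the authors' own method.

```python
import math


def mean_square_distance(coords_a, coords_b):
    """Mean of squared distances between corresponding monomers.

    Assumes both structures are already aligned residue-by-residue and
    superimposed; each argument is a list of (x, y, z) tuples.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return total / len(coords_a)


def fit_power_law(ns, msd):
    """Least-squares fit of log(msd) = log(a) + alpha * log(n).

    Returns (a, alpha). All n and msd values must be positive.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(m) for m in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    a = math.exp(mean_y - alpha * mean_x)
    return a, alpha


if __name__ == "__main__":
    # Synthetic data generated from a known power law, for illustration only.
    ns = [1, 2, 4, 8, 16, 32]
    msd = [2.0 * n ** 0.5 for n in ns]
    print(fit_power_law(ns, msd))
```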

  15. De novo structural modeling and computational sequence analysis ...

    African Journals Online (AJOL)

    Different bioinformatics tools and machine learning techniques were used for protein structural classification. De novo protein modeling was performed by using I-TASSER server. The final model obtained was accessed by PROCHECK and DFIRE2, which confirmed that the final model is reliable. Until complete biochemical ...

  16. Crystal Structure of a Lipid G Protein-Coupled Receptor

    Energy Technology Data Exchange (ETDEWEB)

    Hanson, Michael A; Roth, Christopher B; Jo, Euijung; Griffith, Mark T; Scott, Fiona L; Reinhart, Greg; Desale, Hans; Clemons, Bryan; Cahalan, Stuart M; Schuerer, Stephan C; Sanna, M Germana; Han, Gye Won; Kuhn, Peter; Rosen, Hugh; Stevens, Raymond C [Scripps; (Receptos)

    2012-03-01

    The lyso-phospholipid sphingosine 1-phosphate modulates lymphocyte trafficking, endothelial development and integrity, heart rate, and vascular tone and maturation by activating G protein-coupled sphingosine 1-phosphate receptors. Here, we present the crystal structure of the sphingosine 1-phosphate receptor 1 fused to T4-lysozyme (S1P1-T4L) in complex with an antagonist sphingolipid mimic. Extracellular access to the binding pocket is occluded by the amino terminus and extracellular loops of the receptor. Access is gained by ligands entering laterally between helices I and VII within the transmembrane region of the receptor. This structure, along with mutagenesis, agonist structure-activity relationship data, and modeling, provides a detailed view of the molecular recognition and requirement for hydrophobic volume that activates S1P1, resulting in the modulation of immune and stromal cell responses.

  17. MolTalk--a programming library for protein structures and structure analysis.

    Science.gov (United States)

    Diemand, Alexander V; Scheib, Holger

    2004-04-19

    Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse protein structures and b) to comprehensively mine the wealth of information, which is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools which address this issue need to be highly flexible and powerful but at the same time must be freely available and easy to learn. We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk implemented in Objective-C and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of an easy to learn and programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given and with PDBChainSaw one such application is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility of PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page http://www.moltalk.org following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its co-ordinates and provides additional information from the PDB-file header section, such as scientific organism, compound name, and EC code. MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications: 1) To analyse protein structures repetitively in large-scale, i.e. to benchmark protein structure prediction methods or to evaluate structural models. The quality of the resulting 3D-models can be assessed by e.g. calculating a Ramachandran-Sasisekharan plot. 2) To quickly retrieve information for (a limited number of
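
    The idea of extracting a per-chain sequence from co-ordinates can be sketched in a few lines. This is not MolTalk or PDBChainSaw code: it is an illustrative Python stand-in that reads the fixed-column ATOM records of the legacy PDB format and keeps one residue per C-alpha atom; the function name and the handling of alternate locations are assumptions.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines):
    """Return {chain_id: sequence} built from C-alpha ATOM records.

    Uses the fixed columns of the legacy PDB format: atom name in
    columns 13-16, altLoc in 17, residue name in 18-20, chain ID in 22,
    residue number plus insertion code in 23-27.
    """
    residues = {}
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skips HETATM, header and other records
        if line[12:16].strip() != "CA":
            continue  # one C-alpha per residue is enough
        if line[16] not in (" ", "A"):
            continue  # keep only the first alternate location
        chain = line[21]
        key = (chain, line[22:27])
        if key in seen:
            continue  # e.g. repeated models in an NMR ensemble
        seen.add(key)
        code = THREE_TO_ONE.get(line[17:20], "X")  # X marks non-standard residues
        residues.setdefault(chain, []).append(code)
    return {chain: "".join(codes) for chain, codes in residues.items()}
```

    Residues that lack co-ordinates in the file are simply absent from the result, which is the practical difference between a sequence derived from co-ordinates and one read from the SEQRES header records.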

  18. MolTalk – a programming library for protein structures and structure analysis

    Science.gov (United States)

    Diemand, Alexander V; Scheib, Holger

    2004-01-01

    Background Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse protein structures and b) to comprehensively mine the wealth of information, which is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools which address this issue need to be highly flexible and powerful but at the same time must be freely available and easy to learn. Results We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk implemented in Objective-C and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of an easy to learn and programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given and with PDBChainSaw one such application is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility of PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its co-ordinates and provides additional information from the PDB-file header section, such as scientific organism, compound name, and EC code. Conclusion MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications: 1) To analyse protein structures repetitively in large-scale, i.e. to benchmark protein structure prediction methods or to evaluate structural models. The quality of the resulting 3D-models can be assessed by e.g. calculating a Ramachandran-Sasisekharan plot. 2) To quickly retrieve information for (a limited

  19. MolTalk – a programming library for protein structures and structure analysis

    Directory of Open Access Journals (Sweden)

    Diemand Alexander V

    2004-04-01

    Abstract Background Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse protein structures and b) to comprehensively mine the wealth of information, which is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools which address this issue need to be highly flexible and powerful but at the same time must be freely available and easy to learn. Results We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk implemented in Objective-C and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of an easy to learn and programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given and with PDBChainSaw one such application is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility of PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page http://www.moltalk.org following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its co-ordinates and provides additional information from the PDB-file header section, such as scientific organism, compound name, and EC code. Conclusion MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications: 1) To analyse protein structures repetitively in large-scale, i.e. to benchmark protein structure prediction methods or to evaluate structural models. The quality of the resulting 3D-models can be assessed by e.g. calculating a Ramachandran-Sasisekharan plot. 2) To

  20. From Extraction of Local Structures of Protein Energy Landscapes to Improved Decoy Selection in Template-Free Protein Structure Prediction.

    Science.gov (United States)

    Akhter, Nasrin; Shehu, Amarda

    2018-01-19

    Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.

  1. From Extraction of Local Structures of Protein Energy Landscapes to Improved Decoy Selection in Template-Free Protein Structure Prediction

    Directory of Open Access Journals (Sweden)

    Nasrin Akhter

    2018-01-01

    Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.

  2. Relation between native ensembles and experimental structures of proteins

    DEFF Research Database (Denmark)

    Best, R. B.; Lindorff-Larsen, Kresten; DePristo, M. A.

    2006-01-01

    Different experimental structures of the same protein or of proteins with high sequence similarity contain many small variations. Here we construct ensembles of "high-sequence similarity Protein Data Bank" (HSP) structures and consider the extent to which such ensembles represent the structural...... Data Bank ensembles; moreover, we show that the effects of uncertainties in structure determination are insufficient to explain the results. These results highlight the importance of accounting for native-state protein dynamics in making comparisons with ensemble-averaged experimental data and suggest...... heterogeneity of the native state in solution. We find that different NMR measurements probing structure and dynamics of given proteins in solution, including order parameters, scalar couplings, and residual dipolar couplings, are remarkably well reproduced by their respective high-sequence similarity Protein...

  3. NMR structure of the protein NP-247299.1: comparison with the crystal structure

    International Nuclear Information System (INIS)

    Jaudzems, Kristaps; Geralt, Michael; Serrano, Pedro; Mohanty, Biswaranjan; Horst, Reto; Pedrini, Bill; Elsliger, Marc-André; Wilson, Ian A.; Wüthrich, Kurt

    2010-01-01

    Comparison of the NMR and crystal structures of a protein determined using largely automated methods has enabled the interpretation of local differences in the highly similar structures. These differences are found in segments of higher B values in the crystal and correlate with dynamic processes on the NMR chemical shift timescale observed in solution. The NMR structure of the protein NP-247299.1 in solution at 313 K has been determined and is compared with the X-ray crystal structure, which was also solved in the Joint Center for Structural Genomics (JCSG) at 100 K and at 1.7 Å resolution. Both structures were obtained using the current largely automated crystallographic and solution NMR methods used by the JCSG. This paper assesses the accuracy and precision of the results from these recently established automated approaches, aiming for quantitative statements about the location of structure variations that may arise from either one of the methods used or from the different environments in solution and in the crystal. To evaluate the possible impact of the different software used for the crystallographic and the NMR structure determinations and analysis, the concept is introduced of reference structures, which are computed using the NMR software with input of upper-limit distance constraints derived from the molecular models representing the results of the two structure determinations. The use of this new approach is explored to quantify global differences that arise from the different methods of structure determination and analysis versus those that represent interesting local variations or dynamics. The near-identity of the protein core in the NMR and crystal structures thus provided a basis for the identification of complementary information from the two different methods. 
    It was thus observed that locally increased crystallographic B values correlate with dynamic structural polymorphisms in solution, including that the solution state of the protein involves

  4. Exploiting conformational ensembles in modeling protein-protein interactions on the proteome scale

    Science.gov (United States)

    Kuzu, Guray; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2013-01-01

    Cellular functions are performed through protein-protein interactions; therefore, identification of these interactions is crucial for understanding biological processes. Recent studies suggest that knowledge-based approaches are more useful than ‘blind’ docking for modeling at large scales. However, a caveat of knowledge-based approaches is that they treat molecules as rigid structures. The Protein Data Bank (PDB) offers a wealth of conformations. Here, we exploited ensemble of the conformations in predictions by a knowledge-based method, PRISM. We tested ‘difficult’ cases in a docking-benchmark dataset, where the unbound and bound protein forms are structurally different. Considering alternative conformations for each protein, the percentage of successfully predicted interactions increased from ~26% to 66%, and 57% of the interactions were successfully predicted in an ‘unbiased’ scenario, in which data related to the bound forms were not utilized. If the appropriate conformation, or relevant template interface, is unavailable in the PDB, PRISM could not predict the interaction successfully. The pace of the growth of the PDB promises a rapid increase of ensemble conformations emphasizing the merit of such knowledge-based ensemble strategies for higher success rates in protein-protein interaction predictions on an interactome-scale. We constructed the structural network of ERK interacting proteins as a case study. PMID:23590674

  5. Integrated materials–structural models

    DEFF Research Database (Denmark)

    Stang, Henrik; Geiker, Mette Rica

    2008-01-01

    , repair works and strengthening methods for structures. A very significant part of the infrastructure consists of reinforced concrete structures. Even though reinforced concrete structures typically are very competitive, certain concrete structures suffer from various types of degradation. A framework...... should define a framework in which materials research results eventually should fit in and on the other side the materials research should define needs and capabilities in structural modelling. Integrated materials-structural models of a general nature are almost non-existent in the field of cement based...

  6. Crystal structure of the Japanese encephalitis virus envelope protein.

    Science.gov (United States)

    Luca, Vincent C; AbiMansour, Jad; Nelson, Christopher A; Fremont, Daved H

    2012-02-01

    Japanese encephalitis virus (JEV) is the leading global cause of viral encephalitis. The JEV envelope protein (E) facilitates cellular attachment and membrane fusion and is the primary target of neutralizing antibodies. We have determined the 2.1-Å resolution crystal structure of the JEV E ectodomain refolded from bacterial inclusion bodies. The E protein possesses the three domains characteristic of flavivirus envelopes and epitope mapping of neutralizing antibodies onto the structure reveals determinants that correspond to the domain I lateral ridge, fusion loop, domain III lateral ridge, and domain I-II hinge. While monomeric in solution, JEV E assembles as an antiparallel dimer in the crystal lattice organized in a highly similar fashion as seen in cryo-electron microscopy models of mature flavivirus virions. The dimer interface, however, is remarkably small and lacks many of the domain II contacts observed in other flavivirus E homodimers. In addition, uniquely conserved histidines within the JEV serocomplex suggest that pH-mediated structural transitions may be aided by lateral interactions outside the dimer interface in the icosahedral virion. Our results suggest that variation in dimer structure and stability may significantly influence the assembly, receptor interaction, and uncoating of virions.

  7. SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

    Science.gov (United States)

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-07-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
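
    Once a 3D conformation has been encoded as a 1D string of structural-alphabet letters, exact-word search reduces to ordinary string indexing. The abstract states that SA-Search uses a suffix tree; the sketch below deliberately swaps in a simpler fixed-length word index to show the same idea, and the letter strings, protein names and function names are hypothetical.

```python
from collections import defaultdict


def build_word_index(sa_strings, k):
    """Index every length-k word of each structural-alphabet string.

    sa_strings maps a structure name to its 1D encoding. The result maps
    each word to a list of (name, start_position) hits.
    """
    index = defaultdict(list)
    for name, encoded in sa_strings.items():
        for start in range(len(encoded) - k + 1):
            index[encoded[start:start + k]].append((name, start))
    return index


def find_exact_word(index, word):
    """Return all (name, start_position) occurrences of a length-k word."""
    return list(index.get(word, []))


if __name__ == "__main__":
    # Hypothetical encodings; real structural-alphabet strings come from
    # fitting the hidden Markov model to backbone geometry.
    encodings = {"protA": "AABCDDEF", "protB": "XBCDDYY"}
    index = build_word_index(encodings, k=4)
    print(find_exact_word(index, "BCDD"))
```

    A suffix tree answers the same query for words of any length without fixing k in advance, which is the design advantage of the approach named in the abstract.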

  8. Current strategies for protein production and purification enabling membrane protein structural biology.

    Science.gov (United States)

    Pandey, Aditya; Shin, Kyungsoo; Patterson, Robin E; Liu, Xiang-Qin; Rainey, Jan K

    2016-12-01

    Membrane proteins are still heavily under-represented in the protein data bank (PDB), owing to multiple bottlenecks. The typical low abundance of membrane proteins in their natural hosts makes it necessary to overexpress these proteins either in heterologous systems or through in vitro translation/cell-free expression. Heterologous expression of proteins, in turn, leads to multiple obstacles, owing to the unpredictability of compatibility of the target protein for expression in a given host. The highly hydrophobic and (or) amphipathic nature of membrane proteins also leads to challenges in producing a homogeneous, stable, and pure sample for structural studies. Circumventing these hurdles has become possible through the introduction of novel protein production protocols; efficient protein isolation and sample preparation methods; and, improvement in hardware and software for structural characterization. Combined, these advances have made the past 10-15 years very exciting and eventful for the field of membrane protein structural biology, with an exponential growth in the number of solved membrane protein structures. In this review, we focus on both the advances and diversity of protein production and purification methods that have allowed this growth in structural knowledge of membrane proteins through X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM).

  9. Modeling Structural Brain Connectivity

    DEFF Research Database (Denmark)

    Ambrosen, Karen Marie Sandø

    The human brain consists of a gigantic complex network of interconnected neurons. Together all these connections determine who we are, how we react and how we interpret the world. Knowledge about how the brain is connected can further our understanding of the brain’s structural organization, help...... improve diagnosis, and potentially allow better treatment of a wide range of neurological disorders. Tractography based on diffusion magnetic resonance imaging is a unique tool to estimate this “structural connectivity” of the brain non-invasively and in vivo. During the last decade, brain connectivity...... has increasingly been analyzed using graph theoretic measures adopted from network science and this characterization of the brain’s structural connectivity has been shown to be useful for the classification of populations, such as healthy and diseased subjects. The structural connectivity of the brain...

  10. Ice cream structure modification by ice-binding proteins.

    Science.gov (United States)

    Kaleda, Aleksei; Tsanev, Robert; Klesment, Tiina; Vilu, Raivo; Laos, Katrin

    2018-04-25

    Ice-binding proteins (IBPs), also known as antifreeze proteins, were added to ice cream to investigate their effect on structure and texture. Ice recrystallization inhibition was assessed in the ice cream mixes using a novel accelerated microscope assay and the ice cream microstructure was studied using an ice crystal dispersion method. It was found that adding recombinantly produced fish type III IBPs at a concentration of 3 mg·L⁻¹ made ice cream hard and crystalline with improved shape preservation during melting. Ice creams made with IBPs (both from winter rye, and type III IBP) had aggregates of ice crystals that entrapped pockets of the ice cream mixture in a rigid network. Larger individual ice crystals and no entrapment were observed in control ice creams. Based on these results, a model of ice crystal aggregate formation in the presence of IBPs was proposed.

  11. WildSpan: mining structured motifs from protein sequences

    Directory of Open Access Journals (Sweden)

    Chen Chien-Yu

    2011-03-01

    Full Text Available Abstract Background Automatic extraction of motifs from biological sequences is an important research problem in study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions that incorporates several pruning strategies to largely reduce the mining cost. Results WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. 
The protein-based mining mode
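
    The idea of a structured motif made of short pattern blocks separated by large, irregular gaps can be illustrated with a small matcher. This is only a sketch under assumptions and is not the WildSpan mining algorithm: the block notation (`x` for a single wildcard), the per-gap `(min, max)` bounds and the function names are all hypothetical choices made for the example.

```python
import re

def w_pattern_to_regex(blocks, gaps):
    """Compile a block/gap motif into a regular expression.

    blocks: list of strings; 'x' stands for one wildcard residue (assumed notation).
    gaps:   list of (min_len, max_len) tuples, one between each pair of blocks.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between consecutive blocks")
    parts = []
    for i, block in enumerate(blocks):
        # Inside a block, 'x' matches any residue; other symbols match literally.
        parts.append("".join("." if ch == "x" else re.escape(ch) for ch in block))
        if i < len(gaps):
            lo, hi = gaps[i]
            # A gap is a run of wildcards with bounded length.
            parts.append(".{%d,%d}" % (lo, hi))
    return re.compile("".join(parts))

def find_w_pattern(sequence, blocks, gaps):
    """Return the start index of the first match, or None if there is none."""
    match = w_pattern_to_regex(blocks, gaps).search(sequence)
    return match.start() if match else None
```

    For example, `find_w_pattern("MMHADQQQQGKTT", ["HxD", "GK"], [(3, 6)])` looks for the block `HxD`, then 3 to 6 arbitrary residues, then the block `GK`. Matching a known pattern is the easy direction; discovering such patterns from many sequences is the costly problem the abstract addresses.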

  12. Ion pairs in non-redundant protein structures

    Indian Academy of Sciences (India)

    Ion pairs contribute to several functions including the activity of catalytic triads, fusion of viral membranes, stability in thermophilic proteins and solvent–protein interactions. Furthermore, they have the ability to affect the stability of protein structures and are also a part of the forces that act to hold monomers together.

  13. The structure and function of endophilin proteins

    DEFF Research Database (Denmark)

    Kjaerulff, Ole; Brodin, Lennart; Jung, Anita

    2011-01-01

    Members of the BAR domain protein superfamily are essential elements of cellular traffic. Endophilins are among the best studied BAR domain proteins. They have a prominent function in synaptic vesicle endocytosis (SVE), receptor trafficking and apoptosis, and in other processes that require...

  14. Structure refinement of flexible proteins using dipolar couplings: Application to the protein p8MTCP1

    International Nuclear Information System (INIS)

    Demene, Helene; Ducat, Thierry; Barthe, Philippe; Delsuc, Marc-Andre; Roumestand, Christian

    2002-01-01

    The present study deals with the relevance of using mobility-averaged dipolar couplings for the structure refinement of flexible proteins. The 68-residue protein p8MTCP1 has been chosen as model for this study. Its solution state consists mainly of three α-helices. The two N-terminal helices are strapped in a well-determined α-hairpin, whereas, due to an intrinsic mobility, the position of the third helix is less well defined in the NMR structure. To further characterize the degrees of freedom of this helix, we have measured the dipolar coupling constants in the backbone of p8MTCP1 in a bicellar medium. We show here that including HN dipolar couplings (D_HN^dip) in the structure calculation protocol improves the structure of the α-hairpin but not the positioning of the third helix. This is due to the motional averaging of the dipolar couplings measured in the last helix. Performing two calculations with different force constants for the dipolar restraints highlights the inconstancy of these mobility-averaged dipolar couplings. Alternatively, prior to any structure calculations, comparing the values of the dipolar couplings measured in helix III to values back-calculated from an ideal helix demonstrates that they are atypical for a helix. This can be partly attributed to mobility effects since the inclusion of the 15N relaxation derived order parameter allows for a better fit

  15. Oscillating water column structural model

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, Guild [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bull, Diana L [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Jepsen, Richard Alan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Gordon, Margaret Ellen [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2014-09-01

    An oscillating water column (OWC) wave energy converter is a structure with an opening to the ocean below the free surface, i.e. a structure with a moonpool. Two structural models for a non-axisymmetric terminator design OWC, the Backward Bent Duct Buoy (BBDB) are discussed in this report. The results of this structural model design study are intended to inform experiments and modeling underway in support of the U.S. Department of Energy (DOE) initiated Reference Model Project (RMP). A detailed design developed by Re Vision Consulting used stiffeners and girders to stabilize the structure against the hydrostatic loads experienced by a BBDB device. Additional support plates were added to this structure to account for loads arising from the mooring line attachment points. A simplified structure was designed in a modular fashion. This simplified design allows easy alterations to the buoyancy chambers and uncomplicated analysis of resulting changes in buoyancy.

  16. Decreased C-reactive protein induces abnormal vascular structure in a rat model of liver dysfunction induced by bile duct ligation

    Directory of Open Access Journals (Sweden)

    Ji Hye Jun

    2016-09-01

    Full Text Available Background/Aims Chronic liver disease leads to liver fibrosis, and although the liver does have a certain regenerative capacity, this disease is associated with dysfunction of the liver vessels. C-reactive protein (CRP) is produced in the liver and circulated from there for metabolism. CRP was recently shown to inhibit angiogenesis by inducing endothelial cell dysfunction. The objective of this study was to determine the effect of CRP levels on angiogenesis in a rat model of liver dysfunction induced by bile duct ligation (BDL). Methods The diameter of the hepatic vein was analyzed in rat liver tissues using hematoxylin and eosin (H&E) staining. The expression levels of angiogenic factors, albumin, and CRP were analyzed by real-time PCR and Western blotting. A tube formation assay was performed to confirm the effect of CRP on angiogenesis in human umbilical vein endothelial cells (HUVECs) treated with lithocholic acid (LCA) and siRNA-CRP. Results The diameter of the hepatic portal vein increased significantly with the progression of cirrhosis. The expression levels of angiogenic factors were increased in the cirrhotic liver. In contrast, the expression levels of albumin and CRP were significantly lower in the liver tissue obtained from the BDL rat model than in the normal liver. The CRP level was correlated with the expression of albumin in hepatocytes treated with LCA and siRNA-CRP. Tube formation was significantly decreased in HUVECs when they were treated with LCA or a combination of LCA and siRNA-CRP. Conclusion CRP seems to be involved in the abnormal formation of vessels in hepatic disease, and so it could be a useful diagnostic marker for hepatic disease.

  17. Decreased C-reactive protein induces abnormal vascular structure in a rat model of liver dysfunction induced by bile duct ligation.

    Science.gov (United States)

    Jun, Ji Hye; Choi, Jong Ho; Bae, Si Hyun; Oh, Seh Hoon; Kim, Gi Jin

    2016-09-01

    Chronic liver disease leads to liver fibrosis, and although the liver does have a certain regenerative capacity, this disease is associated with dysfunction of the liver vessels. C-reactive protein (CRP) is produced in the liver and circulated from there for metabolism. CRP was recently shown to inhibit angiogenesis by inducing endothelial cell dysfunction. The objective of this study was to determine the effect of CRP levels on angiogenesis in a rat model of liver dysfunction induced by bile duct ligation (BDL). The diameter of the hepatic vein was analyzed in rat liver tissues using hematoxylin and eosin (H&E) staining. The expression levels of angiogenic factors, albumin, and CRP were analyzed by real-time PCR and Western blotting. A tube formation assay was performed to confirm the effect of CRP on angiogenesis in human umbilical vein endothelial cells (HUVECs) treated with lithocholic acid (LCA) and siRNA-CRP. The diameter of the hepatic portal vein increased significantly with the progression of cirrhosis. The expression levels of angiogenic factors were increased in the cirrhotic liver. In contrast, the expression levels of albumin and CRP were significantly lower in the liver tissue obtained from the BDL rat model than in the normal liver. The CRP level was correlated with the expression of albumin in hepatocytes treated with LCA and siRNA-CRP. Tube formation was significantly decreased in HUVECs when they were treated with LCA or a combination of LCA and siRNA-CRP. CRP seems to be involved in the abnormal formation of vessels in hepatic disease, and so it could be a useful diagnostic marker for hepatic disease.

  18. BLAST-based structural annotation of protein residues using Protein Data Bank.

    Science.gov (United States)

    Singh, Harinder; Raghava, Gajendra P S

    2016-01-25

    In the era of next-generation sequencing, where thousands of genomes have already been sequenced, the size of protein databases is growing at an exponential rate. Structural annotation of these proteins is one of the biggest challenges for the computational biologist. Although it is easy to perform a BLAST search against the Protein Data Bank (PDB), it is difficult for a biologist to annotate protein residues from a BLAST search. A web server, StarPDB, has been developed for structural annotation of a protein based on its similarity with known protein structures. It uses standard BLAST software to perform a similarity search of a query protein against protein structures in the PDB. This server integrates a wide range of modules for assigning different types of annotation, including Secondary-structure, Accessible surface area, Tight-turns, DNA-RNA and Ligand modules. The Secondary-structure module allows users to predict regular secondary structure states for each residue in a protein. The Accessible surface area module predicts the exposed or buried residues in a protein. The Tight-turns module is designed to predict tight turns, such as beta-turns, in a protein. The DNA-RNA module was developed for predicting DNA- and RNA-interacting residues in a protein. Similarly, the Ligand module allows one to predict ligand-, metal- and nucleotide-interacting residues in a protein. In summary, this manuscript presents a web server for comprehensive annotation of a protein based on similarity search. It integrates a number of visualization tools that help users understand the structure and function of protein residues. This web server is freely available to the scientific community at http://crdd.osdd.net/raghava/starpdb .
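
    Similarity-based residue annotation of this kind rests on copying per-residue labels from an aligned structure onto the query. The following is a minimal sketch of that transfer step only, assuming a pairwise alignment is already available (for instance, taken from a BLAST hit); the function name, the `-` gap character and the one-letter label string are hypothetical, and this is not StarPDB's actual code.

```python
def transfer_annotation(aligned_query, aligned_template, template_labels, unknown="-"):
    """Copy per-residue labels from a template onto a query along an alignment.

    aligned_query, aligned_template: equal-length strings with '-' for gaps.
    template_labels: one label character per ungapped template residue
                     (e.g. secondary-structure states; assumed encoding).
    Returns one label per ungapped query residue; positions aligned to a
    template gap receive `unknown`.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    if len(template_labels) != sum(1 for c in aligned_template if c != "-"):
        raise ValueError("need one label per template residue")
    labels = []
    t = 0  # index into template_labels
    for q, s in zip(aligned_query, aligned_template):
        label = None
        if s != "-":
            label = template_labels[t]
            t += 1
        if q != "-":
            labels.append(label if label is not None else unknown)
    return "".join(labels)
```

    With `aligned_query="AC-DE"`, `aligned_template="ACG-E"` and `template_labels="HHEC"`, the query residue `D` has no aligned template residue and therefore stays unannotated.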

  19. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotations methods. Detailed data analysis shows that the major advantage of the MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
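
    Two quantities in this abstract lend themselves to a short illustration: the F-measure used to score predicted Gene Ontology terms, and the logistic combination of several confidence scores. The sketch below uses the standard definitions (harmonic mean of precision and recall; a sigmoid of a weighted sum). The function names, weights and bias are hypothetical, and the benchmark's exact averaging protocol may differ from this per-protein calculation.

```python
import math

def f_measure(predicted_terms, true_terms):
    """Harmonic mean of precision and recall over two sets of GO terms."""
    predicted, actual = set(predicted_terms), set(true_terms)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)

def combine_scores(scores, weights, bias):
    """Logistic combination of component scores into a single confidence.

    The weights and bias would normally be fitted on training data;
    any values passed here are placeholders.
    """
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))
```

    For a prediction of three terms of which two are correct, against four true terms, precision is 2/3 and recall is 1/2, giving an F-measure of 4/7.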

  20. Some Recent Developments in Structure and Glassy Behavior of Proteins

    Science.gov (United States)

    Hu, Chin-Kun

    2012-02-01

    We have used ARVO, developed by us, to find that the ratio of volume to surface area of proteins in the Protein Data Bank is distributed in a very narrow region [1]. Such a result is useful for the determination of protein 3D structures. It has been widely known that a spin glass model can be used to understand the slow relaxation behavior of a glass at low temperatures [2]. We have used molecular dynamics and simple models of polymer chains to study relaxation and aggregation of proteins under various conditions and found that polymer chains with neighboring monomers connected by rigid bonds can relax very slowly and show glassy behavior [3]. We have also found that native collagen fibrils show glassy behavior at room temperature [4]. The results of [3] and [4] about the glassy behavior of polymers or proteins are useful for understanding the mechanism by which a biological system remains in a non-equilibrium state, including the ancient seed [5], which can remain in a non-equilibrium state for a very long time. [1] M.-C. Wu, M. S. Li, W.-J. Ma, M. Kouza, and C.-K. Hu, EPL, in press (2011); [2] C. Dasgupta, S.-K. Ma, and C.-K. Hu, Phys. Rev. B 20, 3837-3849 (1979); [3] W.-J. Ma and C.-K. Hu, J. Phys. Soc. Japan 79, 024005, 024006, 054001, and 104002 (2010), C.-K. Hu and W.-J. Ma, Prog. Theor. Phys. Supp. 184, 369 (2010); [4] S. G. Gevorkian, A. E. Allahverdyan, D. S. Gevorgyan and C.-K. Hu, EPL 95, 23001 (2011); [5] S. Sallon et al., Science 320, 1464 (2008).

  1. The contact activation proteins: a structure/function overview

    NARCIS (Netherlands)

    Meijers, J. C.; McMullen, B. A.; Bouma, B. N.

    1992-01-01

    In recent years, extensive knowledge has been obtained on the structure/function relationships of blood coagulation proteins. In this overview, we present recent developments on the structure/function relationships of the contact activation proteins: factor XII, high molecular weight kininogen,

  2. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  3. Sampling Realistic Protein Conformations Using Local Structural Bias

    DEFF Research Database (Denmark)

    Hamelryck, Thomas Wim; Kent, John T.; Krogh, A.

    2006-01-01

    The prediction of protein structure from sequence remains a major unsolved problem in biology. The most successful protein structure prediction methods make use of a divide-and-conquer strategy to attack the problem: a conformational sampling method generates plausible candidate structures, which...... are subsequently accepted or rejected using an energy function. Conceptually, this often corresponds to separating local structural bias from the long-range interactions that stabilize the compact, native state. However, sampling protein conformations that are compatible with the local structural bias encoded...... in a given protein sequence is a long-standing open problem, especially in continuous space. We describe an elegant and mathematically rigorous method to do this, and show that it readily generates native-like protein conformations simply by enforcing compactness. Our results have far-reaching implications...

  4. Rapid and reliable protein structure determination via chemical shift threading.

    Science.gov (United States)

    Hafsa, Noor E; Berjanskii, Mark V; Arndt, David; Wishart, David S

    2018-01-01

    Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca .

  5. Is protein structure prediction still an enigma?

    African Journals Online (AJOL)

    2008-12-29

    Dec 29, 2008 ... Computer methods for protein analysis address this problem since they study the ... neighbor methods, molecular dynamic simulation, and approaches ... fuzzy clustering, neural networks, logistic regression, decision tree ...

  6. Structure-function correlations of pulmonary surfactant protein SP-B and the saposin-like family of proteins.

    Science.gov (United States)

    Olmeda, Bárbara; García-Álvarez, Begoña; Pérez-Gil, Jesús

    2013-03-01

    Pulmonary surfactant is a lipid-protein complex secreted by the respiratory epithelium of mammalian lungs, which plays an essential role in stabilising the alveolar surface and so reducing the work of breathing. The surfactant protein SP-B is part of this complex, and is strictly required for the assembly of pulmonary surfactant and its extracellular development to form stable surface-active films at the air-liquid alveolar interface, making the lack of SP-B incompatible with life. In spite of its physiological importance, a model for the structure and the mechanism of action of SP-B is still needed. The sequence of SP-B is homologous to that of the saposin-like family of proteins, which are membrane-interacting polypeptides with apparently diverging activities, from the co-lipase action of saposins to facilitate the degradation of sphingolipids in the lysosomes to the cytolytic actions of some antibiotic proteins, such as NK-lysin and granulysin or the amoebapore of Entamoeba histolytica. Numerous studies on the interactions of these proteins with membranes have still not explained how a similar sequence and a potentially related fold can sustain such apparently different activities. In the present review, we have summarised the most relevant features of the structure, lipid-protein and protein-protein interactions of SP-B and the saposin-like family of proteins, as a basis to propose an integrated model and a common mechanistic framework of the apparent functional versatility of the saposin fold.

  7. Critical assessment of methods of protein structure prediction (CASP)-round IX

    KAUST Repository

    Moult, John; Fidelis, Krzysztof; Kryshtafovych, Andriy; Tramontano, Anna

    2011-01-01

    This article is an introduction to the special issue of the journal PROTEINS, dedicated to the ninth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. Methods for modeling protein structure continue to advance, although at a more modest pace than in the early CASP experiments. CASP developments of note are indications of improvement in model accuracy for some classes of target, an improved ability to choose the most accurate of a set of generated models, and evidence of improvement in accuracy for short "new fold" models. In addition, a new analysis of regions of models not derivable from the most obvious template structure has revealed better performance than expected.

  8. Bluetongue virus non-structural protein 1 is a positive regulator of viral protein synthesis

    Directory of Open Access Journals (Sweden)

    Boyce Mark

    2012-08-01

    Full Text Available Abstract Background Bluetongue virus (BTV) is a double-stranded RNA (dsRNA) virus of the Reoviridae family, which encodes its genes in ten linear dsRNA segments. BTV mRNAs are synthesised by the viral RNA-dependent RNA polymerase (RdRp) as exact plus sense copies of the genome segments. Infection of mammalian cells with BTV rapidly replaces cellular protein synthesis with viral protein synthesis, but the regulation of viral gene expression in the Orbivirus genus has not been investigated. Results Using an mRNA reporter system based on genome segment 10 of BTV fused with GFP we identify the protein characteristic of this genus, non-structural protein 1 (NS1) as sufficient to upregulate translation. The wider applicability of this phenomenon among the viral genes is demonstrated using the untranslated regions (UTRs) of BTV genome segments flanking the quantifiable Renilla luciferase ORF in chimeric mRNAs. The UTRs of viral mRNAs are shown to be determinants of the amount of protein synthesised, with the pre-expression of NS1 increasing the quantity in each case. The increased expression induced by pre-expression of NS1 is confirmed in virus infected cells by generating a replicating virus which expresses the reporter fused with genome segment 10, using reverse genetics. Moreover, NS1-mediated upregulation of expression is restricted to mRNAs which lack the cellular 3′ poly(A) sequence identifying the 3′ end as a necessary determinant in specifically increasing the translation of viral mRNA in the presence of cellular mRNA. Conclusions NS1 is identified as a positive regulator of viral protein synthesis. We propose a model of translational regulation where NS1 upregulates the synthesis of viral proteins, including itself, and creates a positive feedback loop of NS1 expression, which rapidly increases the expression of all the viral proteins. The efficient translation of viral reporter mRNAs among cellular mRNAs can account for the observed

  9. Bluetongue virus non-structural protein 1 is a positive regulator of viral protein synthesis.

    Science.gov (United States)

    Boyce, Mark; Celma, Cristina C P; Roy, Polly

    2012-08-29

    Bluetongue virus (BTV) is a double-stranded RNA (dsRNA) virus of the Reoviridae family, which encodes its genes in ten linear dsRNA segments. BTV mRNAs are synthesised by the viral RNA-dependent RNA polymerase (RdRp) as exact plus sense copies of the genome segments. Infection of mammalian cells with BTV rapidly replaces cellular protein synthesis with viral protein synthesis, but the regulation of viral gene expression in the Orbivirus genus has not been investigated. Using an mRNA reporter system based on genome segment 10 of BTV fused with GFP we identify the protein characteristic of this genus, non-structural protein 1 (NS1) as sufficient to upregulate translation. The wider applicability of this phenomenon among the viral genes is demonstrated using the untranslated regions (UTRs) of BTV genome segments flanking the quantifiable Renilla luciferase ORF in chimeric mRNAs. The UTRs of viral mRNAs are shown to be determinants of the amount of protein synthesised, with the pre-expression of NS1 increasing the quantity in each case. The increased expression induced by pre-expression of NS1 is confirmed in virus infected cells by generating a replicating virus which expresses the reporter fused with genome segment 10, using reverse genetics. Moreover, NS1-mediated upregulation of expression is restricted to mRNAs which lack the cellular 3' poly(A) sequence identifying the 3' end as a necessary determinant in specifically increasing the translation of viral mRNA in the presence of cellular mRNA. NS1 is identified as a positive regulator of viral protein synthesis. We propose a model of translational regulation where NS1 upregulates the synthesis of viral proteins, including itself, and creates a positive feedback loop of NS1 expression, which rapidly increases the expression of all the viral proteins. The efficient translation of viral reporter mRNAs among cellular mRNAs can account for the observed replacement of cellular protein synthesis with viral protein

  10. Solution structure and dynamics of melanoma inhibitory activity protein

    International Nuclear Information System (INIS)

    Lougheed, Julie C.; Domaille, Peter J.; Handel, Tracy M.

    2002-01-01

    Melanoma inhibitory activity (MIA) is a small secreted protein that is implicated in cartilage cell maintenance and melanoma metastasis. It is representative of a recently discovered family of proteins that contain a Src homology 3 (SH3) subdomain. While SH3 domains are normally found in intracellular proteins and mediate protein-protein interactions via recognition of polyproline helices, MIA is a single-domain extracellular protein, and it probably binds to a different class of ligands. Here we report the assignments, solution structure, and dynamics of human MIA determined by heteronuclear NMR methods. The structures were calculated in a semi-automated manner without manual assignment of NOE crosspeaks, and have a backbone rmsd of 0.38 Å over the ordered regions of the protein. The structure consists of an SH3-like subdomain with N- and C-terminal extensions of approximately 20 amino acids each that together form a novel fold. The rmsd between the solution structure and our recently reported crystal structure is 0.86 Å over the ordered regions of the backbone, and the main differences are localized to the most dynamic regions of the protein. The similarity between the NMR and crystal structures supports the use of automated NOE assignments and ambiguous restraints to accelerate the calculation of NMR structures.
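
    The backbone rmsd values quoted in this record are root-mean-square deviations between matched atom positions. A minimal sketch of that formula follows, assuming the two coordinate sets have already been optimally superposed and list the same atoms in the same order; real comparisons first perform a least-squares fit (for example with the Kabsch algorithm), which is omitted here. The function name and data layout are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

    If every atom is displaced by exactly 1 Å, the function returns 1.0; restricting the input lists to the ordered regions gives the kind of regional value reported in the abstract.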

  11. Global optimization of proteins using a dynamical lattice model: Ground states and energy landscapes

    OpenAIRE

    Dressel, F.; Kobe, S.

    2004-01-01

    A simple approach is proposed to investigate the protein structure. Using a low complexity model, a simple pairwise interaction and the concept of global optimization, we are able to calculate ground states of proteins, which are in agreement with experimental data. All possible model structures of small proteins are available below a certain energy threshold. The exact low-energy landscape for the trp cage protein (1L2Y) is presented, showing the connectivity of all states and energy barriers.
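
    To make the notion of a ground state on a lattice concrete, the sketch below exhaustively enumerates self-avoiding walks on a 2D square lattice and scores them with the textbook HP model (each non-bonded H-H contact contributes -1). This is a deliberately simpler stand-in and not the authors' dynamical lattice model or interaction; the function name and the two-letter H/P alphabet are assumptions for the example, and the enumeration is only feasible for very short chains.

```python
def ground_state(sequence):
    """Return (lowest_energy, one_best_path) for an HP sequence on a 2D lattice."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best = [0, None]  # [energy, path]

    def energy(path):
        index_at = {pos: i for i, pos in enumerate(path)}
        total = 0
        for i, (x, y) in enumerate(path):
            if sequence[i] != "H":
                continue
            for dx, dy in moves:
                j = index_at.get((x + dx, y + dy))
                # j > i + 1 counts each non-bonded H-H contact exactly once.
                if j is not None and j > i + 1 and sequence[j] == "H":
                    total -= 1
        return total

    def extend(path, occupied):
        if len(path) == len(sequence):
            e = energy(path)
            if best[1] is None or e < best[0]:
                best[0], best[1] = e, list(path)
            return
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:  # keep the walk self-avoiding
                path.append(nxt)
                occupied.add(nxt)
                extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return best[0], best[1]
```

    For the four-residue chain `"HPPH"`, the best conformation folds into a square so that the two H residues touch, giving an energy of -1. Keeping every conformation below an energy threshold, rather than only the minimum, would give the raw material for the kind of low-energy landscape the abstract describes.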

  12. Building a Better Fragment Library for De Novo Protein Structure Prediction

    Science.gov (United States)

    de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, the exclusion of homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. “Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources”. PMID:25901595
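
    The abstract does not define precision and coverage. A common reading, assumed here, is that precision is the fraction of library fragments that are "good" (close to the native structure) and coverage is the fraction of target positions spanned by at least one good fragment. The sketch below follows that reading; the 1.5 Å cutoff, the tuple layout and the function name are hypothetical choices for the example.

```python
def library_precision_coverage(fragments, target_length, rmsd_cutoff=1.5):
    """Score a fragment library against a target of known structure.

    fragments: list of (start_position, fragment_length, rmsd_to_native).
    A fragment counts as good when its rmsd is below `rmsd_cutoff` (assumed).
    Returns (precision, coverage).
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    good = [(start, length) for start, length, r in fragments if r < rmsd_cutoff]
    precision = len(good) / len(fragments) if fragments else 0.0
    covered = set()
    for start, length in good:
        # Mark every target position spanned by a good fragment.
        covered.update(range(start, min(start + length, target_length)))
    coverage = len(covered) / target_length
    return precision, coverage
```

    With four fragments of which three are below the cutoff, and good fragments spanning positions 0 to 2 and 5 to 7 of a 10-residue target, the function returns a precision of 0.75 and a coverage of 0.6.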

  13. Building a better fragment library for de novo protein structure prediction.

    Directory of Open Access Journals (Sweden)

    Saulo H P de Oliveira

    Full Text Available Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, the exclusion of homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. "Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources".

  14. Structural and Biochemical Studies of LysM Proteins

    DEFF Research Database (Denmark)

    Wong, Mei Mei Jaslyn Elizabeth

    2017-01-01

    . Most of the signalling components in the Nod factor signalling pathway have been identified through genetic approaches. The current symbiosis signalling model, however, lacks components that could link Nod factor perception at the plasma membrane to downstream responses, such as calcium influx and perinuclear calcium...... involved in peptidoglycan hydrolysis; the Cell Wall Lytic enzyme associated with cell Separation (CwlS) from Bacillus subtilis, and P60_Tth from Thermus thermophilus. Biochemical studies conducted on purified CwlS showed that multiple LysM modules function cooperatively to bind N-acetylglucosamine (NAG......-induced intermolecular dimerization was observed in the co-crystal structure of P60_2LysM and NAG6. To date, this is the only structural evidence illustrating intermolecular dimerization of LysM proteins. Intermolecular dimerization of plant LysM receptor kinases (RK) has been proposed as a mechanism...

  15. A computational model of the LGI1 protein suggests a common binding site for ADAM proteins.

    Directory of Open Access Journals (Sweden)

    Emanuela Leonardi

    Full Text Available Mutations of the human leucine-rich glioma inactivated (LGI1) gene encoding the epitempin protein cause autosomal dominant temporal lateral epilepsy (ADTLE), a rare familial partial epileptic syndrome. The LGI1 gene seems to have a role in the transmission of neuronal messages but the exact molecular mechanism remains unclear. In contrast to other genes involved in epileptic disorders, epitempin shows no homology with known ion channel genes but contains two domains, composed of repeated structural units, known to mediate protein-protein interactions. A three-dimensional in silico model of the two epitempin domains was built to predict the structure-function relationship and propose a functional model integrating previous experimental findings. Conserved and electrostatically charged regions of the model surface suggest a possible arrangement between the two domains and identify a possible ADAM protein binding site in the β-propeller domain and another protein binding site in the leucine-rich repeat domain. The functional model indicates that epitempin could mediate the interaction between proteins localized to different synaptic sides in a static way, by forming a dimer, or in a dynamic way, by binding proteins at different times. The model was also used to predict effects of known disease-causing missense mutations. Most of the variants are predicted to alter protein folding while several others map to functional surface regions. In agreement with experimental evidence, this suggests that non-secreted LGI1 mutants could be retained within the cell by quality control mechanisms or by altering interactions required for the secretion process.

  16. Function and structure of GFP-like proteins in the protein data bank.

    Science.gov (United States)

    Ong, Wayne J-H; Alvarez, Samuel; Leroux, Ivan E; Shahid, Ramza S; Samma, Alex A; Peshkepija, Paola; Morgan, Alicia L; Mulcahy, Shawn; Zimmer, Marc

    2011-04-01

    The RCSB protein databank contains 266 crystal structures of green fluorescent proteins (GFP) and GFP-like proteins. This is the first systematic analysis of all the GFP-like structures in the pdb. We have used the pdb to examine the function of fluorescent proteins (FP) in nature, aspects of excited state proton transfer (ESPT) in FPs, deformation from planarity of the chromophore and chromophore maturation. The conclusions reached in this review are that (1) The lid residues are highly conserved, particularly those on the "top" of the β-barrel. They are important to the function of GFP-like proteins, perhaps in protecting the chromophore or in β-barrel formation. (2) The primary/ancestral function of GFP-like proteins may well be to aid in light induced electron transfer. (3) The structural prerequisites for light activated proton pumps exist in many structures and it's possible that like bioluminescence, proton pumps are secondary functions of GFP-like proteins. (4) In most GFP-like proteins the protein matrix exerts a significant strain on planar chromophores forcing most GFP-like proteins to adopt non-planar chromophores. These chromophoric deviations from planarity play an important role in determining the fluorescence quantum yield. (5) The chemospatial characteristics of the chromophore cavity determine the isomerization state of the chromophore. The cavities of highlighter proteins that can undergo cis/trans isomerization have chemospatial properties that are common to both cis and trans GFP-like proteins.

  17. VoroMQA: Assessment of protein structure quality using interatomic contact areas.

    Science.gov (United States)

    Olechnovič, Kliment; Venclovas, Česlovas

    2017-06-01

    In the absence of experimentally determined protein structure many biological questions can be addressed using computational structural models. However, the utility of protein structural models depends on their quality. Therefore, the estimation of the quality of predicted structures is an important problem. One of the approaches to this problem is the use of knowledge-based statistical potentials. Such methods typically rely on the statistics of distances and angles of residue-residue or atom-atom interactions collected from experimentally determined structures. Here, we present VoroMQA (Voronoi tessellation-based Model Quality Assessment), a new method for the estimation of protein structure quality. Our method combines the idea of statistical potentials with the use of interatomic contact areas instead of distances. Contact areas, derived using Voronoi tessellation of protein structure, are used to describe and seamlessly integrate both explicit interactions between protein atoms and implicit interactions of protein atoms with solvent. VoroMQA produces scores at atomic, residue, and global levels, all in the fixed range from 0 to 1. The method was tested on the CASP data and compared to several other single-model quality assessment methods. VoroMQA showed strong performance in the recognition of the native structure and in the structural model selection tests, thus demonstrating the efficacy of interatomic contact areas in estimating protein structure quality. The software implementation of VoroMQA is freely available as a standalone application and as a web server at http://bioinformatics.lt/software/voromqa. Proteins 2017; 85:1131-1145. © 2017 Wiley Periodicals, Inc.

  18. A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

    Science.gov (United States)

    Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

    2010-08-01

    The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.

  19. Tuning structure of oppositely charged nanoparticle and protein complexes

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Sugam, E-mail: sugam@barc.gov.in; Aswal, V. K., E-mail: sugam@barc.gov.in [Solid State Physics Division, Bhabha Atomic Research Centre, Mumbai-400085 (India); Callow, P. [Institut Laue Langevin, DS/LSS, 6 rue Jules Horowitz, 38042 Grenoble Cedex 9 (France)

    2014-04-24

    Small-angle neutron scattering (SANS) has been used to probe the structures of anionic silica nanoparticles (LS30) and cationic lysozyme protein (M.W. 14.7 kD, I.P. ∼ 11.4) by tuning their interaction through pH variation. The protein adsorption on nanoparticles is found to increase with pH and is determined by the electrostatic attraction between the two components as well as the repulsion between protein molecules. We show that the strong electrostatic attraction between nanoparticles and protein molecules leads to protein-mediated aggregation of nanoparticles, which is characterized by fractal structures. At pH 5, the protein adsorption gives rise to nanoparticle aggregation having surface fractal morphology with close packing of nanoparticles. The surface fractals transform to open structures of mass fractal morphology at higher pH (7 and 9) on approaching the isoelectric point (I.P.).

  20. Studying Membrane Protein Structure and Function Using Nanodiscs

    DEFF Research Database (Denmark)

    Huda, Pie

    The structure and dynamics of membrane proteins can provide valuable information about general functions, diseases and effects of various drugs. Studying membrane proteins is a challenge as an amphiphilic environment is necessary to stabilise the protein in a functionally and structurally relevant...... form. This is most typically achieved through the use of detergent based reconstitution systems. However, time and again such systems fail to provide a suitable environment causing aggregation and inactivation. Nanodiscs are self-assembled lipoproteins containing two membrane scaffold proteins...... and a lipid bilayer in defined nanometer size, which can act as a stabiliser for membrane proteins. This enables both functional and structural investigation of membrane proteins in a detergent free environment which is closer to the native situation. Understanding the self-assembly of nanodiscs is important...

  1. Exploring protein dynamics space: the dynasome as the missing link between protein structure and function.

    Directory of Open Access Journals (Sweden)

    Ulf Hensen

    Full Text Available Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that: 1. the dynasome consists of a continuum of proteins, rather than well separated classes. 2. For the majority of proteins we observe strong correlations between structure and dynamics. 3. Proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics.
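
    The idea of representing each protein as a vector of dynamics descriptors and comparing proteins by the difference between vectors can be sketched as follows. This is only an illustration under stated assumptions: the three descriptor values per protein, the protein names and the z-score normalisation are hypothetical, and the abstract does not say which distance or scaling the authors used.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List


def standardize(fingerprints: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Z-score every descriptor axis so that no single descriptor dominates."""
    names = list(fingerprints)
    # Transpose: one tuple of values per descriptor axis.
    columns = list(zip(*(fingerprints[n] for n in names)))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against constant axes
    return {
        n: [(x - m) / s for x, m, s in zip(fingerprints[n], mus, sigmas)]
        for n in names
    }


def dynamics_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two fingerprints (an assumed metric)."""
    return math.dist(a, b)


# Hypothetical descriptors per protein, e.g. mean fluctuation, a relaxation
# time and a radius-of-gyration spread; the values are made up.
raw = {
    "protA": [1.0, 0.20, 5.0],
    "protB": [1.1, 0.25, 5.5],
    "protC": [3.0, 0.90, 12.0],
}
z = standardize(raw)
d_ab = dynamics_distance(z["protA"], z["protB"])
d_ac = dynamics_distance(z["protA"], z["protC"])
print(f"d(A,B)={d_ab:.3f}  d(A,C)={d_ac:.3f}")
```

    In this toy data protA and protB lie close together while protC is distant, which is the kind of comparison the fingerprint vectors are meant to support.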

  2. Host Proteins Determine MRSA Biofilm Structure and Integrity

    DEFF Research Database (Denmark)

    Dreier, Cindy; Nielsen, Astrid; Jørgensen, Nis Pedersen

    Human extracellular matrix (hECM) proteins aid the initial attachment and initiation of an infection by binding specifically to bacterial cell surface proteins. However, the importance of hECM proteins in structure, integrity and antibiotic resilience of a biofilm is unknown. This study aims...... to determine how specific hECM proteins affect S. aureus USA300 JE2 biofilms. Biofilms were grown in the presence of synovial fluid from rheumatoid arthritis patients to mimic in vivo conditions, where bacteria incorporate hECM proteins into the biofilm matrix. Difference in biofilm structure, with and without...... addition of hECM to growth media, was visualized by confocal laser scanning microscopy. Two enzymatic degradation experiments were used to study biofilm matrix composition and importance of hECM proteins: enzymatic removal of specific hECM proteins from growth media, before biofilm formation, and enzymatic...

  3. Phenolic promiscuity in the cell nucleus--epigallocatechingallate (EGCG) and theaflavin-3,3'-digallate from green and black tea bind to model cell nuclear structures including histone proteins, double stranded DNA and telomeric quadruplex DNA.

    Science.gov (United States)

    Mikutis, Gediminas; Karaköse, Hande; Jaiswal, Rakesh; LeGresley, Adam; Islam, Tuhidul; Fernandez-Lahore, Marcelo; Kuhnert, Nikolai

    2013-02-01

    Flavanols from tea have been reported to accumulate in the cell nucleus in considerable concentrations. The nature of this phenomenon, which could provide novel approaches in understanding the well-known beneficial health effects of tea phenols, is investigated in this contribution. The interaction between epigallocatechin gallate (EGCG) from green tea and a selection of theaflavins from black tea with selected cell nuclear structures such as model histone proteins, double stranded DNA and quadruplex DNA was investigated using mass spectrometry, Circular Dichroism spectroscopy and fluorescent assays. The selected polyphenols were shown to display affinity to all of the selected cell nuclear structures, thereby demonstrating a degree of unexpected molecular promiscuity. Most interestingly, theaflavin-digallate was shown to display the highest affinity to quadruplex DNA reported so far for any naturally occurring molecule. This finding has immediate implications in rationalising the chemopreventive effect of the tea beverage against cancer and possibly the role of tea phenolics as "life span essentials".

  4. Integral membrane protein structure determination using pseudocontact shifts

    Energy Technology Data Exchange (ETDEWEB)

    Crick, Duncan J.; Wang, Jue X. [University of Cambridge, Department of Biochemistry (United Kingdom); Graham, Bim; Swarbrick, James D. [Monash University, Monash Institute of Pharmaceutical Sciences (Australia); Mott, Helen R.; Nietlispach, Daniel, E-mail: dn206@cam.ac.uk [University of Cambridge, Department of Biochemistry (United Kingdom)

    2015-04-15

    Obtaining enough experimental restraints can be a limiting factor in the NMR structure determination of larger proteins. This is particularly the case for large assemblies such as membrane proteins that have been solubilized in a membrane-mimicking environment. Whilst in such cases extensive deuteration strategies are regularly utilised with the aim to improve the spectral quality, these schemes often limit the number of NOEs obtainable, making complementary strategies highly beneficial for successful structure elucidation. Recently, lanthanide-induced pseudocontact shifts (PCSs) have been established as a structural tool for globular proteins. Here, we demonstrate that a PCS-based approach can be successfully applied for the structure determination of integral membrane proteins. Using the 7TM α-helical microbial receptor pSRII, we show that PCS-derived restraints from lanthanide binding tags attached to four different positions of the protein facilitate the backbone structure determination when combined with a limited set of NOEs. In contrast, the same set of NOEs fails to determine the correct 3D fold. The latter situation is frequently encountered in polytopical α-helical membrane proteins and a PCS approach is thus suitable even for this particularly challenging class of membrane proteins. The ease of measuring PCSs makes this an attractive route for structure determination of large membrane proteins in general.

  5. Using linear algebra for protein structural comparison and classification.

    Science.gov (United States)

    Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo

    2009-07-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  6. Using linear algebra for protein structural comparison and classification

    Directory of Open Access Journals (Sweden)

    Janaína Gomide

    2009-01-01

    Full Text Available In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  7. Structural studies on proton/protonation of the protein molecule

    International Nuclear Information System (INIS)

    Morimoto, Yukio; Kida, Akiko; Chatake, Toshiyuki; Yamaguchi, Hiroshi; Hosokawa, Keiichi; Murakami, Takuto; Umino, Masaaki; Tanaka, Ichiro; Hisatome, Ichiro; Yanagisawa, Yasutake; Fujiwara, Satoshi; Hidaka, Yuji; Shimamoto, Shigeru; Fujiwara, Mitsutoshi; Nakanishi, Takeyoshi

    2015-01-01

    This paper reports three studies involved in the analysis of protons and protonation at physiologically active sites in protein molecules. (1) 'Elucidation of the higher-order structure formation and activity performing mechanism of yeast proteasome.' With the aim of application to anti-cancer drugs, this study analyzed the overall shape of the 26S proteasome using small-angle X-ray scattering to clarify the form of the complex in which controlling elements are bound to both ends of the 20S catalytic body, and analyzed the structure of the complex between the active sites of 20S and an inhibitor (drug). (2) 'Basic study on the neutron experiment of biomolecules such as physiologically active substances derived from Natto-bacteria.' This study conducted the purification, crystallization, and X-ray analysis of nattokinase; the high-grade purification and solution experiment of vitamin K2 (menaquinone-7); and, as a structural study of another biomolecule, a Z-DNA crystal structure study related to the neutron crystal analysis of DNA. (3) 'Functional evaluation on digestive enzymes derived from Nephila clavata.' As an Alzheimer's disease-related amyloid fibril formation model, this study investigated the fibrosis and fiber-forming mechanism of the traction fiber of Nephila clavata, and carried out a functional analysis of its degrading enzyme. (A.O.)

  8. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

    Science.gov (United States)

    Tian, Pengfei; Best, Robert B

    2017-10-17

    Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
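
    The contrast between absolute and relative sequence capacity can be made concrete with a short calculation in log space. This is a rough sketch under stated assumptions: the 35-residue length used for a WW domain and the 100-residue comparison are illustrative values, and the Avogadro constant is used only as the lower bound on capacity that the abstract mentions, not as the paper's estimate.

```python
import math

AVOGADRO = 6.02214076e23  # exact SI value of the Avogadro constant, mol^-1


def log10_sequence_space(length: int, alphabet: int = 20) -> float:
    """log10 of the number of possible sequences, alphabet ** length."""
    return length * math.log10(alphabet)


def log10_relative_capacity(log10_capacity: float, length: int) -> float:
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_sequence_space(length)


# Assumed length for a small domain; the abstract gives no length.
short_len = 35
# Lower bound from the abstract: capacity exceeds the Avogadro constant.
log10_capacity = math.log10(AVOGADRO)

space = log10_sequence_space(short_len)
relative = log10_relative_capacity(log10_capacity, short_len)
print(f"sequence space      ~ 10^{space:.1f}")
print(f"relative capacity  >= 10^{relative:.1f}")

# Holding the capacity fixed, a longer chain has a far larger sequence space,
# so the same capacity is a much smaller fraction of it.
relative_long = log10_relative_capacity(log10_capacity, 100)
print(f"same capacity at L=100: 10^{relative_long:.1f}")
```

    Working with logarithms avoids overflow and shows directly why a capacity above the Avogadro constant is still a vanishing fraction of sequence space.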

  9. Structural Mass Spectrometry of Proteins Using Hydroxyl Radical Based Protein Footprinting

    OpenAIRE

    Wang, Liwen; Chance, Mark R.

    2011-01-01

    Structural MS is a rapidly growing field with many applications in basic research and pharmaceutical drug development. In this feature article the overall technology is described and several examples of how hydroxyl radical based footprinting MS can be used to map interfaces, evaluate protein structure, and identify ligand dependent conformational changes in proteins are described.

  10. Physiologically Based Pharmacokinetic Modeling of Therapeutic Proteins.

    Science.gov (United States)

    Wong, Harvey; Chow, Timothy W

    2017-09-01

    Biologics or therapeutic proteins are becoming increasingly important as treatments for disease. The most common class of biologics is monoclonal antibodies (mAbs). Recently, there has been an increase in the use of physiologically based pharmacokinetic (PBPK) modeling in the pharmaceutical industry in drug development. We review PBPK models for therapeutic proteins with an emphasis on mAbs. Due to their size and similarity to endogenous antibodies, there are distinct differences between PBPK models for small molecules and mAbs. The high-level organization of a typical mAb PBPK model consists of a whole-body PBPK model with organ compartments interconnected by both blood and lymph flows. The whole-body PBPK model is coupled with tissue-level submodels used to describe key mechanisms governing mAb disposition including tissue efflux via the lymphatic system, elimination by catabolism, protection from catabolism by binding to the neonatal Fc receptor (FcRn), and nonlinear binding to specific pharmacological targets of interest. The use of PBPK modeling in the development of therapeutic proteins is still in its infancy. Further application of PBPK modeling for therapeutic proteins will help to define its developing role in drug discovery and development. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  11. Predicting turns in proteins with a unified model.

    Directory of Open Access Journals (Sweden)

    Qi Song

    Full Text Available MOTIVATION: Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. RESULTS: In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are the profile of sequence, the profile of secondary structures and the profile of shape strings, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded the most state-of-the-art predictors of certain types of turn. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.

  12. Effects of lysine residues on structural characteristics and stability of tau proteins

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Myeongsang; Baek, Inchul; Choi, Hyunsung; Kim, Jae In; Na, Sungsoo, E-mail: nass@korea.ac.kr

    2015-10-23

    Pathological amyloid proteins have been implicated in neurodegenerative diseases, specifically Alzheimer's, Parkinson's, Lewy-body diseases and prion-related diseases. In prion-related diseases, functional tau proteins can be transformed into pathological agents by environmental factors, including oxidative stress, inflammation, Aβ-mediated toxicity and covalent modification. These pathological agents are stable under physiological conditions and are not easily degraded. This resistance to degradation enables the use of tau proteins as functional materials for capturing carbon dioxide. To use amyloid proteins efficiently as functional materials, a basic study of their structural characteristics is necessary. Here, we investigated at the atomistic scale the basic structure of wild-type (WT) tau protein and of tau protein with a glutamine residue mutated to lysine (Q2K). We also report the size effect of both the WT and Q2K structures, which allowed us to assess the stability of those amyloid structures. - Highlights: • Lysine mutation alters the structural conformation and characteristics of tau. • Over the 15 layers both WT and Q2K models, both tau proteins undergo fractions. • Lysine mutation increases the non-bonded energy and solvent accessible surface area. • Structural instability of the Q2K model was shown by analysis of the number of hydrogen bonds.

  13. Effects of lysine residues on structural characteristics and stability of tau proteins

    International Nuclear Information System (INIS)

    Lee, Myeongsang; Baek, Inchul; Choi, Hyunsung; Kim, Jae In; Na, Sungsoo

    2015-01-01

    Pathological amyloid proteins have been implicated in neurodegenerative diseases, specifically Alzheimer's, Parkinson's, Lewy-body diseases and prion-related diseases. In prion-related diseases, functional tau proteins can be transformed into pathological agents by environmental factors, including oxidative stress, inflammation, Aβ-mediated toxicity and covalent modification. These pathological agents are stable under physiological conditions and are not easily degraded. This resistance to degradation enables the use of tau proteins as functional materials for capturing carbon dioxide. To use amyloid proteins efficiently as functional materials, a basic study of their structural characteristics is necessary. Here, we investigated at the atomistic scale the basic structure of wild-type (WT) tau protein and of tau protein with a glutamine residue mutated to lysine (Q2K). We also report the size effect of both the WT and Q2K structures, which allowed us to assess the stability of those amyloid structures. - Highlights: • Lysine mutation alters the structural conformation and characteristics of tau. • Over the 15 layers both WT and Q2K models, both tau proteins undergo fractions. • Lysine mutation increases the non-bonded energy and solvent accessible surface area. • Structural instability of the Q2K model was shown by analysis of the number of hydrogen bonds.

  14. Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

    Science.gov (United States)

    Wang, Leilei; Cheng, Jinyong

    2018-03-01

    Protein secondary structure prediction belongs to bioinformatics and is an important research area. In this paper, we propose a new method for protein secondary structure prediction using a Bayes classifier and an autoencoder network. Our experiments cover several algorithmic steps, including the construction of the model and the classification of parameters. The data set is the standard CB513 protein data set. Accuracy is evaluated by 3-fold cross-validation, from which we obtain the Q3 accuracy. The results show that the autoencoder network improves the prediction accuracy of protein secondary structure.
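
    Q3 accuracy is the fraction of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed one. A minimal sketch of that calculation follows; the function name and the toy sequences are illustrative and are not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues with the correct three-state label (H, E or C)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    allowed = set("HEC")
    if not (set(predicted) <= allowed and set(observed) <= allowed):
        raise ValueError("labels must be one of H, E, C")
    # Count positions where prediction and observation agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example: ten residues, one helix residue mispredicted as strand.
observed = "CCHHHHEEEC"
predicted = "CCHHHEEEEC"
q3 = q3_accuracy(predicted, observed)
print(f"Q3 = {q3:.2f}")
```

    In a 3-fold cross-validation the same function would be applied to the held-out fold of each split and the three values averaged.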

  15. Evaluation of variability in high-resolution protein structures by global distance scoring

    Directory of Open Access Journals (Sweden)

    Risa Anzai

    2018-01-01

    Full Text Available Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most of the current structural comparisons are pairwise-based, which hampers the global analysis of increasing contents in the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently accompanies other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we have tested 300 human proteins and shown that the method can be used to comprehensively overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.

  16. Evaluation of variability in high-resolution protein structures by global distance scoring.

    Science.gov (United States)

    Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji

    2018-01-01

    Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most of the current structural comparisons are pairwise-based, which hampers the global analysis of increasing contents in the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently accompanies other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we have tested 300 human proteins and showed that the method is comprehensively used to overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
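
    The abstract does not define its score, so the following is only an illustrative sketch of why intramolecular distances avoid superimposition: distances between atoms of the same structure do not change under rotation or translation. The function names and the choice of a mean per-pair standard deviation are assumptions, not the published scoring scheme.

```python
import math
from itertools import combinations
from statistics import pstdev


def pairwise_distances(coords):
    """All intramolecular distances between atom pairs of one structure."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_variability(structures):
    """Mean, over atom pairs, of the standard deviation of each pair distance
    across several structures of the same protein (hypothetical score)."""
    if len(structures) < 2:
        raise ValueError("need at least two structures")
    n_atoms = len(structures[0])
    if n_atoms < 2 or any(len(s) != n_atoms for s in structures):
        raise ValueError("structures must share the same atom count (>= 2)")
    per_structure = [pairwise_distances(s) for s in structures]
    # zip(*...) groups the same atom pair across all structures.
    spreads = [pstdev(values) for values in zip(*per_structure)]
    return sum(spreads) / len(spreads)


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # Same shape translated by +5 along x: variability stays at zero.
    b = [(x + 5.0, y, z) for x, y, z in a]
    print(distance_variability([a, b]))
```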

  17. Refinement of homology-based protein structures by molecular dynamics simulation techniques

    NARCIS (Netherlands)

    Fan, H; Mark, AE

    The use of classical molecular dynamics simulations, performed in explicit water, for the refinement of structural models of proteins generated ab initio or based on homology has been investigated. The study involved a test set of 15 proteins that were previously used by Baker and coworkers to

  18. Structural study of surfactant-dependent interaction with protein

    Energy Technology Data Exchange (ETDEWEB)

    Mehan, Sumit; Aswal, Vinod K., E-mail: vkaswal@barc.gov.in [Solid State Physics Division, Bhabha Atomic Research Centre, Mumbai 400 085 (India); Kohlbrecher, Joachim [Laboratory for Neutron Scattering, Paul Scherrer Institut, CH-5232 PSI Villigen (Switzerland)

    2015-06-24

    Small-angle neutron scattering (SANS) has been used to study the complex structure of anionic BSA protein with three different (cationic DTAB, anionic SDS and non-ionic C12E10) surfactants. These systems form very different surfactant-dependent complexes. We show that the structure of protein-surfactant complex is initiated by the site-specific electrostatic interaction between the components, followed by the hydrophobic interaction at high surfactant concentrations. It is also found that hydrophobic interaction is preferred over the electrostatic interaction in deciding the resultant structure of protein-surfactant complexes.

  19. Rhabdovirus matrix protein structures reveal a novel mode of self-association.

    Directory of Open Access Journals (Sweden)

    Stephen C Graham

    2008-12-01

    Full Text Available The matrix (M) proteins of rhabdoviruses are multifunctional proteins essential for virus maturation and budding that also regulate the expression of viral and host proteins. We have solved the structures of M from the vesicular stomatitis virus serotype New Jersey (genus: Vesiculovirus) and from Lagos bat virus (genus: Lyssavirus), revealing that both share a common fold despite sharing no identifiable sequence homology. Strikingly, in both structures a stretch of residues from the otherwise-disordered N terminus of a crystallographically adjacent molecule is observed binding to a hydrophobic cavity on the surface of the protein, thereby forming non-covalent linear polymers of M in the crystals. While the overall topology of the interaction is conserved between the two structures, the molecular details of the interactions are completely different. The observed interactions provide a compelling model for the flexible self-assembly of the matrix protein during virion morphogenesis and may also modulate interactions with host proteins.

  20. The protein structure determines the sensitizing capacity of Brazil nut 2S albumin (Ber e1) in a rat food allergy model

    NARCIS (Netherlands)

    Bilsen, J.H. van; Knippels, L.M.; Penninks, A.H.; Nieuwenhuizen, W.F.; Jongh, H.H. de; Koppelman, S.J.

    2013-01-01

    It is not exactly known why certain food proteins are more likely to sensitize. One of the characteristics of most food allergens is that they are stable to the acidic and proteolytic conditions in the digestive tract. This property is thought to be a risk factor in allergic sensitization. The

  1. p15PAF is an intrinsically disordered protein with nonrandom structural preferences at sites of interaction with other proteins.

    Science.gov (United States)

    De Biasio, Alfredo; Ibáñez de Opakua, Alain; Cordeiro, Tiago N; Villate, Maider; Merino, Nekane; Sibille, Nathalie; Lelli, Moreno; Diercks, Tammo; Bernadó, Pau; Blanco, Francisco J

    2014-02-18

    We present to our knowledge the first structural characterization of the proliferating-cell-nuclear-antigen-associated factor p15(PAF), showing that it is monomeric and intrinsically disordered in solution but has nonrandom conformational preferences at sites of protein-protein interactions. p15(PAF) is a 12 kDa nuclear protein that acts as a regulator of DNA repair during DNA replication. The p15(PAF) gene is overexpressed in several types of human cancer. The nearly complete NMR backbone assignment of p15(PAF) allowed us to measure 86 N-H(N) residual dipolar couplings. Our residual dipolar coupling analysis reveals nonrandom conformational preferences in distinct regions, including the proliferating-cell-nuclear-antigen-interacting protein motif (PIP-box) and the KEN-box (recognized by the ubiquitin ligase that targets p15(PAF) for degradation). In accordance with these findings, analysis of the (15)N R2 relaxation rates shows a relatively reduced mobility for the residues in these regions. The agreement between the experimental small angle x-ray scattering curve of p15(PAF) and that computed from a statistical coil ensemble corrected for the presence of local secondary structural elements further validates our structural model for p15(PAF). The coincidence of these transiently structured regions with protein-protein interaction and posttranslational modification sites suggests a possible role for these structures as molecular recognition elements for p15(PAF). Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  2. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    Science.gov (United States)

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  3. 3D complex: a structural classification of protein complexes.

    Directory of Open Access Journals (Sweden)

    Emmanuel D Levy

    2006-11-01

    Full Text Available Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

  4. Structural and binding studies of SAP-1 protein with heparin.

    Science.gov (United States)

    Yadav, Vikash K; Mandal, Rahul S; Puniya, Bhanwar L; Kumar, Rahul; Dey, Sharmistha; Singh, Sarman; Yadav, Savita

    2015-03-01

    SAP-1 is a low molecular weight cysteine protease inhibitor (CPI) which belongs to the type-2 cystatin family. SAP-1 protein purified from human seminal plasma (HuSP) has been shown to inhibit cysteine and serine proteases and exhibit interesting biological properties, including high temperature and pH stability. Heparin is a naturally occurring glycosaminoglycan (with varied chain length) which interacts with a number of proteins and regulates multiple steps in different biological processes. As an anticoagulant, heparin enhances inhibition of thrombin by the serpin antithrombin III. Therefore, we have employed surface plasmon resonance (SPR) to improve our understanding of the binding interaction between heparin and SAP-1 (protease inhibitor). SPR data suggest that SAP-1 binds to heparin with a significant affinity (KD = 158 nM). SPR solution competition studies using heparin oligosaccharides showed that the binding of SAP-1 to heparin is dependent on chain length. Large oligosaccharides show strong binding affinity for SAP-1. To gain further insight into the structural aspects of the interaction between SAP-1 and heparin, we docked a modelled structure of SAP-1 with heparin and heparin-derived polysaccharides. The results suggest that a positively charged lysine residue plays an important role in these interactions. Such information should improve our understanding of how heparin, present in the reproductive tract, regulates cystatin activity. © 2014 John Wiley & Sons A/S.
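
    To give the reported dissociation constant a concrete reading, the sketch below evaluates equilibrium occupancy under a simple 1:1 binding model, in which the bound fraction is c / (KD + c). Both the 1:1 model and the interpretation of the reported value of 158 as nanomolar are assumptions made for illustration; the abstract does not state the fitting model used.

```python
def fraction_bound(free_conc_nm: float, kd_nm: float) -> float:
    """Equilibrium fraction of sites occupied in a 1:1 binding model (assumed)."""
    if free_conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return free_conc_nm / (kd_nm + free_conc_nm)


KD_NM = 158.0  # value quoted in the abstract, assumed here to be nanomolar

if __name__ == "__main__":
    # Hypothetical free concentrations of the binding partner, in nM.
    for conc in (15.8, 158.0, 1580.0):
        print(f"{conc:8.1f} nM -> fraction bound {fraction_bound(conc, KD_NM):.3f}")
```

    Under this model half of the sites are occupied when the free concentration equals KD, which is the usual way to interpret such a value.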

  5. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  6. Structural dynamic modifications via models

    Indian Academy of Sciences (India)

    The study shows that as many as half of the matrix ... the dynamicist's analytical modelling skill which would appear both in the numerator as. Figure 2. ..... Brandon J A 1990 Strategies for structural dynamic modification (New York: John Wiley).

  7. Structure-Based Turbulence Model

    National Research Council Canada - National Science Library

    Reynolds, W

    2000-01-01

    .... Maire carried out this work as part of his PhD research. During the award period we began to explore ways to simplify the structure-based modeling so that it could be used in repetitive engineering calculations...

  8. Probabilistic modeling of timber structures

    DEFF Research Database (Denmark)

    Köhler, Jochen; Sørensen, John Dalsgaard; Faber, Michael Havbro

    2007-01-01

    The present paper contains a proposal for the probabilistic modeling of timber material properties. It is produced in the context of the Probabilistic Model Code (PMC) of the Joint Committee on Structural Safety (JCSS) [Joint Committee of Structural Safety. Probabilistic Model Code, Internet...... Publication: www.jcss.ethz.ch; 2001] and of the COST action E24 ‘Reliability of Timber Structures' [COST Action E 24, Reliability of timber structures. Several meetings and Publications, Internet Publication: http://www.km.fgg.uni-lj.si/coste24/coste24.htm; 2005]. The present proposal is based on discussions...... and comments from participants of the COST E24 action and the members of the JCSS. The paper contains a description of the basic reference properties for timber strength parameters and ultimate limit state equations for timber components. The recommended probabilistic model for these basic properties...

  9. Protein homology model refinement by large-scale energy optimization.

    Science.gov (United States)

    Park, Hahnbeom; Ovchinnikov, Sergey; Kim, David E; DiMaio, Frank; Baker, David

    2018-03-20

    Proteins fold to their lowest free-energy structures, and hence the most straightforward way to increase the accuracy of a partially incorrect protein structure model is to search for the lowest-energy nearby structure. This direct approach has met with little success for two reasons: first, energy function inaccuracies can lead to false energy minima, resulting in model degradation rather than improvement; and second, even with an accurate energy function, the search problem is formidable because the energy only drops considerably in the immediate vicinity of the global minimum, and there are a very large number of degrees of freedom. Here we describe a large-scale energy optimization-based refinement method that incorporates advances in both search and energy function accuracy that can substantially improve the accuracy of low-resolution homology models. The method refined low-resolution homology models into correct folds for 50 of 84 diverse protein families and generated improved models in recent blind structure prediction experiments. Analyses of the basis for these improvements reveal contributions from both the improvements in conformational sampling techniques and the energy function.

  10. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan

    2013-02-08

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  11. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan; Barbato, Alessandro; Tramontano, Anna

    2013-01-01

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  12. Temporal structures in shell models

    DEFF Research Database (Denmark)

    Okkels, F.

    2001-01-01

    The intermittent dynamics of the turbulent Gledzer, Ohkitani, and Yamada shell-model is completely characterized by a single type of burstlike structure, which moves through the shells like a front. This temporal structure is described by the dynamics of the instantaneous configuration of the shell...

  13. Structuring very large domain models

    DEFF Research Database (Denmark)

    Störrle, Harald

    2010-01-01

    View/Viewpoint approaches like IEEE 1471-2000, or Kruchten's 4+1-view model are used to structure software architectures at a high level of granularity. While research has focused on architectural languages and with consistency between multiple views, practical questions such as the structuring a...

  14. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein In vitro Digestibility and Solubility.

    Science.gov (United States)

    Bai, Mingmei; Qin, Guixin; Sun, Zewei; Long, Guohui

    2016-08-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility as well as digestibility in monogastric animals. To address this question we analyzed soybean meal, fish meal, corn distiller's dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relatively quantitative amounts of α-helices, random coils, and α-helix to β-sheet ratio in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (pdigestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  15. Effects of NMR spectral resolution on protein structure calculation.

    Directory of Open Access Journals (Sweden)

    Suhas Tikole

    Full Text Available Adequate digital resolution and signal sensitivity are two critical factors for protein structure determinations by solution NMR spectroscopy. The prime objective for obtaining high digital resolution is to resolve peak overlap, especially in NOESY spectra with thousands of signals where the signal analysis needs to be performed on a large scale. Achieving maximum digital resolution is usually limited by the practically available measurement time. We developed a method utilizing non-uniform sampling for balancing digital resolution and signal sensitivity, and performed a large-scale analysis of the effect of the digital resolution on the accuracy of the resulting protein structures. Structure calculations were performed as a function of digital resolution for about 400 proteins with molecular sizes ranging between 5 and 33 kDa. The structural accuracy was assessed by atomic coordinate RMSD values from the reference structures of the proteins. In addition, we monitored also the number of assigned NOESY cross peaks, the average signal sensitivity, and the chemical shift spectral overlap. We show that high resolution is equally important for proteins of every molecular size. The chemical shift spectral overlap depends strongly on the corresponding spectral digital resolution. Thus, knowing the extent of overlap can be a predictor of the resulting structural accuracy. Our results show that for every molecular size a minimal digital resolution, corresponding to the natural linewidth, needs to be achieved for obtaining the highest accuracy possible for the given protein size using state-of-the-art automated NOESY assignment and structure calculation methods.
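
    The accuracy measure used in this abstract, atomic coordinate RMSD from a reference structure, can be sketched as follows. The sketch assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the optimal superposition step normally performed beforehand is omitted, and the function name is illustrative.

```python
import math


def coordinate_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered 3D coordinate lists.

    Assumes the structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    calculated = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(coordinate_rmsd(reference, calculated))
```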

  16. Penetration of the signal sequence of Escherichia coli PhoE protein into phospholipid model membranes leads to lipid-specific changes in signal peptide structure and alterations of lipid organization

    International Nuclear Information System (INIS)

    Batenburg, A.M.; Demel, R.A.; Verkleij, A.J.; de Kruijff, B.

    1988-01-01

    In order to obtain more insight into the initial steps of the process of protein translocation across membranes, biophysical investigations were undertaken on the lipid specificity and structural consequences of penetration of the PhoE signal peptide into lipid model membranes and on the conformation of the signal peptide adopted upon interaction with the lipids. When the monolayer technique and differential scanning calorimetry are used, a stronger penetration is observed for negatively charged lipids, significantly influenced by the physical state of the lipid but not by temperature or acyl chain unsaturation as such. Although the interaction is principally electrostatic, as indicated also by the strong penetration of N-terminal fragments into negatively charged lipid monolayers, the effect of ionic strength suggests an additional hydrophobic component. Most interestingly with regard to the mechanism of protein translocation, the molecular area of the peptide in the monolayer also shows lipid specificity: the area in the presence of PC is consistent with a looped helical orientation, whereas in the presence of cardiolipin a time-dependent conformational change is observed, most likely leading from a looped to a stretched orientation with the N-terminus directed toward the water. This is in line also with the determined peptide-lipid stoichiometry. Preliminary 31P NMR and electron microscopy data on the interaction with lipid bilayer systems indicate loss of bilayer structure.

  17. Structural basis for target protein recognition by the protein disulfide reductase thioredoxin

    DEFF Research Database (Denmark)

    Maeda, Kenji; Hägglund, Per; Finnie, Christine

    2006-01-01

    Thioredoxin is ubiquitous and regulates various target proteins through disulfide bond reduction. We report the structure of thioredoxin (HvTrxh2 from barley) in a reaction intermediate complex with a protein substrate, barley alpha-amylase/subtilisin inhibitor (BASI). The crystal structure...... of this mixed disulfide shows a conserved hydrophobic motif in thioredoxin interacting with a sequence of residues from BASI through van der Waals contacts and backbone-backbone hydrogen bonds. The observed structural complementarity suggests that the recognition of features around protein disulfides plays...... a major role in the specificity and protein disulfide reductase activity of thioredoxin. This novel insight into the function of thioredoxin constitutes a basis for comprehensive understanding of its biological role. Moreover, comparison with structurally related proteins shows that thioredoxin shares...

  18. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs.

    Science.gov (United States)

    Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude

    2011-06-20

    One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
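
    The motif extraction step described here, taking seven-residue words from loops encoded as one-dimensional structural-alphabet sequences, can be sketched as a sliding-window count. The letter strings below are hypothetical, and the over-representation statistics the authors apply afterwards are not reproduced.

```python
from collections import Counter


def count_motifs(encoded_loops, length=7):
    """Count every word of the given length across structural-letter sequences."""
    counts = Counter()
    for letters in encoded_loops:
        # Sequences shorter than the word length yield an empty range.
        for start in range(len(letters) - length + 1):
            counts[letters[start:start + length]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical loops already encoded with a structural alphabet.
    loops = ["ABCDEFGH", "ABCDEFG", "XYZ"]
    for motif, n in count_motifs(loops).most_common():
        print(motif, n)
```

    Observed counts like these would then be compared with the counts expected by chance within each superfamily to decide which motifs are over-represented.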

  19. Fatgraph models of RNA structure

    Directory of Open Access Journals (Sweden)

    Huang Fenix

    2017-01-01

    Full Text Available In this review paper we discuss fatgraphs as a conceptual framework for RNA structures. We discuss various notions of coarse-grained RNA structures and relate them to fatgraphs. We motivate and discuss the main intuition behind the fatgraph model and showcase its applicability to canonical as well as noncanonical base pairs. Recent discoveries regarding novel recursions of pseudoknotted (pk) configurations as well as their translation into context-free grammars for pk-structures are discussed. This is shown to allow for extending the concept of partition functions of sequences w.r.t. a fixed structure having non-crossing arcs to pk-structures. We discuss minimum free energy folding of pk-structures and combine these above results outlining how to obtain an inverse folding algorithm for pk-structures.

  20. GalaxyHomomer: a web server for protein homo-oligomer structure prediction from a monomer sequence or structure.

    Science.gov (United States)

    Baek, Minkyung; Park, Taeyong; Heo, Lim; Park, Chiwook; Seok, Chaok

    2017-07-03

    Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Structure and function of nanoparticle-protein conjugates

    International Nuclear Information System (INIS)

    Aubin-Tam, M-E; Hamad-Schifferli, K

    2008-01-01

    Conjugation of proteins to nanoparticles has numerous applications in sensing, imaging, delivery, catalysis, therapy and control of protein structure and activity. Therefore, characterizing the nanoparticle-protein interface is of great importance. A variety of covalent and non-covalent linking chemistries have been reported for nanoparticle attachment. Site-specific labeling is desirable in order to control the protein orientation on the nanoparticle, which is crucial in many applications such as fluorescence resonance energy transfer. We evaluate methods for successful site-specific attachment. Typically, a specific protein residue is linked directly to the nanoparticle core or to the ligand. As conjugation often affects the protein structure and function, techniques to probe structure and activity are assessed. We also examine how molecular dynamics simulations of conjugates would complement those experimental techniques in order to provide atomistic details on the effect of nanoparticle attachment. Characterization studies of nanoparticle-protein complexes show that the structure and function are influenced by the chemistry of the nanoparticle ligand, the nanoparticle size, the nanoparticle material, the stoichiometry of the conjugates, the labeling site on the protein and the nature of the linkage (covalent versus non-covalent).

  2. Computing a new family of shape descriptors for protein structures

    DEFF Research Database (Denmark)

    Røgen, Peter; Sinclair, Robert

    2003-01-01

    The large-scale 3D structure of a protein can be represented by the polygonal curve through the carbon α atoms of the protein backbone. We introduce an algorithm for computing the average number of times that a given configuration of crossings on such polygonal curves is seen, the average being...

  3. Protein structure estimation from NMR data by matrix completion.

    Science.gov (United States)

    Li, Zhicheng; Li, Yang; Lei, Qiang; Zhao, Qing

    2017-09-01

    Knowledge of protein structures is very important to understand their corresponding physical and chemical properties. Nuclear Magnetic Resonance (NMR) spectroscopy is one of the main methods to measure protein structure. In this paper, we propose a two-stage approach to calculate the structure of a protein from a highly incomplete distance matrix, where most data are obtained from NMR. We first randomly "guess" a small part of unobservable distances by utilizing the triangle inequality, which is crucial for the second stage. Then we use matrix completion to calculate the protein structure from the obtained incomplete distance matrix. We apply the accelerated proximal gradient algorithm to solve the corresponding optimization problem. Furthermore, the recovery error of our method is analyzed, and its efficiency is demonstrated by several practical examples.
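    The first stage described above (guessing unobserved distances with the help of the triangle inequality) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of `None` for missing entries and the uniform sampling between the bounds are all assumptions made for illustration, and the matrix-completion stage is not shown.

```python
import random


def triangle_bounds(dist, i, j):
    """Return (lower, upper) bounds on the unknown distance d(i, j).

    dist is a square list of lists; None marks an unobserved entry.
    For every third point k with both d(i, k) and d(k, j) observed:
        |d(i, k) - d(k, j)| <= d(i, j) <= d(i, k) + d(k, j)
    """
    lower, upper = 0.0, float("inf")
    for k in range(len(dist)):
        if k in (i, j):
            continue
        a, b = dist[i][k], dist[k][j]
        if a is None or b is None:
            continue
        lower = max(lower, abs(a - b))
        upper = min(upper, a + b)
    return lower, upper


def guess_distance(dist, i, j, rng=random):
    """Draw a guess for d(i, j) uniformly between its triangle bounds.

    Uniform sampling is an illustrative assumption, not taken from the paper.
    Returns None when no third point constrains the pair.
    """
    lower, upper = triangle_bounds(dist, i, j)
    if upper == float("inf"):
        return None
    return rng.uniform(lower, upper)


# Toy example: d(0, 1) = 3 and d(1, 2) = 4 are observed, d(0, 2) is missing.
dist = [
    [0.0, 3.0, None],
    [3.0, 0.0, 4.0],
    [None, 4.0, 0.0],
]
bounds = triangle_bounds(dist, 0, 2)
guess = guess_distance(dist, 0, 2, random.Random(0))
```

    In this toy case the missing distance is constrained to lie between 1 and 7, and any guess drawn from that interval is consistent with the observed entries.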

  4. Model of a DNA-protein complex of the architectural monomeric protein MC1 from Euryarchaea.

    Directory of Open Access Journals (Sweden)

    Françoise Paquet

    Full Text Available In Archaea the two major modes of DNA packaging are wrapping by histone proteins or bending by architectural non-histone proteins. To supplement our knowledge about the binding mode of the different DNA-bending proteins observed across the three domains of life, we present here the first model of a complex in which the monomeric Methanogen Chromosomal protein 1 (MC1) from Euryarchaea binds to the concave side of a strongly bent DNA. In laboratory growth conditions MC1 is the most abundant architectural protein present in Methanosarcina thermophila CHTI55. Like most proteins that strongly bend DNA, MC1 is known to bind in the minor groove. Interaction areas for MC1 and DNA were mapped by Nuclear Magnetic Resonance (NMR) data. The polarity of protein binding was determined using paramagnetic probes attached to the DNA. The first structural model of the DNA-MC1 complex we propose here was obtained by two complementary docking approaches and is in good agreement with the experimental data previously provided by electron microscopy and biochemistry. Residues essential to DNA-binding and -bending were highlighted and confirmed by site-directed mutagenesis. It was found that the Arg25 side-chain was essential to neutralize the negative charge of two phosphates that come very close in response to a dramatic curvature of the DNA.

  5. Computational design of proteins with novel structure and functions

    International Nuclear Information System (INIS)

    Yang Wei; Lai Lu-Hua

    2016-01-01

    Computational design of proteins is a relatively new field, where scientists search the enormous sequence space for sequences that can fold into desired structure and perform desired functions. With the computational approach, proteins can be designed, for example, as regulators of biological processes, novel enzymes, or as biotherapeutics. These approaches not only provide valuable information for understanding of sequence–structure–function relations in proteins, but also hold promise for applications to protein engineering and biomedical research. In this review, we briefly introduce the rationale for computational protein design, then summarize the recent progress in this field, including de novo protein design, enzyme design, and design of protein–protein interactions. Challenges and future prospects of this field are also discussed. (topical review)

  6. Potato leafroll virus structural proteins manipulate overlapping, yet distinct protein interaction networks during infection.

    Science.gov (United States)

    DeBlasio, Stacy L; Johnson, Richard; Sweeney, Michelle M; Karasev, Alexander; Gray, Stewart M; MacCoss, Michael J; Cilia, Michelle

    2015-06-01

    Potato leafroll virus (PLRV) produces a readthrough protein (RTP) via translational readthrough of the coat protein amber stop codon. The RTP functions as a structural component of the virion and as a nonincorporated protein in concert with numerous insect and plant proteins to regulate virus movement/transmission and tissue tropism. Affinity purification coupled to quantitative MS was used to generate protein interaction networks for a PLRV mutant that is unable to produce the read through domain (RTD) and compared to the known wild-type PLRV protein interaction network. By quantifying differences in the protein interaction networks, we identified four distinct classes of PLRV-plant interactions: those plant and nonstructural viral proteins interacting with assembled coat protein (category I); plant proteins in complex with both coat protein and RTD (category II); plant proteins in complex with the RTD (category III); and plant proteins that had higher affinity for virions lacking the RTD (category IV). Proteins identified as interacting with the RTD are potential candidates for regulating viral processes that are mediated by the RTP such as phloem retention and systemic movement and can potentially be useful targets for the development of strategies to prevent infection and/or viral transmission of Luteoviridae species that infect important crop species. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Analysis and Ranking of Protein-Protein Docking Models Using Inter-Residue Contacts and Inter-Molecular Contact Maps

    KAUST Repository

    Oliva, Romina; Chermak, Edrisse; Cavallo, Luigi

    2015-01-01

    In view of the increasing interest both in inhibitors of protein-protein interactions and in protein drugs themselves, analysis of the three-dimensional structure of protein-protein complexes is assuming greater relevance in drug design. In the many cases where an experimental structure is not available, protein-protein docking becomes the method of choice for predicting the arrangement of the complex. However, reliably scoring protein-protein docking poses is still an unsolved problem. As a consequence, the screening of many docking models is usually required in the analysis step, to possibly single out the correct ones. Here, making use of exemplary cases, we review our recently introduced methods for the analysis of protein complex structures and for the scoring of protein docking poses, based on the use of inter-residue contacts and their visualization in inter-molecular contact maps. We also show that the ensemble of tools we developed can be used in the context of rational drug design targeting protein-protein interactions.
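    As a rough illustration of the idea of inter-residue contacts and inter-molecular contact maps, the sketch below lists residue pairs across two chains that have any pair of atoms within a distance cutoff, and then measures how many contacts of a reference complex a docking model reproduces. It is not the authors' tool set: the 5 Å cutoff, the data layout and the function names are assumptions chosen for the example.

```python
import math


def contact_map(chain_a, chain_b, cutoff=5.0):
    """Return the set of (residue_a, residue_b) pairs in contact.

    chain_a and chain_b map a residue label to a list of (x, y, z) atom
    coordinates in angstroms. Two residues are counted as in contact when
    any pair of their atoms lies within `cutoff` (an assumed value).
    """
    contacts = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                contacts.add((res_a, res_b))
    return contacts


def contact_overlap(model_contacts, reference_contacts):
    """Fraction of the reference contacts that the model reproduces."""
    if not reference_contacts:
        return 0.0
    return len(model_contacts & reference_contacts) / len(reference_contacts)


# Hypothetical coordinates, one atom per residue for brevity.
chain_a = {"A:ARG25": [(0.0, 0.0, 0.0)], "A:GLY26": [(10.0, 0.0, 0.0)]}
chain_b = {"B:ASP7": [(3.0, 0.0, 0.0)], "B:LEU8": [(20.0, 0.0, 0.0)]}

contacts = contact_map(chain_a, chain_b)
overlap = contact_overlap(contacts, {("A:ARG25", "B:ASP7"), ("A:GLY26", "B:LEU8")})
```

    The resulting set of pairs is the sparse form of a contact map; plotting residues of one chain against residues of the other with a mark for each pair gives the familiar two-dimensional picture.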

  8. Analysis and Ranking of Protein-Protein Docking Models Using Inter-Residue Contacts and Inter-Molecular Contact Maps

    KAUST Repository

    Oliva, Romina

    2015-07-01

    In view of the increasing interest both in inhibitors of protein-protein interactions and in protein drugs themselves, analysis of the three-dimensional structure of protein-protein complexes is assuming greater relevance in drug design. In the many cases where an experimental structure is not available, protein-protein docking becomes the method of choice for predicting the arrangement of the complex. However, reliably scoring protein-protein docking poses is still an unsolved problem. As a consequence, the screening of many docking models is usually required in the analysis step, to possibly single out the correct ones. Here, making use of exemplary cases, we review our recently introduced methods for the analysis of protein complex structures and for the scoring of protein docking poses, based on the use of inter-residue contacts and their visualization in inter-molecular contact maps. We also show that the ensemble of tools we developed can be used in the context of rational drug design targeting protein-protein interactions.

  9. Influence of degree correlations on network structure and stability in protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    Zimmer Ralf

    2007-08-01

    Full Text Available Abstract Background: The existence of negative correlations between degrees of interacting proteins has been debated since such negative degree correlations were found for the large-scale yeast protein-protein interaction (PPI) network of Ito et al. More recent studies observed no such negative correlations for high-confidence interaction sets. In this article, we analyzed a range of experimentally derived interaction networks to understand the role and prevalence of degree correlations in PPI networks. We investigated how degree correlations influence the structure of networks and their tolerance against perturbations such as the targeted deletion of hubs. Results: For each PPI network, we simulated uncorrelated, positively and negatively correlated reference networks. Here, a simple model was developed which can create different types of degree correlations in a network without changing the degree distribution. Differences in static properties associated with degree correlations were compared by analyzing the network characteristics of the original PPI and reference networks. Dynamics were compared by simulating the effect of a selective deletion of hubs in all networks. Conclusion: Considerable differences between the network types were found for the number of components in the original networks. Negatively correlated networks are fragmented into significantly fewer components than observed for positively correlated networks. On the other hand, the selective deletion of hubs showed an increased structural tolerance to these deletions for the positively correlated networks. This results in a lower rate of interaction loss in these networks compared to the negatively correlated networks and a decreased disintegration rate. Interestingly, real PPI networks are most similar to the randomly correlated references with respect to all properties analyzed. Thus, although structural properties of networks can be modified considerably by degree
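    Two ingredients of such an analysis can be sketched briefly: measuring the degree correlation of a network, and rewiring it without changing any node's degree. The sketch below uses the Pearson correlation of the degrees at the two ends of each edge and the standard double-edge swap; both are generic techniques swapped in for illustration and are not claimed to be the model developed in the article. All function names are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_correlation(edges):
    """Pearson correlation between the degrees at the two ends of an edge.

    Each undirected edge is counted in both directions, so the two degree
    samples share the same mean and variance.
    """
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else 0.0


def rewire(edges, n_swaps, rng):
    """Randomize a network by double-edge swaps, preserving every degree.

    A swap replaces edges (a, b) and (c, d) with (a, d) and (c, b); it is
    skipped if it would create a self-loop or a duplicate edge.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


# A star is maximally disassortative: the hub only touches degree-1 nodes.
star = [(0, 1), (0, 2), (0, 3)]
r_star = degree_correlation(star)

ring = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (2, 6)]
shuffled = rewire(ring, 100, random.Random(0))
```

    Accepting only those swaps that raise (or lower) the correlation would be one way to obtain positively (or negatively) correlated reference networks with an unchanged degree distribution; that acceptance rule is likewise an assumption, not a detail given in the abstract.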

  10. Scoring predictive models using a reduced representation of proteins: model and energy definition.

    Science.gov (United States)

    Fogolari, Federico; Pieri, Lidia; Dovier, Agostino; Bortolussi, Luca; Giugliarelli, Gilberto; Corazza, Alessandra; Esposito, Gennaro; Viglino, Paolo

    2007-03-23

    Reduced representations of proteins have been playing a key role in the study of protein folding. Many such models are available, with different representation detail. Although the usefulness of many such models for structural bioinformatics applications has been demonstrated in recent years, there are few intermediate resolution models endowed with an energy model capable, for instance, of detecting native or native-like structures among decoy sets. The aim of the present work is to provide a discrete empirical potential for a reduced protein model termed here PC2CA, because it employs a PseudoCovalent structure with only 2 Centers of interactions per Amino acid, suitable for protein model quality assessment. All protein structures in the set top500H have been converted to reduced form. The distributions of pseudobonds, pseudoangles, pseudodihedrals and distances between centers of interaction have been converted into potentials of mean force. A suitable reference distribution has been defined for non-bonded interactions which takes into account excluded volume effects and protein finite size. The correlation between adjacent main chain pseudodihedrals has been converted into an additional energetic term which is able to account for cooperative effects in secondary structure elements. Local energy surface exploration is performed in order to increase the robustness of the energy function. The model and the energy definition proposed have been tested on all the multiple decoy sets in the Decoys'R'us database. The energetic model is able to recognize, for almost all sets, native-like structures (RMSD less than 2.0 Å). These results and those obtained in the blind CASP7 quality assessment experiment suggest that the model compares well with scoring potentials with finer granularity and could be useful for fast exploration of conformational space. Parameters are available at the url: http://www.dstb.uniud.it/~ffogolari/download/.
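    The conversion of an observed distribution into a potential of mean force follows the usual inverse-Boltzmann relation, E = -kT ln(P_obs / P_ref). The sketch below applies it to binned counts. It is a generic illustration under stated assumptions: the binning, the pseudocount and the function name are hypothetical, and the reference distribution used in the paper (which accounts for excluded volume and finite protein size) is not reproduced here.

```python
import math


def potential_of_mean_force(observed_counts, reference_counts, kT=1.0, pseudocount=1.0):
    """Convert binned counts into energies via E = -kT * ln(P_obs / P_ref).

    observed_counts and reference_counts are equal-length lists of counts
    per bin. The pseudocount (an assumption of this sketch) avoids taking
    the logarithm of zero for empty bins.
    """
    n_bins = len(observed_counts)
    n_obs = sum(observed_counts) + pseudocount * n_bins
    n_ref = sum(reference_counts) + pseudocount * n_bins
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / n_obs
        p_ref = (ref + pseudocount) / n_ref
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Hypothetical counts for three distance bins; the reference is flat.
observed = [5, 40, 15]
reference = [20, 20, 20]
energies = potential_of_mean_force(observed, reference, pseudocount=0.0)
```

    Bins that are populated more often than the reference predicts receive a negative (favorable) energy, and under-populated bins receive a positive one.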

  11. Crystal Structure of the 23S rRNA Fragment Specific to r-Protein L1 and Designed Model of the Ribosomal L1 Stalk from Haloarcula marismortui

    Directory of Open Access Journals (Sweden)

    Azat Gabdulkhakov

    2017-02-01

    Full Text Available The crystal structure of the 92-nucleotide L1-specific fragment of 23S rRNA from Haloarcula marismortui (Hma has been determined at 3.3 Å resolution. Similar to the corresponding bacterial rRNA fragments, this structure contains joined helix 76-77 topped by an approximately globular structure formed by the residual part of the L1 stalk rRNA. The position of HmaL1 relative to the rRNA was found by its docking to the rRNA fragment using the L1-rRNA complex from Thermus thermophilus as a guide model. In spite of the anomalous negative charge of the halophilic archaeal protein, the conformation of the HmaL1-rRNA interface appeared to be very close to that observed in all known L1-rRNA complexes. The designed structure of the L1 stalk was incorporated into the H. marismortui 50S ribosomal subunit. Comparison of relative positions of L1 stalks in 50S subunits from H. marismortui and T. thermophilus made it possible to reveal the site of inflection of rRNA during the ribosome function.

  12. Constraining cyclic peptides to mimic protein structure motifs

    DEFF Research Database (Denmark)

    Hill, Timothy A.; Shepherd, Nicholas E.; Diness, Frederik

    2014-01-01

    peptides can have protein-like biological activities and potencies, enabling their uses as biological probes and leads to therapeutics, diagnostics and vaccines. This Review highlights examples of cyclic peptides that mimic three-dimensional structures of strand, turn or helical segments of peptides...... and proteins, and identifies some additional restraints incorporated into natural product cyclic peptides and synthetic macrocyclic peptidomimetics that refine peptide structure and confer biological properties....

  13. Overcoming bottlenecks in the membrane protein structural biology pipeline.

    Science.gov (United States)

    Hardy, David; Bill, Roslyn M; Jawhari, Anass; Rothnie, Alice J

    2016-06-15

    Membrane proteins account for a third of the eukaryotic proteome, but are greatly under-represented in the Protein Data Bank. Unfortunately, recent technological advances in X-ray crystallography and EM cannot account for the poor solubility and stability of membrane protein samples. A limitation of conventional detergent-based methods is that detergent molecules destabilize membrane proteins, leading to their aggregation. The use of orthologues, mutants and fusion tags has helped improve protein stability, but at the expense of not working with the sequence of interest. Novel detergents such as glucose neopentyl glycol (GNG), maltose neopentyl glycol (MNG) and calixarene-based detergents can improve protein stability without compromising their solubilizing properties. Styrene maleic acid lipid particles (SMALPs) focus on retaining the native lipid bilayer of a membrane protein during purification and biophysical analysis. Overcoming bottlenecks in the membrane protein structural biology pipeline, primarily by maintaining protein stability, will facilitate the elucidation of many more membrane protein structures in the near future. © 2016 The Author(s). published by Portland Press Limited on behalf of the Biochemical Society.

  14. Functional diversification of structurally alike NLR proteins in plants.

    Science.gov (United States)

    Chakraborty, Joydeep; Jain, Akansha; Mukherjee, Dibya; Ghosh, Suchismita; Das, Sampa

    2018-04-01

    In the course of evolution many pathogens alter their effector molecules to modulate the host plants' metabolism and immune responses triggered upon proper recognition by the intracellular nucleotide-binding oligomerization domain containing leucine-rich repeat (NLR) proteins. Likewise, host plants have also evolved diversified NLR proteins as a survival strategy to win the battle against pathogen invasion. NLR proteins detect pathogen-derived effector proteins, leading to the activation of defense responses associated with programmed cell death (PCD). In this interactive process, genome structure and plasticity play a pivotal role in the development of innate immunity. Despite being quite conserved with similar biological functions in all eukaryotes, the intracellular NLR immune receptor proteins happen to be structurally distinct. Recent studies have made progress in identifying transcriptional regulatory complexes activated by NLR proteins. In this review, we attempt to decipher the intracellular NLR protein-mediated surveillance across the evolutionarily diverse taxa, highlighting some of the recent updates on NLR protein compartmentalization, molecular interactions before and after activation along with insights into the finer role of these receptor proteins to combat invading pathogens upon their recognition. Latest information on NLR sensors, helpers and NLR proteins with integrated domains in the context of plant pathogen interactions is also discussed. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Combining neural networks for protein secondary structure prediction

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1995-01-01

    In this paper structured neural networks are applied to the problem of predicting the secondary structure of proteins. A hierarchical approach is used where specialized neural networks are designed for each structural class and then combined using another neural network. The submodels are designed...... by using a priori knowledge of the mapping between protein building blocks and the secondary structure and by using weight sharing. Since none of the individual networks have more than 600 adjustable weights over-fitting is avoided. When ensembles of specialized experts are combined the performance...

  16. Structural determinants for protein adsorption/non-adsorption to silica surface

    International Nuclear Information System (INIS)

    Mathe, Christelle; Devineau, Stephanie; Aude, Jean-Christophe; Lagniel, Gilles; Chedin, Stephane; Legros, Veronique; Mathon, Marie-Helene; Renault, Jean-Philippe; Pin, Serge; Boulard, Yves; Labarre, Jean

    2013-01-01

    The understanding of the mechanisms involved in the interaction of proteins with inorganic surfaces is of major interest in both fundamental research and applications such as nanotechnology. However, despite intense research, the mechanisms and the structural determinants of protein/surface interactions are still unclear. We developed a strategy consisting in identifying, in a mixture of hundreds of soluble proteins, those proteins that are adsorbed on the surface and those that are not. If the two protein subsets are large enough, their statistical comparative analysis must reveal the physicochemical determinants relevant for adsorption versus non-adsorption. This methodology was tested with silica nanoparticles. We found that the adsorbed proteins contain a higher number of charged amino acids, particularly arginine, which is consistent with involvement of this basic amino acid in electrostatic interactions with silica. The analysis also identified a marked bias toward low aromatic amino acid content (phenylalanine, tryptophan, tyrosine and histidine) in adsorbed proteins. Structural analyses and molecular dynamics simulations of proteins from the two groups indicate that non-adsorbed proteins have twice as many π-π interactions and higher structural rigidity. The data are consistent with the notion that adsorption is correlated with the flexibility of the protein and with its ability to spread on the surface. Our findings led us to propose a refined model of protein adsorption. (authors)

  17. Structural determinants for protein adsorption/non-adsorption to silica surface.

    Directory of Open Access Journals (Sweden)

    Christelle Mathé

    Full Text Available The understanding of the mechanisms involved in the interaction of proteins with inorganic surfaces is of major interest in both fundamental research and applications such as nanotechnology. However, despite intense research, the mechanisms and the structural determinants of protein/surface interactions are still unclear. We developed a strategy consisting in identifying, in a mixture of hundreds of soluble proteins, those proteins that are adsorbed on the surface and those that are not. If the two protein subsets are large enough, their statistical comparative analysis must reveal the physicochemical determinants relevant for adsorption versus non-adsorption. This methodology was tested with silica nanoparticles. We found that the adsorbed proteins contain a higher number of charged amino acids, particularly arginine, which is consistent with involvement of this basic amino acid in electrostatic interactions with silica. The analysis also identified a marked bias toward low aromatic amino acid content (phenylalanine, tryptophan, tyrosine and histidine) in adsorbed proteins. Structural analyses and molecular dynamics simulations of proteins from the two groups indicate that non-adsorbed proteins have twice as many π-π interactions and higher structural rigidity. The data are consistent with the notion that adsorption is correlated with the flexibility of the protein and with its ability to spread on the surface. Our findings led us to propose a refined model of protein adsorption.

  18. SCOWLP classification: Structural comparison and analysis of protein binding regions

    Directory of Open Access Journals (Sweden)

    Anders Gerd

    2008-01-01

    Full Text Available Abstract Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description: Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can be often arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlapping of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions
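    The classification scheme described above (an overlap-based similarity index followed by complete-linkage agglomerative clustering at a chosen cut-off) can be sketched as follows. The exact similarity index used by SCOWLP is not given in this abstract, so the overlap coefficient below is an assumption, as are the function names and the representation of each binding region as a set of aligned residue positions.

```python
def overlap_similarity(a, b):
    """Overlap coefficient between two sets of aligned interacting positions.

    This particular index (shared positions divided by the smaller set) is
    an illustrative assumption, not the definition from the paper.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def complete_linkage(regions, cutoff):
    """Agglomerative clustering of binding regions with complete linkage.

    regions maps a region name to a set of aligned residue positions.
    Under complete linkage the similarity of two clusters is the LOWEST
    similarity between any pair of their members; the two most similar
    clusters are merged until no pair reaches `cutoff`.
    """
    clusters = [[name] for name in regions]

    def linkage(c1, c2):
        return min(overlap_similarity(regions[x], regions[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = linkage(clusters[i], clusters[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < cutoff:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical binding regions of one domain family, as aligned positions.
regions = {
    "pbr1": {10, 11, 12, 13},
    "pbr2": {11, 12, 13, 14},
    "pbr3": {40, 41, 42},
}
loose = complete_linkage(regions, cutoff=0.5)
strict = complete_linkage(regions, cutoff=0.8)
```

    Running the same data at several cut-offs, as in the last two lines, mirrors the idea of computing the classification at different similarity thresholds: the looser threshold merges the two overlapping regions, while the stricter one keeps all three apart.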

  19. Sequential Release of Proteins from Structured Multishell Microcapsules.

    Science.gov (United States)

    Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J

    2017-10-09

    In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials

  20. Structuring oil by protein building blocks

    NARCIS (Netherlands)

    Vries, de Auke

    2017-01-01

    Over recent years, structuring of oil into ‘organogels’ or ‘oleogels’ has gained much attention amongst colloid, material, and food scientists. Potentially, these oleogels could be used as an alternative for saturated and trans fats in food products. To develop oleogels as a

  1. Handbook of structural equation modeling

    CERN Document Server

    Hoyle, Rick H

    2012-01-01

    The first comprehensive structural equation modeling (SEM) handbook, this accessible volume presents both the mechanics of SEM and specific SEM strategies and applications. The editor, contributors, and editorial advisory board are leading methodologists who have organized the book to move from simpler material to more statistically complex modeling approaches. Sections cover the foundations of SEM; statistical underpinnings, from assumptions to model modifications; steps in implementation, from data preparation through writing the SEM report; and basic and advanced applications, inclu

  2. Chaperonin Structure - The Large Multi-Subunit Protein Complex

    Directory of Open Access Journals (Sweden)

    Irena Roterman

    2009-03-01

    Full Text Available The multi-subunit protein structure representing the chaperonins group is analyzed with respect to its hydrophobicity distribution. The proteins of this group assist protein folding supported by ATP. The specific axial symmetry GroEL structure (two rings of seven units stacked back to back, 524 aa each) and the GroES (single ring of seven units, 97 aa each) polypeptide chains are analyzed using the hydrophobicity distribution expressed as excess/deficiency all over the molecule to search for structure-to-function relationships. The empirically observed distribution of hydrophobic residues is compared with the theoretical one representing the idealized hydrophobic core with hydrophilic residues exposed on the surface. The observed discrepancy between these two distributions seems to be aim-oriented, determining the structure-to-function relation. The hydrophobic force field structure generated by the chaperonin capsule is presented. Its possible influence on substrate folding is suggested.

  3. NMR structural studies of peptides and proteins in membranes

    Energy Technology Data Exchange (ETDEWEB)

    Opella, S J [Pennsylvania Univ., Philadelphia, PA (United States). Dept. of Chemistry

    1994-12-31

    The use of NMR methodology in structural studies is described as applicable to larger proteins, considering that the majority of membrane proteins is constructed from a limited repertoire of structural and dynamic elements. The membrane associated domains of these proteins are made up of long hydrophobic membrane spanning helices, shorter amphipathic bridging helices in the plane of the bilayer, connecting loops with varying degrees of mobility, and mobile N- and C- terminal sections. NMR studies have been successful in identifying all of these elements and their orientations relative to each other and the membrane bilayer. 19 refs., 9 figs.

  4. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics of membrane protein initiatives, with a focus on those implementing techniques aimed at increasing our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Mining protein loops using a structural alphabet and statistical exceptionality

    Directory of Open Access Journals (Sweden)

    Martin Juliette

    2010-02-01

    Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is an uneasy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of

  6. Mining protein loops using a structural alphabet and statistical exceptionality.

    Science.gov (United States)

    Regad, Leslie; Martin, Juliette; Nuel, Gregory; Camproux, Anne-Claude

    2010-02-04

    Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is an uneasy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of
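
    The word-extraction step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes loops have already been encoded as strings of structural letters (the toy strings and the function names `structural_words`, `recurrent_words`, and `residue_span` are hypothetical), and that consecutive four-residue letters overlap by three residues, so that a word of four letters spans seven residues.

```python
from collections import Counter


def residue_span(word_len: int, letter_len: int = 4) -> int:
    """Residues covered by `word_len` letters, assuming each letter is a
    `letter_len`-residue fragment and neighbours overlap by letter_len - 1."""
    return word_len + letter_len - 1


def structural_words(letters: str, word_len: int = 4):
    """Yield every overlapping window of `word_len` structural letters."""
    for i in range(len(letters) - word_len + 1):
        yield letters[i:i + word_len]


def recurrent_words(loops, word_len: int = 4, threshold: int = 30):
    """Count words over all loops and keep those seen more than `threshold` times."""
    counts = Counter(
        word for loop in loops for word in structural_words(loop, word_len)
    )
    return {word: n for word, n in counts.items() if n > threshold}


if __name__ == "__main__":
    # Toy letter strings (hypothetical, not real HMM-SA encodings).
    toy_loops = ["ABCDABCD", "ABCD", "XYZ"]
    print(residue_span(4))                          # residues per four-letter word
    print(recurrent_words(toy_loops, threshold=2))  # words seen more than twice
```

    Because the counting works on fixed-length windows rather than whole loops, loops of any length contribute to the same word table, which is the property the abstract relies on.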

  7. Structural protein relationships among eastern equine encephalitis viruses.

    Science.gov (United States)

    Strizki, J M; Repik, P M

    1994-11-01

    We have re-evaluated the relationships among the polypeptides of eastern equine encephalitis (EEE) viruses using SDS-PAGE and peptide mapping of individual virion proteins. Four to five distinct polypeptide bands were detected upon SDS-PAGE analysis of viruses: the E1, E2 and C proteins normally associated with alphavirus virions, as well as an additional more rapidly-migrating E2-associated protein and a high M(r) (HMW) protein. In contrast with previous findings by others, the electrophoretic profiles of the virion proteins of EEE viruses displayed a marked correlation with serotype. The protein profiles of the 33 North American (NA)-serotype viruses examined were remarkably homogeneous, with variation detected only in the E1 protein of two isolates. In contrast, considerable heterogeneity was observed in the migration profiles of both the E1 and E2 glycoproteins of the 13 South American (SA)-type viruses examined. Peptide mapping of individual virion proteins using limited proteolysis with Staphylococcus aureus V8 protease confirmed that, in addition to the homogeneity evident among NA-type viruses and relative heterogeneity among SA-type viruses, the E1 and E2 proteins of NA- and SA-serotype viruses exhibited serotype-specific structural variation. The C protein was highly conserved among isolates of both virus serotypes. Endoglycosidase analyses of intact virions did not reveal substantial glycosylation differences between the glycoproteins of NA- and SA-serotype viruses. Both the HMW protein and the E2 protein (doublet) of EEE virus appeared to contain, at least in part, high-mannose type N-linked oligosaccharides. No evidence of O-linked glycans was found on either the E1 or the E2 glycoprotein. Despite the observed structural differences between proteins of NA- and SA-type viruses, Western blot analyses utilizing polyclonal antibodies indicated that immunoreactive epitopes appeared to be conserved.

  8. An Algebro-Topological Description of Protein Domain Structure

    Science.gov (United States)

    Penner, Robert Clark; Knudsen, Michael; Wiuf, Carsten; Andersen, Jørgen Ellegaard

    2011-01-01

    The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object, a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH. PMID:21629687
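
    The two invariants named in this abstract can be computed for a generic fatgraph with a short sketch. It is not the authors' construction from backbone atoms and hydrogen bonds: it assumes a connected, orientable fatgraph without twisted edges, given as a hypothetical "rotation system" in which edge k has the two half-edges 2k and 2k+1 and each vertex lists its half-edges in cyclic order. Boundary components are then the orbits of "cross the edge, then turn to the next half-edge", and the genus g follows from the Euler characteristic V - E + r = 2 - 2g, where r is the number of boundary components.

```python
def boundary_components(rotations):
    """Count boundary cycles of a fatgraph given as a rotation system.

    rotations: one list per vertex holding its half-edges in cyclic order;
    half-edges 2k and 2k+1 are the two ends of edge k.
    """
    # sigma maps each half-edge to the next one around its vertex.
    sigma = {}
    for cycle in rotations:
        for i, dart in enumerate(cycle):
            sigma[dart] = cycle[(i + 1) % len(cycle)]

    seen = set()
    count = 0
    for start in sigma:
        if start in seen:
            continue
        count += 1
        dart = start
        while dart not in seen:
            seen.add(dart)
            dart = sigma[dart ^ 1]  # cross the edge, then turn at the vertex
    return count


def genus(rotations):
    """Genus from V - E + r = 2 - 2g (connected, orientable case only)."""
    vertices = len(rotations)
    edges = sum(len(cycle) for cycle in rotations) // 2
    r = boundary_components(rotations)
    return (2 - vertices + edges - r) // 2


if __name__ == "__main__":
    # One vertex, two loops: interleaved ends give a one-holed torus,
    # nested ends give a sphere with three boundary components.
    print(genus([[0, 2, 1, 3]]), boundary_components([[0, 2, 1, 3]]))
    print(genus([[0, 1, 2, 3]]), boundary_components([[0, 1, 2, 3]]))
```

    The two toy inputs have the same vertex and edge counts and differ only in cyclic order, which illustrates why the cyclic ordering at each vertex, and not just the underlying graph, determines the surface.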

  9. Beyond Textbook Illustrations: Hand-Held Models of Ordered DNA and Protein Structures as 3D Supplements to Enhance Student Learning of Helical Biopolymers

    Science.gov (United States)

    Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo

    2010-01-01

    Textbook illustrations of 3D biopolymers on printed paper, regardless of how detailed and colorful, suffer from their two-dimensionality. For beginners, computer screen display of skeletal models of biopolymers and their animation usually does not provide the at-a-glance 3D perception and details that good hand-held models can provide. Here, we…

  10. Linking structural features of protein complexes and biological function.

    Science.gov (United States)

    Sowmya, Gopichandran; Breen, Edmond J; Ranganathan, Shoba

    2015-09-01

    Protein-protein interaction (PPI) establishes the central basis for complex cellular networks in a biological cell. Association of proteins with other proteins occurs at varying affinities, yet with a high degree of specificity. PPIs lead to diverse functionality such as catalysis, regulation, signaling, immunity, and inhibition, playing a crucial role in functional genomics. The molecular principle of such interactions is often elusive in nature. Therefore, a comprehensive analysis of known protein complexes from the Protein Data Bank (PDB) is essential for the characterization of structural interface features to determine structure-function relationship. Thus, we analyzed a nonredundant dataset of 278 heterodimer protein complexes, categorized into major functional classes, for distinguishing features. Interestingly, our analysis has identified five key features (interface area, interface polar residue abundance, hydrogen bonds, solvation free energy gain from interface formation, and binding energy) that are discriminatory among the functional classes using Kruskal-Wallis rank sum test. Significant correlations between these PPI interface features amongst functional categories are also documented. Salt bridges correlate with interface area in regulator-inhibitors (r = 0.75). These representative features have implications for the prediction of potential function of novel protein complexes. The results provide molecular insights for better understanding of PPIs and their relation to biological functions. © 2015 The Protein Society.

  11. A computer graphics program system for protein structure representation.

    Science.gov (United States)

    Ross, A M; Golub, E E

    1988-01-01

    We have developed a computer graphics program system for the schematic representation of several protein secondary structure analysis algorithms. The programs calculate the probability of occurrence of alpha-helix, beta-sheet and beta-turns by the method of Chou and Fasman and assign unique predicted structure to each residue using a novel conflict resolution algorithm based on maximum likelihood. A detailed structure map containing secondary structure, hydrophobicity, sequence identity, sequence numbering and the location of putative N-linked glycosylation sites is then produced. In addition, helical wheel diagrams and hydrophobic moment calculations can be performed to further analyze the properties of selected regions of the sequence. As they require only structure specification as input, the graphics programs can easily be adapted for use with other secondary structure prediction schemes. The use of these programs to analyze protein structure-function relationships is described and evaluated. PMID:2832829
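
    As an illustration of the hydrophobic moment calculation mentioned in this abstract, the sketch below uses the commonly cited Eisenberg-style definition, the length of the vector sum of per-residue hydrophobicities placed at successive angles around the helix axis (100° per residue for an α-helix). The abstract does not say which definition or hydrophobicity scale the programs use, so the unnormalized form, the Kyte-Doolittle scale, and the function name `hydrophobic_moment` are all assumptions made for this example.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Unnormalized hydrophobic moment of a segment.

    Residue i contributes its hydrophobicity as a vector at angle
    i * angle_deg around the helix axis; the moment is the length of the sum.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum)


if __name__ == "__main__":
    # A uniform 18-residue helix (five full turns) has no preferred face.
    print(hydrophobic_moment("L" * 18))
    # Alternating hydrophobic and charged patches give a larger moment.
    print(hydrophobic_moment("LKKLLKKLLKKLLKKLLK"))
```

    Passing a different `angle_deg` (for example, a value near 160 to 180 for a β-strand) or a different `scale` adapts the same calculation to other periodicities and hydrophobicity tables.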

  12. Hyperactive antifreeze proteins from longhorn beetles: some structural insights.

    Science.gov (United States)

    Kristiansen, Erlend; Wilkens, Casper; Vincents, Bjarne; Friis, Dennis; Lorentzen, Anders Blomkild; Jenssen, Håvard; Løbner-Olesen, Anders; Ramløv, Hans

    2012-11-01

    This study reports on structural characteristics of hyperactive antifreeze proteins (AFPs) from two species of longhorn beetles. In Rhagium mordax, eight unique mRNAs coding for five different mature AFPs were identified from cold-hardy individuals. These AFPs are apparently homologues of a previously characterized AFP from the closely related species Rhagium inquisitor, and consist of six identifiable repeats of a putative ice-binding motif TxTxTxT spaced irregularly apart by segments varying in length from 13 to 20 residues. Circular dichroism spectra show that the AFPs from both species have a high content of β-sheet and low levels of α-helix and random coil. Theoretical predictions of residue-specific secondary structure locate these β-sheets within the putative ice-binding motifs and the central parts of the segments separating them, consistent with an overall β-helical structure with the ice-binding motifs stacked in a β-sheet on one side of the coil. Molecular dynamics models based on these findings show that these AFPs would be energetically stable in a β-helical conformation. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Crystal structure of Homo sapiens protein LOC79017

    Energy Technology Data Exchange (ETDEWEB)

    Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.; Phillips, Jr., George N. (UW)

    2010-02-08

    LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).

  14. Deprotonated imidodiphosphate in AMPPNP-containing protein structures

    International Nuclear Information System (INIS)

    Dauter, Miroslawa; Dauter, Zbigniew

    2011-01-01

    In certain AMPPNP-containing protein structures, the nitrogen bridging the two terminal phosphate groups can be deprotonated. Many different proteins utilize the chemical energy provided by the cofactor adenosine triphosphate (ATP) for their proper function. A number of structures in the Protein Data Bank (PDB) contain adenosine 5′-(β,γ-imido)triphosphate (AMPPNP), a nonhydrolysable analog of ATP in which the bridging O atom between the two terminal phosphate groups is substituted by the imido function. Under mild conditions imides do not have acidic properties and thus the imide nitrogen should be protonated. However, an analysis of protein structures containing AMPPNP reveals that the imide group is deprotonated in certain complexes if the negative charges of the phosphate moieties in AMPPNP are in part neutralized by coordinating divalent metals or a guanidinium group of an arginine

  15. EVA: continuous automatic evaluation of protein structure prediction servers.

    Science.gov (United States)

    Eyrich, V A; Martí-Renom, M A; Przybylski, D; Madhusudhan, M S; Fiser, A; Pazos, F; Valencia, A; Sali, A; Rost, B

    2001-12-01

    Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. Availability: http://cubic.bioc.columbia.edu/eva. Contact: eva@cubic.bioc.columbia.edu

  16. Protein folding simulations: from coarse-grained model to all-atom model.

    Science.gov (United States)

    Zhang, Jian; Li, Wenfei; Wang, Jun; Qin, Meng; Wu, Lei; Yan, Zhiqiang; Xu, Weixin; Zuo, Guanghong; Wang, Wei

    2009-06-01

    Protein folding is an important and challenging problem in molecular biology. During the last two decades, molecular dynamics (MD) simulation has proved to be a paramount tool and has been widely used to study protein structures, folding kinetics and thermodynamics, and structure-stability-function relationships. It has also been used to help engineer and design new proteins, and to answer even more general questions such as the minimal number of amino acids or the evolution principle of protein families. MD simulation is still undergoing rapid development. The first trend is toward developing new coarse-grained models and studying larger and more complex molecular systems such as protein-protein complexes and their assembly processes, amyloid-related aggregation, and the structure and motion of chaperones, motors, channels and virus capsids; the second trend is toward building high-resolution models and exploring more detailed and accurate pictures of protein folding and the associated processes, such as folding involving coordination bonds or disulfide bonds, the polarization, charge transfer and protonation/deprotonation processes involved in metal-coupled folding, and ion permeation and its coupling with the kinetics of channels. In these new territories, MD simulations have given many promising results and will continue to offer exciting views. Here, we review several new subjects investigated using MD simulations as well as the corresponding developments of appropriate protein models. These include but are not limited to the attempt to go beyond the topology-based Gō-like model and characterize the energetic factors in protein structures and dynamics, the study of the thermodynamics and kinetics of disulfide bond involved protein folding, the modeling of the interactions between chaperonin and the encapsulated protein and the protein folding under this circumstance, the effort to clarify the important yet still elusive folding mechanism of protein BBL

  17. De novo protein structure generation from incomplete chemical shift assignments

    Energy Technology Data Exchange (ETDEWEB)

    Shen Yang [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States); Vernon, Robert; Baker, David [University of Washington, Department of Biochemistry and Howard Hughes Medical Institute (United States); Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)], E-mail: bax@nih.gov

    2009-02-15

    NMR chemical shifts provide important local structural information for proteins. Consistent structure generation from NMR chemical shift data has recently become feasible for proteins with sizes of up to 130 residues, and such structures are of a quality comparable to those obtained with the standard NMR protocol. This study investigates the influence of the completeness of chemical shift assignments on structures generated from chemical shifts. The Chemical-Shift-Rosetta (CS-Rosetta) protocol was used for de novo protein structure generation with various degrees of completeness of the chemical shift assignment, simulated by omission of entries in the experimental chemical shift data previously used for the initial demonstration of the CS-Rosetta approach. In addition, a new CS-Rosetta protocol is described that improves robustness of the method for proteins with missing or erroneous NMR chemical shift input data. This strategy, which uses traditional Rosetta for pre-filtering of the fragment selection process, is demonstrated for two paramagnetic proteins and also for two proteins with solid-state NMR chemical shift assignments.

  18. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein Digestibility and Solubility

    Directory of Open Access Journals (Sweden)

    Mingmei Bai

    2016-08-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility as well as digestibility in monogastric animals. To address this question we analyzed soybean meal, fish meal, corn distiller’s dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relative amounts of α-helices and random coils and the α-helix to β-sheet ratio in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (p<0.001) and solubility (p = 0.002). These results demonstrate that the molecular structure characteristics of feed proteins are closely related to their in vitro digestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  19. Characterization of structural proteins of hirame rhabdovirus, HRV

    Science.gov (United States)

    Nishizawa, Toyohiko; Yoshimizu, Mamoru; Winton, James; Ahne, Winfried; Kimura, Takahisa

    1991-01-01

    Structural proteins of hirame rhabdovirus (HRV) were analyzed by SDS-polyacrylamide gel electrophoresis, western blotting, 2-dimensional gel electrophoresis, and Triton X-100 treatment. Purified HRV virions were composed of: polymerase (L), glycoprotein (G), nucleoprotein (N), and 2 matrix proteins (M1 and M2). Based upon their relative mobilities, the estimated molecular weights of the proteins were: L, 156 kDa; G, 68 kDa; N, 46.4 kDa; M1, 26.4 kDa; and M2, 19.9 kDa. The electrophoretic pattern formed by the structural proteins of HRV was clearly different from that formed by pike fry rhabdovirus, spring viremia of carp virus, eel virus of America, and eel virus European X which belong to the Vesiculovirus genus; however, it resembled the pattern formed by structural proteins of viral hemorrhagic septicemia virus (VHSV) and infectious hematopoietic necrosis virus (IHNV) which are members of the Lyssavirus genus. Among HRV, IHNV, and VHSV, differences were observed in the relative mobilities of the G, N, M1, and M2 proteins. Western blot analysis revealed that the G, N, and M2 proteins of HRV shared antigenic determinants with IHNV and VHSV, but not with any of the 4 fish vesiculoviruses tested. Cross-reactions between the M1 proteins of HRV, IHNV, or VHSV were not detected in this assay. Two-dimensional gel electrophoresis was used to show that HRV differed from IHNV or VHSV in the isoelectric point (pI) of the M1 and M2 proteins. In this system, 2 forms of the M1 protein of HRV and IHNV were observed. These subspecies of M1 had the same relative mobility but different pI values. Treatment of purified virions with 2% Triton X-100 in Tris buffer containing NaCl removed the G, M1, and M2 proteins of IHNV, but HRV virions were more stable under these conditions.

  20. Amino Acid Molecular Units: Building Primary and Secondary Protein Structures

    Directory of Open Access Journals (Sweden)

    Aparecido R. Silva

    2008-05-01

    To ensure quality learning and appropriate use of knowledge in structural biology, students need to be able to visualize biomolecular structures clearly from the beginning of their training. Textbooks, however, can offer only schematic drawings, and even more elaborate figures and computer graphics do not permit the necessary interaction. Representing three-dimensional molecular structures with playful models built from representative units has given students and teachers a successful way to visualize such structures and relate them to the real molecules. The design and applicability of the representative units were discussed with researchers and teachers before the moulds were made. At this stage, the developed kit containing the representative plastic parts of the main amino acids will be presented. The kit can demonstrate the interactions among the amino acid functional groups (represented by colors, shapes, and sizes) and the peptide bonds between them, facilitating the assembly and visualization of primary and secondary protein structure. The models were designed for the Cα, amino, and carboxyl groups and hydrogen. The side chains have well-defined models that represent their geometrical shape. The completed kit will be presented at this meeting (patent requested). In the last phase of the project, the kit will be evaluated as a didactic tool for facilitating the teaching/learning process in the area of Structural Molecular Biology.

  1. Cold-set globular protein gels: Interactions, structure and rheology as a function of protein concentration.

    NARCIS (Netherlands)

    Alting, A.C.; Hamer, R.J.; Kruif, de C.G.

    2003-01-01

    We identified the contribution of covalent and noncovalent interactions to the scaling behavior of the structural and rheological properties in a cold gelling protein system. The system we studied consisted of two types of whey protein aggregates, equal in size but different in the amount of

  2. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    Science.gov (United States)

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  3. Identification of structural domains in proteins by a graph heuristic

    NARCIS (Netherlands)

    Wernisch, Lorenz; Hunting, M.M.G.; Wodak, Shoshana J.

    1999-01-01

    A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain

  4. Formatt: Correcting protein multiple